Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

NuGet Perf, Part III–Displaying the Packages page

DZone's Guide to

NuGet Perf, Part III–Displaying the Packages page

· Database Zone
Free Resource

Whether you work in SQL Server Management Studio or Visual Studio, Redgate tools integrate with your existing infrastructure, enabling you to align DevOps for your applications with DevOps for your SQL Server databases. Discover true Database DevOps, brought to you in partnership with Redgate.

The first thing that we will do with RavenDB and the NuGet data is to issue the same logical query as the one used to populate the packages page. As a reminder, here is how it looks:

SELECT        TOP (30) 
          -- ton of fields removed for brevity
FROM        (

            SELECT        Filtered.Id
                    ,    Filtered.PackageRegistrationKey
                    ,    Filtered.Version
                    ,    Filtered.DownloadCount
                    ,    row_number() OVER (ORDER BY Filtered.DownloadCount DESC, Filtered.Id ASC) AS [row_number]
            FROM        (
                        SELECT        PackageRegistrations.Id
                                ,    Packages.PackageRegistrationKey
                                ,    Packages.Version
                                ,    PackageRegistrations.DownloadCount
                        FROM        Packages
                        INNER JOIN    PackageRegistrations ON PackageRegistrations.[Key] = Packages.PackageRegistrationKey
                        WHERE        Packages.IsPrerelease <> cast(1 as bit)
                        ) Filtered
            ) Paged
INNER JOIN    PackageRegistrations ON PackageRegistrations.[Key] = Paged.PackageRegistrationKey
INNER JOIN    Packages ON Packages.PackageRegistrationKey = Paged.PackageRegistrationKey AND Packages.Version = Paged.Version
WHERE        Paged.[row_number] > 30
ORDER BY    PackageRegistrations.DownloadCount DESC
        ,    Paged.Id

Despite the apparent complexity ,this is a really trivial query. What is does is say:

  • Give me the first 30 – 60 rows
  • Where IsPrerelease is false
  • Order by the download count and then the id

With Linq, the client side query looks something like this:

var results = Session.Query<Package>()
                         .Where(x=>x.IsPrerelease == false)
                         .OrderBy(x=>x.DownloadCount).ThenBy(x=>x.Id)
                         .Skip(30)
                         .Take(30)
                         .ToList();

Now, I assume that this is what the NuGet code is also doing, it is just that the relational database has made it so they have to go to the data in a really convoluted way.

With RavenDB, to match the same query, I could just issue the following query, but there are subtle differences between how the query works in SQL and how it works in RavenDB. in particular, the data that we have in RavenDB is the output of this query, but it isn’t the raw output. For example, we don’t have the Id column available, which is used for sorting. Now, I think that the logic is meaning to say, “sort by download count descending and then by age ascending”. So old and popular packages are more visible than new and fresh packages.

In order to match the same behavior (and because we will need it to the next post) we will define the following index in RavenDB:

image

And querying it:

image

The really nice thing about this?

image

This is the URL for this search:

/indexes/Packages/Listing?query=IsPrerelease:false&start=0&pageSize=128&aggregation=None&sort=-DownloadCount&sort=Created

This is something that RavenDB can do in its sleep, because it is a very cheap operation. Consider the query plan that would for the SQL query above. You have to join 5 times just to get to the data that you want, paging is a real mess, and the database actually have to work a lot to answer this fiddling little query.

Just to give you some idea here. We are talking about something that conceptually should be the same as:

select top 30 skip 30 * from Data where IsPrerelease = 0

But it get really complex really fast with the joins and the tables and all the rest.

In comparison, in RavenDB, we actually do have just a property match to do. Because we keep the entire object graph in a single location, we can do very efficient searches on it.

In the next post, I’ll discuss the actual way I modeled the data, and then we get to do exciting searches Smile.

It’s easier than you think to extend DevOps practices to SQL Server with Redgate tools. Discover how to introduce true Database DevOps, brought to you in partnership with Redgate

Topics:

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}