DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
  1. DZone
  2. Data Engineering
  3. Databases
  4. NuGet Performance Problems: Part I

NuGet Performance Problems: Part I

Oren Eini user avatar by
Oren Eini
·
Aug. 29, 12 · Interview
Like (0)
Save
Tweet
Share
3.00K Views

Join the DZone community and get the full member experience.

Join For Free

It appears that NuGet has some performance problems and Jeff Handley posted the problematic queries as well as the new, hand-optimized queries.

You can see the original problematic queries and the optimized code (still tentative) here.

Here is the hand-optimized query to load the pages for the packages page:

SELECT        TOP (30)
            Paged.PackageRegistrationKey
        ,    Paged.Id
        ,    Paged.Version
        ,    Packages.FlattenedAuthors
        ,    Packages.Copyright
        ,    Packages.Created
        ,    Packages.FlattenedDependencies
        ,    Packages.Description
        ,    PackageRegistrations.DownloadCount
        ,    Packages.ExternalPackageUrl
        ,    N'packages/' + PackageRegistrations.Id + N'/' + Packages.Version AS C1
        ,    Packages.IconUrl
        ,    Packages.IsLatestStable
        ,    Packages.Language
        ,    Packages.LastUpdated
        ,    Packages.LicenseUrl
        ,    Packages.Hash
        ,    Packages.HashAlgorithm
        ,    Packages.PackageFileSize
        ,    Packages.ProjectUrl
        ,    CASE Packages.Listed WHEN 1 THEN Packages.Published ELSE NULL END AS C2
        ,    Packages.ReleaseNotes
        ,    N'package/ReportAbuse/' + PackageRegistrations.Id + N'/' + Packages.Version AS C3
        ,    Packages.RequiresLicenseAcceptance
        ,    Packages.Summary
        ,    CASE WHEN Packages.Tags IS NULL THEN CAST(NULL as varchar(1)) ELSE N' ' + LTRIM(RTRIM(Packages.Tags)) + N' ' END AS C4
        ,    ISNULL(Packages.Title, PackageRegistrations.Id) AS C5
        ,    Packages.DownloadCount AS DownloadCount1
        ,    cast(0 as float(53)) AS C6
FROM        (

            SELECT        Filtered.Id
                    ,    Filtered.PackageRegistrationKey
                    ,    Filtered.Version
                    ,    Filtered.DownloadCount
                    ,    row_number() OVER (ORDER BY Filtered.DownloadCount DESC, Filtered.Id ASC) AS [row_number]
            FROM        (
                        SELECT        PackageRegistrations.Id
                                ,    Packages.PackageRegistrationKey
                                ,    Packages.Version
                                ,    PackageRegistrations.DownloadCount
                        FROM        Packages
                        INNER JOIN    PackageRegistrations ON PackageRegistrations.[Key] = Packages.PackageRegistrationKey
                        WHERE        Packages.IsPrerelease <> cast(1 as bit)
                        ) Filtered
            ) Paged
INNER JOIN    PackageRegistrations ON PackageRegistrations.[Key] = Paged.PackageRegistrationKey
INNER JOIN    Packages ON Packages.PackageRegistrationKey = Paged.PackageRegistrationKey AND Packages.Version = Paged.Version
WHERE        Paged.[row_number] > 30
ORDER BY    PackageRegistrations.DownloadCount DESC
        ,    Paged.Id

This monster query is actually translated to something like:

Give me the top 30 packages which are not pre released, ordered by the download count and then by their id.


It takes a great deal of complexity to deal with that for one major reason, the data is split up across multiple tables in a way that makes it hard get all of it easily. The minor reason is that there is really no good way to do paging in SQL Server (shocking, I know). One would assume that such a basic feature would have a bit more attention.

What is worse is the optimized version of the search feature:

SELECT        TOP (30)
            Paged.PackageRegistrationKey
        ,    Paged.Id
        ,    Paged.Version
        ,    Packages.FlattenedAuthors
        ,    Packages.Copyright
        ,    Packages.Created
        ,    Packages.FlattenedDependencies
        ,    Packages.Description
        ,    PackageRegistrations.DownloadCount
        ,    Packages.ExternalPackageUrl
        ,    N'packages/' + PackageRegistrations.Id + N'/' + Packages.Version AS C1
        ,    Packages.IconUrl
        ,    Packages.IsLatestStable
        ,    Packages.Language
        ,    Packages.LastUpdated
        ,    Packages.LicenseUrl
        ,    Packages.Hash
        ,    Packages.HashAlgorithm
        ,    Packages.PackageFileSize
        ,    Packages.ProjectUrl
        ,    CASE Packages.Listed WHEN 1 THEN Packages.Published ELSE NULL END AS C2
        ,    Packages.ReleaseNotes
        ,    N'package/ReportAbuse/' + PackageRegistrations.Id + N'/' + Packages.Version AS C3
        ,    Packages.RequiresLicenseAcceptance
        ,    Packages.Summary
        ,    CASE WHEN Packages.Tags IS NULL THEN CAST(NULL as varchar(1)) ELSE N' ' + LTRIM(RTRIM(Packages.Tags)) + N' ' END AS C4
        ,    ISNULL(Packages.Title, PackageRegistrations.Id) AS C5
        ,    Packages.DownloadCount AS DownloadCount1
        ,    cast(0 as float(53)) AS C6
FROM        (

            SELECT        Filtered.Id
                    ,    Filtered.PackageRegistrationKey
                    ,    Filtered.Version
                    ,    Filtered.DownloadCount
                    ,    row_number() OVER (ORDER BY Filtered.DownloadCount DESC, Filtered.Id ASC) AS [row_number]
            FROM        (
                        SELECT        PackageRegistrations.Id
                                ,    Packages.PackageRegistrationKey
                                ,    Packages.Version
                                ,    PackageRegistrations.DownloadCount
                        FROM        Packages
                        INNER JOIN    PackageRegistrations ON PackageRegistrations.[Key] = Packages.PackageRegistrationKey
                        WHERE        ((((Packages.IsPrerelease <> cast(1 as bit)))))
                                ((((AND    Packages.IsLatestStable = 1))))
                                ((((AND    Packages.IsLatest = 1))))
                                AND    (
                                        PackageRegistrations.Id LIKE '%jquery%' ESCAPE N'~'
                                    OR    PackageRegistrations.Id LIKE '%ui%' ESCAPE N'~'

                                    OR    Packages.Title LIKE '%jquery%' ESCAPE N'~'
                                    OR    Packages.Title LIKE '%ui%' ESCAPE N'~'

                                    OR    Packages.Tags LIKE '%jquery%' ESCAPE N'~'
                                    OR    Packages.Tags LIKE '%ui%' ESCAPE N'~'
                                    )
                        ) Filtered
            ) Paged
INNER JOIN    PackageRegistrations ON PackageRegistrations.[Key] = Paged.PackageRegistrationKey
INNER JOIN    Packages ON Packages.PackageRegistrationKey = Paged.PackageRegistrationKey AND Packages.Version = Paged.Version
WHERE        Paged.[row_number] > 30
ORDER BY    PackageRegistrations.DownloadCount DESC
        ,    Paged.Id

One thing that immediately popped up to me was the use of queries such as "’%jquery%’. This is a flat out killer for performance in relational databases, since they cannot do any indexes on this and are forced to do a table scan.

I decided to take a stab at moving the NuGet data to RavenDB, which is a much better fit (in my own obviously utterly unbiased opinion). In the next post, we will start with the actual import process, then we get to actual queries.

NuGet Relational database

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Too Many Tools? Streamline Your Stack With AIOps
  • The New DevOps Performance Clusters
  • What Should You Know About Graph Database’s Scalability?
  • How to Rescue Your Magento 2 Project

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: