A Tale of Slow Pagination

Don’t neglect your pagination queries when speeding up your app and save yourself a few "Ugh why is this so slow?" per day.

By Richard Schneeman · Jun. 23, 17 · Opinion · 5.5K Views


When I see a query in my logs without either a limit or a count clause, alarm bells go off, because it's likely a hotspot. A pagination query has a limit, so it usually flies under my radar:

SELECT  "repos".*
FROM "repos"
WHERE (issues_count > 0)
ORDER BY issues_count DESC
LIMIT $1 OFFSET $2

This query came from the main page of my app CodeTriage.com.

As you can guess, there's a problem here. It was pointed out to me by Nate Berkopec, who writes a ton about Ruby performance problems. I had been staring the issue in the face for so long that I had just accepted the slow query as a fact of life.

I’ll take a moment to rewind and show you how I should have found the problem.

Always Benchmark Code

If you’re not using Rack Mini Profiler in development, you really should be. Even better, you should be using it in production:

  • Check this out
  • Check this out too

These two commits tell RMP to show the rack-mini-profiler stats in the UI when the logged-in user has an admin flag on their account.

Here’s what my index page looks like when I’m logged in.

When I click the tab in the top left, it expands and this is what I see:

Of the 336ms page render time (on the server side), you can see that 281.6ms are being spent in:

Rendering: pages/_repos_with_pagination

You can also see that there’s a link for 2 SQL queries. When you click those, this is what you see:

One SQL query takes 125ms and the other 118ms. Yikes!

Most of my other SQL queries take anywhere from 1-4ms, so this is pretty bad. Over 70% of the time spent rendering my index page goes to these two queries.

If you have a query and don't know why it's slow, you can use EXPLAIN ANALYZE in Postgres:

$ heroku pg:psql
issuetriage::DATABASE=> EXPLAIN ANALYZE
SELECT COUNT(*)
FROM "repos"
WHERE (issues_count > 0);
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Aggregate  (cost=39294.58..39294.58 rows=1 width=0) (actual time=115.217..115.217 rows=1 loops=1)
   ->  Seq Scan on repos  (cost=0.00..39293.48 rows=2198 width=0) (actual time=0.009..114.978 rows=2184 loops=1)
         Filter: (issues_count > 0)
         Rows Removed by Filter: 511
Total runtime: 115.253 ms
(5 rows)

This confirms the total time is pretty bad at 115ms. I'm doing this in production so that I have production-like data.

In this case, I’m doing a simple count, so why are things so slow? When you look at the output, this line should jump out at you:

->  Seq Scan on repos  (cost=0.00..39293.48 rows=2198 width=0) (actual time=0.009..114.978 rows=2184 loops=1)

The database has to loop over all records just to find the ones that have an issue count greater than zero.
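If you want to confirm that no existing index covers the column, you can ask Postgres directly. This is a sketch of a psql session using the standard pg_indexes view (not output from my actual database):

```sql
-- List every index on the repos table. At this point none of them
-- covered issues_count, which is why Postgres fell back to a Seq Scan.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'repos';
```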

The next query in our problem output has bind parameters in it:

SELECT
"repos".*
FROM "repos"
WHERE (issues_count > 0)
ORDER BY issues_count DESC
LIMIT $1
OFFSET $2;

We can replace these with some reasonable values, such as a limit of 20 and an offset of 0.

issuetriage::DATABASE=> EXPLAIN ANALYZE
SELECT  "repos".* FROM "repos"
WHERE (issues_count > 0)
ORDER BY issues_count DESC
LIMIT 20
OFFSET 0;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Limit  (cost=39305.18..39305.19 rows=20 width=1541) (actual time=150.109..150.113 rows=20 loops=1)
   ->  Sort  (cost=39305.18..39306.28 rows=2198 width=1541) (actual time=150.108..150.111 rows=20 loops=1)
         Sort Key: issues_count
         Sort Method: top-N heapsort  Memory: 65kB
         ->  Seq Scan on repos  (cost=0.00..39293.48 rows=2198 width=1541) (actual time=0.023..148.889 rows=2184 loops=1)
               Filter: (issues_count > 0)
               Rows Removed by Filter: 511
Total runtime: 150.160 ms

Even though we're using a limit, the database has to order all matching rows by the column in the ORDER BY clause before it can apply the limit. That means sorting all the values, which is why you're seeing that Sort node. Below it is the same sequential scan as before, since we're applying the same WHERE clause (to filter out rows where issues_count is 0).

So what’s the fix? In this case, it’s pretty simple. We can give the database more information about the data that it needs to sort and filter by adding an index.

Indexes aren't magic remedies; they have a downside in that they use memory in your database, which is not free. In this case, I only have about 2K repos in the DB, so the index is well worth the cost. Alternatively, I could cache the view elements to avoid those expensive lookups on the first page. I could even change the view to a simple Next button, which would have avoided having to count all the repos at all.
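For the record, the "Next button" idea usually ends up as keyset pagination: instead of an OFFSET, you remember the last row of the previous page and seek past it. A sketch, with hypothetical placeholder parameters (and id as a tie-breaker, since many repos can share an issues_count):

```sql
-- Keyset ("seek") pagination sketch: no OFFSET and no COUNT(*) needed.
-- $last_issues_count and $last_id come from the last row of the previous page.
SELECT "repos".*
FROM "repos"
WHERE (issues_count, id) < ($last_issues_count, $last_id)
  AND issues_count > 0
ORDER BY issues_count DESC, id DESC
LIMIT 20;
```

Postgres compares the (issues_count, id) row value lexicographically, so this picks up exactly where the previous page left off without scanning and discarding the skipped rows.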

So there are multiple fixes possible, but in my case adding an index was the easiest. Again thanks to Nate Berkopec for pointing this out. I didn’t figure this all out on my own, he did. He also has a book explaining how to debug and profile common performance issues like this that’s worth picking up.
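The fix itself is a one-liner. The index name here matches the one that shows up in the query plans that follow; in a Rails app you'd do this in a migration (add_index :repos, :issues_count generates the same thing):

```sql
-- A plain b-tree index on the column we both filter and sort by.
CREATE INDEX index_repos_on_issues_count ON repos (issues_count);
```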

What’s the end result? We can re-run our EXPLAIN ANALYZE and get some really good feedback:

issuetriage::DATABASE=> EXPLAIN ANALYZE
SELECT  "repos".*
FROM "repos"
WHERE (issues_count > 0)
ORDER BY issues_count DESC
LIMIT 20
OFFSET 0;
                                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.06..45.61 rows=20 width=1541) (actual time=0.019..0.050 rows=20 loops=1)
   ->  Index Scan Backward using index_repos_on_issues_count on repos  (cost=0.06..4980.92 rows=2187 width=1541) (actual time=0.017..0.047 rows=20 loops=1)
         Index Cond: (issues_count > 0)
 Total runtime: 0.084 ms
(4 rows)

From over 100ms to under 1ms. Not bad! What about the other query?

issuetriage::DATABASE=> EXPLAIN ANALYZE
SELECT COUNT(*) FROM "repos"
WHERE (issues_count > 0);
                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=75.33..75.33 rows=1 width=0) (actual time=1.713..1.713 rows=1 loops=1)
   ->  Index Only Scan using index_repos_on_issues_count on repos  (cost=0.06..74.23 rows=2187 width=0) (actual time=0.034..1.578 rows=2184 loops=1)
         Index Cond: (issues_count > 0)
         Heap Fetches: 618
 Total runtime: 1.757 ms
(5 rows)

Not too shabby, either. The overall page render time went from 336ms to 169ms.

The partial where we call these queries still accounts for 55ms, but that's much better than the previous 281.6ms. They say a 200ms wait time is perceptible to a human, so in theory this just saved me a few "ugh, why is this so slow?"s per day. Better than that, it also means my index page can handle slightly more throughput, since it won't be hanging around waiting for slow queries to finish.

Bottom line: don't neglect your pagination queries when speeding up your app, and go buy Nate's book.


Published at DZone with permission of Richard Schneeman, DZone MVB. See the original article here.

