3 Ways to Optimize for Paging in MySQL
Lots and lots of web applications need to page through information, from customer records to the albums in your iTunes collection. So as web developers and architects, it's important that we do all this efficiently.
Start by looking at how you’re fetching information from your MySQL database. We’ve outlined three ways to do just that.
1. Paging without discarding records
Ultimately we're trying to avoid discarding records. After all, if the server doesn't fetch them, we save big. How else can we avoid this extra work?

How about remembering the last ID from the previous page? For example:
SELECT id, name, address, phone FROM customers WHERE id > 990 ORDER BY id LIMIT 10;
Of course, such a solution only works if you're paging by ID. If you page by name, it gets messier, since more than one person may share the same name. If ID doesn't work for your application, paging by USERNAME might, since usernames are unique:
SELECT id, username FROM customers WHERE username > 'firstname.lastname@example.org' ORDER BY username LIMIT 10;
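To see this keyset (or "seek") approach end to end, here is a minimal sketch using Python's sqlite3 module as a stand-in database; the table, column names, and data are hypothetical, but the SQL shape is the same one you'd run against MySQL.

```python
import sqlite3

# Hypothetical customers table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(i, f"customer{i}") for i in range(1, 1001)],
)

def next_page(last_id, page_size=10):
    # Seek past the last id we saw instead of using OFFSET, so the
    # server never reads and then discards the earlier rows.
    return conn.execute(
        "SELECT id, name FROM customers WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()

# The "100th page" from the article's example: everything after id 990.
page = next_page(990)
```

The application just carries the last id of each page forward into the next request, instead of a page number.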
Paging queries can be slow in SQL because they often involve the OFFSET keyword, which tells the server you only want a subset of the results. However, the server typically scans, collects, and then discards the skipped rows first. With a deferred join, or by maintaining a place or position column, you can avoid this work and speed up your database dramatically.
2. Try using a Deferred Join
This is an interesting trick. Suppose you have pages of customers. Each page displays ten customers. The query will use LIMIT to get ten records, and OFFSET to skip all the previous page results. When you get to the 100th page, it’s doing LIMIT 10 OFFSET 990. So the server has to go and read all those records, then discard them.
SELECT id, name, address, phone FROM customers ORDER BY name LIMIT 10 OFFSET 990;
MySQL first scans an index, then retrieves the matching rows from the table by primary key id. So it's doing double lookups and so forth. It turns out you can make this faster with a trick called a deferred join.
The inner piece selects only the primary key, so it can be satisfied entirely from the index. An EXPLAIN plan shows us "Using index", which we love!
SELECT id FROM customers ORDER BY name LIMIT 10 OFFSET 990;
Now combine this using an INNER JOIN to get the ten rows and data you want:
SELECT id, name, address, phone FROM customers INNER JOIN (SELECT id FROM customers ORDER BY name LIMIT 10 OFFSET 990) AS my_results USING (id) ORDER BY name;
That’s pretty cool!
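Here is the same deferred join as a runnable sketch, again using sqlite3 in place of MySQL (the query shape carries over); the table, index, and data are illustrative stand-ins.

```python
import sqlite3

# Hypothetical customers table with an index on name, the paging sort key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, "
    "address TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(i, f"name{i:04d}", f"addr{i}", f"555-{i:04d}") for i in range(1, 1001)],
)
conn.execute("CREATE INDEX idx_name ON customers (name)")

# Deferred join: the subquery pages through the narrow index to find the
# ten winning ids, then the outer join fetches full rows for just those ten.
rows = conn.execute(
    """
    SELECT c.id, c.name, c.address, c.phone
    FROM customers AS c
    INNER JOIN (
        SELECT id FROM customers ORDER BY name LIMIT 10 OFFSET 990
    ) AS my_results USING (id)
    ORDER BY c.name
    """
).fetchall()
```

The heavy wide-row lookups happen only for the ten rows that survive the subquery, not for the 990 that get discarded.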
3. Maintain a Page or Place column
Another way to keep the server from retrieving rows it doesn't need is to maintain a column for the page, place, or position. Yes, you need to update that column whenever you (a) INSERT a row, (b) DELETE a row, or (c) move a row with an UPDATE. This can get messy with a page column, but a straight place or position column is easier to maintain.
SELECT id, name, address, phone FROM customers WHERE page = 100 ORDER BY name;
Or, with a place column, something like this:
SELECT id, name, address, phone FROM customers WHERE place BETWEEN 990 AND 999 ORDER BY name;
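To make the maintenance cost concrete, here is a sketch of a place column kept up to date on DELETE, again with sqlite3 and an illustrative schema; the renumbering logic is an assumption about how you might implement it, not a fixed recipe.

```python
import sqlite3

# Hypothetical customers table where place holds each row's position
# in the paging order, backed by an index for fast range scans.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, place INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(i, f"name{i:04d}", i) for i in range(1, 1001)],
)
conn.execute("CREATE INDEX idx_place ON customers (place)")

def delete_customer(cid):
    # The cost of this scheme: every delete (and insert or move) must
    # renumber the rows that follow, ideally inside one transaction.
    with conn:
        (place,) = conn.execute(
            "SELECT place FROM customers WHERE id = ?", (cid,)
        ).fetchone()
        conn.execute("DELETE FROM customers WHERE id = ?", (cid,))
        conn.execute(
            "UPDATE customers SET place = place - 1 WHERE place > ?", (place,)
        )

delete_customer(5)

# Fetching the 100th page is now a simple indexed range scan.
page_100 = conn.execute(
    "SELECT id, name FROM customers WHERE place BETWEEN 990 AND 999 "
    "ORDER BY place"
).fetchall()
```

Reads get very cheap; the trade-off is extra write work on every change, so this fits read-heavy tables best.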
Published at DZone with permission of Sean Hull. See the original article here.