Over a million developers have joined DZone.
Silver Partner

Common Application Problems and How to Fix Them: The Select N + 1 Problem

· Performance Zone

The Performance Zone is brought to you in partnership with Smartbear. Learn more about how the performance of your API's, web, and mobile applications have a direct impact on your bottom line with the Guide to Synthetic Monitoring.

At AppDynamics, we get the opportunity to see the inner workings of a lot of applications. While these applications all seem pretty different to their end users, what’s under the hood usually doesn’t vary that much (sorry, your app isn’t a unique snowflake after all). They all have similar service-oriented architectures using a variety of databases, caches, and queues. This holds true for the performance issues these apps experience and the anti-patterns that cause them. The same problems show up in high-speed trading applications, e-commerce sites, mobile apps and online games – so we thought we’d put together some of the most common performance problems in a blog series to show you how to find, fix and prevent them. In this blog post we’ll take a look at a pretty common problem that can be tricky to detect in a large application: the Select N + 1 problem.

What is it?

The N + 1 problem is a performance anti-pattern in which an application makes N + 1 database calls (where N is the number of objects fetched). Like most anti-patterns, this isn’t necessarily a problem in itself, but under certain circumstances (where N is large, for example) it will cause performance to degrade by making hundreds or even thousands of database calls for a single business transaction.

In plain English: You’re spamming your database with really small, fast queries instead of using one or two more complex ones.

Here’s how it usually goes down: You have two database tables with a parent/child relationship (like blogs and posts, or products and line items), and you want to iterate through all of them. So you do this:


and then execute a query for each record:

SELECT * FROM Child WHERE parent_id = ?

This isn’t necessarily a bad way of doing it, especially if there aren’t many parents or children. But what if you’re a giant e-commerce company with thousands of products and line items? Suddenly each transaction is calling the database thousands of times. Even if each database call is super fast, the cumulative response time of that transaction will be seconds (if not longer), which is not an ideal situation. The problem only becomes worse with an increase in traffic.

How to Find It with AppDynamics

Like I said, this isn’t always a problem, but when it is it can be hard to find. The database will probably look normal to the DBA – they won’t see any long-running queries, and the CPU may look fine under normal load. The best way to find a problem like this is to use a performance monitoring solution that allows you to drill into a particular business transaction (user request) and see what code is executed and how it is accessing the database.

Here’s an example from an AppDynamics customer:

The best part about AppDynamics is its request snapshots. They not only allow us to troubleshoot performance problems faster – they also allow us to perfect this code. “AppDynamics allows us to see what is going on and identify the issues that could be refactored and made faster,” he said. “It lets us bridge the gap between anecdotes from users and actual, actionable information.” – Cornell University

How to Fix It

This problem happens most often when you’re using a persistence engine or an object/relational mapper (ORM) like Hibernate and you’re using lazy loading. Be sure to understand the defaults for any object-relational mapper before you begin using them.

If you’re just writing raw SQL, you may want to fetch all your data at once and then join the two sets of records, like this:

SELECT p.id, p.name FROM Parent p
LEFT OUTER JOIN child c ON p.id = c.parent_id
INNER JOIN grandchild g ON c.id = g.parent_id

Optimizing Database Access

Many times, how an application accesses the database will be a focus of optimization. Stay tuned for our next blog post in the series, where we will discuss the importance of caching database access. If you enjoyed this blog post, check out our e-book on Java performance problems.

Find out more about AppDynamics Pro and get started optimizing your application with a free 15 day trial.

As always, please feel free to comment if you think I have missed something or if you have a request for content in an upcoming post.

Thoughts? Let us know on Twitter @AppDynamics!

The Performance Zone is brought to you in partnership with Smartbear. Discover how important it is to recognize the inherent power of APIs and bulding the appropriate safeguards to protect it with the Essential Guide to API Monitoring.


Published at DZone with permission of Dustin Whittle , DZone MVB .

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}