Platinum Partner
news,architects,nosql,architecture,mongodb,application performance management

How I Identified a MongoDB Performance Anti Pattern in 5 Minutes

The other day I was looking at a web application that was using MongoDB as its central database. We were analyzing the application for potential performance problems and inside 5 minutes I detected what I must consider to be a MongoDB anti pattern and had a 40% impact on response time. The funny thing: It was a Java best practice that triggered it!

Analyzing the Application

The first thing I always do is look at the topology of an application to get a feel for it.

Overall Transaction Flow of the Application

Overall Transaction Flow of the Application

As we see it’s a modestly complex web application and it’s using MongoDB as its datastore. Overall MongoDB contributes about 7% to the response time of the application. I noticed that about half of all transactions are actually calling MongoDB so I took a closer look.



Flow of Transactions that access MongoDB, showing 10% response time contribution of MongoDB

Those transactions that actually do call MongoDB spend about 10% of their response time in that popular document database. As a next step I wanted to know what was being executed against MongoDB.


Overview of all MongoDB commands. This shows that the JourneyCollection find and getCount contribute the most to response time

One immediately notices the first two lines, which contribute much more to the response time per transaction than all the others. What was interesting was that thegetCount on the JourneyCollection had the highest contribution time, but the developer responsible was not aware that he was even using it anywhere.

Things get interesting – the mysterious getCount call

Taking things one level deeper, we looked at all transactions that were executing the ominous getCount on the JourneyCollection.


Transactions that call JourneyCollection.getCount spend nearly half their time in MongoDB

What jumps out is that those particular transactions spend indeed over 40% of their time in MongoDB, so there was a big potential for improvement here. Another click and we looked at all MongoDB calls that were executed within the context of the same transaction as the getCount call we found so mysterious.


All MongoDB Statements that run within the same transaction context as the JourneyCollection.getCount

What struck us as interesting was that the number of executions per transaction of thefind and getCount on the JourneyCollection seemed closely connected. At this point we decided to look at the transactions themselves – we needed to understand why that particular MongoDB call was executed.


Single Transactions that execute the ominous getCount call

It’s immediately clear that several different transaction types are executing that particular getCount. What that meant for us is that the problem was likely in the core framework of that particular application rather than being specific to any one user action. Here is the interesting snippet:


The Transaction Trace shows where the getCount is executed exactly

We see that the WebService findJourneys spends all its time in the two MongoDB calls. The first is the actual find call to the Journey Collection. The MongoDB client is good at lazy loading, so the find does not actually do much yet. It only calls the server once we access the result set. We can see the round trip to MongoDB visualized in thecall node at the end.

We also see the offending getCount. We see that it is executed by a method calledsize which turns out to be com.mongodb.DBCursor.size method. This was news to our developer. Looking at several other transactions we found that this was a common pattern. Every time we search for something in the JourneyCollection thegetCount would be executed by com.mongodb.DBCursor.size. This always happens before we would really execute the send the find  command to the server(which happens in the call method). So we used CompuwareAPM DTM’s (a.k.a dynaTrace) developer integration and took a look at the offending code. Here is what we found:

BasicDBObject fields = new BasicDBObject(); fields.put(journeyStr + "." + MongoConstants.ID, 1); fields.put(MongoConstants.ID, 0); Collection locations = find(patternQuery, fields);
ArrayList results = new ArrayList(locations.size());
for (DBObject dbObject : locations) {
  String loc = dbObject.getString(journeyStr);
  results.add(loc);
}
return results;

The code looks harmless enough; we execute a find, create an array for the result and fill it. The offender is the location.size(). MongoDBs DBCursor is similar to theResultSet in JDBC, it does not return the whole data set at once, but only a subset. As a consequence it doesn’t really know how many elements the find will end up with. The only way for MongoDB to determine the final size seems to be to execute agetCount with the same criteria as the original find. In our case that additional unnecessary roundtrip made up 40% of the web services response time!

An Anti-Patter triggered by a Best Practice

So it turns out that calling size on the DBCursor must be considered an anti-pattern! The real funny thing is that the developer thought he was writing performant code. He was following the best practice to pre-size arrays. This avoids any unnecessary re-sizing ;-) . In this particular case however, that minor theoretical performance improvement led to a 40% performance degradation!

Conclusion

The take away here is not that MongoDB is bad or doesn’t perform. In fact the customer is rather happy with it. But mistakes happen and similar to other database applications we need to have the visibility into a running application to see how much it contributes to the overall response time. We also need to have that visibility to understand which statements are called where and why.

In addition this also demonstrates nicely why premature micro optimization, without leveraging an APM solution, in production will not lead to better performance. In some cases – like this one – it can actually lead to worse performance!

{{ tag }}, {{tag}},

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}
{{ parent.authors[0].realName || parent.author}}

{{ parent.authors[0].tagline || parent.tagline }}

{{ parent.views }} ViewsClicks
Tweet

{{parent.nComments}}