DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
  1. DZone
  2. Data Engineering
  3. Databases
  4. How I Identified a MongoDB Performance Anti Pattern in 5 Minutes

How I Identified a MongoDB Performance Anti Pattern in 5 Minutes

Michael Kopp user avatar by
Michael Kopp
·
Feb. 06, 13 · Interview
Like (0)
Save
Tweet
Share
9.23K Views

Join the DZone community and get the full member experience.

Join For Free

The other day I was looking at a web application that was using MongoDB as its central database. We were analyzing the application for potential performance problems and inside 5 minutes I detected what I must consider to be a MongoDB anti pattern and had a 40% impact on response time. The funny thing: It was a Java best practice that triggered it!

Analyzing the Application

The first thing I always do is look at the topology of an application to get a feel for it.

Overall Transaction Flow of the Application

Overall Transaction Flow of the Application

As we see it’s a modestly complex web application and it’s using MongoDB as its datastore. Overall MongoDB contributes about 7% to the response time of the application. I noticed that about half of all transactions are actually calling MongoDB so I took a closer look.



Flow of Transactions that access MongoDB, showing 10% response time contribution of MongoDB

Those transactions that actually do call MongoDB spend about 10% of their response time in that popular document database. As a next step I wanted to know what was being executed against MongoDB.


Overview of all MongoDB commands. This shows that the JourneyCollection find and getCount contribute the most to response time

One immediately notices the first two lines, which contribute much more to the response time per transaction than all the others. What was interesting was that thegetCount on the JourneyCollection had the highest contribution time, but the developer responsible was not aware that he was even using it anywhere.

Things get interesting – the mysterious getCount call

Taking things one level deeper, we looked at all transactions that were executing the ominous getCount on the JourneyCollection.


Transactions that call JourneyCollection.getCount spend nearly half their time in MongoDB

What jumps out is that those particular transactions spend indeed over 40% of their time in MongoDB, so there was a big potential for improvement here. Another click and we looked at all MongoDB calls that were executed within the context of the same transaction as the getCount call we found so mysterious.


All MongoDB Statements that run within the same transaction context as the JourneyCollection.getCount

What struck us as interesting was that the number of executions per transaction of thefind and getCount on the JourneyCollection seemed closely connected. At this point we decided to look at the transactions themselves – we needed to understand why that particular MongoDB call was executed.


Single Transactions that execute the ominous getCount call

It’s immediately clear that several different transaction types are executing that particular getCount. What that meant for us is that the problem was likely in the core framework of that particular application rather than being specific to any one user action. Here is the interesting snippet:


The Transaction Trace shows where the getCount is executed exactly

We see that the WebService findJourneys spends all its time in the two MongoDB calls. The first is the actual find call to the Journey Collection. The MongoDB client is good at lazy loading, so the find does not actually do much yet. It only calls the server once we access the result set. We can see the round trip to MongoDB visualized in thecall node at the end.

We also see the offending getCount. We see that it is executed by a method calledsize which turns out to be com.mongodb.DBCursor.size method. This was news to our developer. Looking at several other transactions we found that this was a common pattern. Every time we search for something in the JourneyCollection thegetCount would be executed by com.mongodb.DBCursor.size. This always happens before we would really execute the send the find  command to the server(which happens in the call method). So we used CompuwareAPM DTM’s (a.k.a dynaTrace) developer integration and took a look at the offending code. Here is what we found:

BasicDBObject fields = new BasicDBObject(); fields.put(journeyStr + "." + MongoConstants.ID, 1); fields.put(MongoConstants.ID, 0); Collection locations = find(patternQuery, fields);
ArrayList results = new ArrayList(locations.size());
for (DBObject dbObject : locations) {
  String loc = dbObject.getString(journeyStr);
  results.add(loc);
}
return results;

The code looks harmless enough; we execute a find, create an array for the result and fill it. The offender is the location.size(). MongoDBs DBCursor is similar to theResultSet in JDBC, it does not return the whole data set at once, but only a subset. As a consequence it doesn’t really know how many elements the find will end up with. The only way for MongoDB to determine the final size seems to be to execute agetCount with the same criteria as the original find. In our case that additional unnecessary roundtrip made up 40% of the web services response time!

An Anti-Patter triggered by a Best Practice

So it turns out that calling size on the DBCursor must be considered an anti-pattern! The real funny thing is that the developer thought he was writing performant code. He was following the best practice to pre-size arrays. This avoids any unnecessary re-sizing ;-) . In this particular case however, that minor theoretical performance improvement led to a 40% performance degradation!

Conclusion

The take away here is not that MongoDB is bad or doesn’t perform. In fact the customer is rather happy with it. But mistakes happen and similar to other database applications we need to have the visibility into a running application to see how much it contributes to the overall response time. We also need to have that visibility to understand which statements are called where and why.

In addition this also demonstrates nicely why premature micro optimization, without leveraging an APM solution, in production will not lead to better performance. In some cases – like this one – it can actually lead to worse performance!

MongoDB Anti-pattern application Best practice

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Top Authentication Trends to Watch Out for in 2023
  • When AI Strengthens Good Old Chatbots: A Brief History of Conversational AI
  • Handling Automatic ID Generation in PostgreSQL With Node.js and Sequelize
  • Artificial Intelligence in Drug Discovery

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: