MongoDB Troubleshooting: My Top 5

From election monitoring to quick and easy greps, here are one DBA's five go-to tips for MongoDB troubleshooting.

Every DBA has a war chest of their go-to solutions for any support issues they run into for a specific technology. MongoDB is no different. Even if you have picked it because it’s a good fit and it runs well for you, things will change. When things change — sometimes there is a new version of your application, or a new version of the database itself — you need to have a solid starting place.

To help new DBAs, I like to point out my top five things that cover the bulk of requests a DBA might need to work on.

1. Common Greps to Use

This tip covers a few ways to pare down the error log and make it more manageable. The error log holds a slew of information and, without grep, it can be challenging to correlate events.

Is an Index Being Built?

As a DBA, you will often get a call saying the database has “stopped.” The developer might say, “I didn’t change anything.” Looking at the error log is a great first port of call. A grep for index activity shows whether all index builds have finished, whether a new index was started and is still building, or whether an index was removed. That covers all of the cases in question.
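
A minimal sketch of such a grep, assuming the plain-text log format used before MongoDB 4.4. The function name `index_events` and the log path are hypothetical, not part of MongoDB.

```shell
# Hypothetical helper: print every log line that mentions an index,
# case-insensitively, so build starts, completions, and drops all show up.
# $1 is the path to the mongod log (an assumption; use your own path).
index_events() {
  grep -i 'index' "$1"
}
```

You might call it as `index_events /var/log/mongodb/mongod.log`, then narrow the pattern once you know which kind of event you are chasing.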

What’s Happening Right Now?

As with the index example, this helps you filter out the many messages you might not care about, or those you want to set aside for now. MongoDB does have some useful sub-component tags in the logs, such as “ReplicationExecutor” and “connXXX”, but I find it more helpful to remove the noisy lines than to filter by log facility type. I deliberately do not add “| grep -v connection” at first: typically, I look at the log with connections included to see whether they are acting funny, and then filter them out to see the core of what is happening. If you only want to see the long queries and commands, filter out “connection” instead of “ms” to make them easier to find.
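
The two views described above could be sketched as follows. This assumes the pre-4.4 plain-text log format, in which slow operations end with a duration such as `250ms`. Both function names are hypothetical.

```shell
# Hypothetical helpers; $1 is the path to the mongod log.

# Hide lines that end in a duration such as "250ms" (slow operations),
# leaving connection and replication messages visible.
hide_slow_ops() {
  grep -vE '[0-9]+ms$' "$1"
}

# The reverse view: hide connection chatter so slow operations stand out.
hide_connections() {
  grep -v 'connection' "$1"
}
```

To follow a live log rather than read a file, the same patterns work behind `tail -f`.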

2. Did Any Elections Happen? Why Did They Happen?

While this isn’t the most common command to run, it is very helpful if you aren’t using Percona Monitoring and Management (PMM) to track the historical frequency of elections. Here, we want up to 20 lines before and after the word “SECONDARY”, which typically marks when a step-down or election takes place. You can then see whether, around that time, a command was issued, a network error occurred, a heartbeat failed, or something similar happened.
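
A minimal sketch of that context search, assuming a grep that supports the `-B` and `-A` context options (GNU and BSD grep both do). The function name is hypothetical.

```shell
# Hypothetical helper: show up to 20 lines either side of each "SECONDARY".
# $1 is the path to the mongod log.
election_context() {
  grep -B 20 -A 20 'SECONDARY' "$1"
}
```

Where state changes cluster together, grep merges the overlapping context, so one flurry of elections reads as a single block.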

3. Is Replication Lagged? Do I Have Enough Oplog?

Before checking lag, always write a single test document to the primary so that replication has a recent write to measure against.
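
A minimal sketch of such a test write, assuming the legacy `mongo` shell is on the PATH. The function name, the `repl_test` database, and the `heartbeat` collection are all made-up names for illustration.

```shell
# Hypothetical helper: insert one timestamped document on the primary.
# "repl_test" and "heartbeat" are made-up names; $1 is the primary's host:port.
write_test_doc() {
  mongo --quiet --host "$1" --eval \
    'db.getSiblingDB("repl_test").heartbeat.insertOne({ ts: new Date() })'
}
```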

Checking Lag Information
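
A minimal sketch for checking lag, assuming the legacy `mongo` shell. The function name is hypothetical. `rs.printSlaveReplicationInfo()` is the helper from the legacy shell; newer releases rename it to `rs.printSecondaryReplicationInfo()`.

```shell
# Hypothetical helper: print how far each secondary is behind the primary.
# $1 is the host:port of any replica set member.
show_repl_lag() {
  mongo --quiet --host "$1" --eval 'rs.printSlaveReplicationInfo()'
}
```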

Oplog Size and Range
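
A minimal sketch for checking the oplog, again assuming the legacy `mongo` shell and a hypothetical function name. `rs.printReplicationInfo()` reports the configured oplog size and the time range between its first and last entries.

```shell
# Hypothetical helper: print the oplog size and the time window it covers.
# $1 is the host:port of the member whose oplog you want to inspect.
show_oplog_window() {
  mongo --quiet --host "$1" --eval 'rs.printReplicationInfo()'
}
```

If the window is shorter than the longest outage you expect a secondary to survive, the oplog is probably too small.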

4. Taming the Profiler

The MongoDB profiler records a great deal of data. I have highlighted some key fields to know:

Metric: Description
filter: The formulated query that was run. Right above it you can find the parsed query. These should be the same. It’s useful to know what the engine was sent in the end.
nReturned: The number of documents returned via the cursor to the client running the query or command.
executionTimeMillis: This used to be called just “ms”. It is how long the operation took. Typically you would treat this like a slow query in any database.
total(Keys|Docs)Examined: Unlike nReturned, this is how many index keys or documents had to be examined. Not all indexes have perfect coverage, and sometimes you scan many documents to find no results.
stage: While poorly named, this tells you whether a collection scan (table scan) or an index was used to answer a given operation. In the case of an index, it will say the name.
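
A minimal sketch for pulling recent slow operations out of the profiler. It assumes the legacy `mongo` shell and that profiling is already enabled on the database, for example with `db.setProfilingLevel(1, 100)`. The function name and the 100 ms threshold are illustrative choices. `millis` and `ts` are the duration and timestamp fields of documents in `system.profile`.

```shell
# Hypothetical helper: print the five most recent profiled operations that
# took longer than 100 ms. $1 is host:port, $2 is the database name.
slow_profile_entries() {
  mongo --quiet "$1/$2" --eval '
    db.system.profile
      .find({ millis: { $gt: 100 } })
      .sort({ ts: -1 })
      .limit(5)
      .forEach(printjson)'
}
```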

5. CurrentOp and killOp Explained

When using db.currentOp() to see what is running, I frequently use db.currentOp(true) so that I can see everything and not just a limited set of operations. This makes the currentOp function look and act much more like SELECT * FROM information_schema.processlist in MySQL. One significant difference that commonly catches new DBAs off guard is how killing an operation differs between MySQL and MongoDB. MongoDB does have a handy db.killOp(<op_id>) function, but unlike MySQL, which immediately kills the thread running the process, MongoDB does not stop the operation right away.

When you run killOp(), MongoDB appends “killed: true” into the document structure. When the next yield occurs (if it occurs), it will tell the operation to quit. This is also how a shutdown works: If it seems like it’s not shutting down, it might be waiting for an operation to yield and notice the shutdown request.
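
The two calls above could be wrapped as follows. This is a sketch that assumes the legacy `mongo` shell and a numeric opid, as seen on a replica set member. Both function names are hypothetical.

```shell
# Hypothetical helpers; $1 is the host:port of the mongod to inspect.

# List every operation, idle ones included, as "opid secs_running op ns".
list_ops() {
  mongo --quiet --host "$1" --eval '
    db.currentOp(true).inprog.forEach(function (o) {
      print(o.opid, o.secs_running, o.op, o.ns);
    })'
}

# Flag operation $2 as killed; it stops at its next yield, not instantly.
kill_op() {
  mongo --quiet --host "$1" --eval "db.killOp($2)"
}
```

After `kill_op`, running `list_ops` again is a reasonable way to confirm whether the operation has reached a yield point and gone away.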

I’m not arguing that this is bad or good, just that it is different from MySQL and something of which you should be aware. One thing to note, however, is that MongoDB has great built-in HA. Sometimes, it is better to cause an election and let the drivers gracefully handle things, rather than running the killOp command (unless it’s a write, in which case you should always try to use killOp).

Conclusion

I hope you have found some of this insightful. Look for future posts from the MongoDB team on other areas of MongoDB, and other parts of the system, that we like to look at to help ourselves and our clients get to the root of an issue.

Published at DZone with permission of David Murphy, DZone MVB. See the original article here.
