How to Stop a Runaway Index Build in MongoDB

Oh no! You've triggered your index build on a large collection and now your cluster is unresponsive. No matter what you do, you can't stop it. What to do?!

Sep. 22, 17 · Tutorial

Index builds in MongoDB can have an adverse impact on the availability of your MongoDB cluster. If you trigger a foreground index build on a large collection on your production server, you may find that your cluster is unresponsive until the index build is complete. On a large collection, this could take several hours or days, as described in the perils of index building in MongoDB.

The recommended best practice is to trigger index builds in the background. However, on large collections, we've seen multiple problems with this approach. In a three-node cluster, both secondaries start building the index and stop responding to requests. Consequently, the primary loses quorum and steps down to the secondary state, taking your cluster down. Also, index builds triggered from the command line are foreground builds by default, which makes this a widespread problem. We're hopeful that future releases will make background builds the default.

Once you've triggered an index build, simply restarting the server does not solve the problem. MongoDB will pick up the index build from where it left off. If you were previously running a background index build, it becomes a foreground index build after the restart, so in this case the restart could make the problem worse.

If you've already triggered an index build, how do you stop it? Luckily, it's relatively easy to stop an index build.

Option 1: Kill the Index Build Process

Locate the index build process using db.currentOp() and then kill the operation using db.killOp(<opid>). The index operation will look something like this:

{
    "opid" : 820659355,
    "active" : true,
    "lockType" : "write",
    ....
    "op" : "insert",
    "ns" : "xxxx",
    "query" : {
    },
    "client" : "xxxx",
    "desc" : "conn",
    "msg" : "index: (2/3) btree bottom up 292168587/398486401 64%"
}

If the node where the index is building does not respond to new connections or the killOp does not work, use Option 2 below.
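As a rough sketch of Option 1, the helper below picks the index build operations out of a db.currentOp() result so they can be passed to db.killOp(). The function name findIndexBuildOps is hypothetical, and the filter assumes that index builds report a "msg" beginning with "index", as in the sample output above; other MongoDB versions may word this field differently, so inspect the output before killing anything.

```javascript
// Return the in-progress operations that look like index builds.
// `currentOpResult` is the document returned by db.currentOp(),
// which has the shape { inprog: [ ...operations... ] }.
function findIndexBuildOps(currentOpResult) {
  var ops = (currentOpResult && currentOpResult.inprog) || [];
  return ops.filter(function (op) {
    // Assumption: index builds report progress in `msg` starting with
    // "index", as in the sample output above. Adjust for your version.
    return typeof op.msg === "string" && /^index/i.test(op.msg);
  });
}

// In the mongo shell (where `db` is defined), kill each matching operation:
//   findIndexBuildOps(db.currentOp()).forEach(function (op) {
//     print("killing opid " + op.opid + ": " + op.msg);
//     db.killOp(op.opid);
//   });
```

Keeping the filter separate from the kill step lets you print and review the matching operations first, which is safer on a production node.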

Option 2: Configure noIndexBuildRetry and Restart

MongoDB provides a --noIndexBuildRetry option, which instructs MongoDB not to resume building incomplete indexes on restart.

This parameter doesn't appear to be supported by the config file, only as a command-line parameter for the mongod process. We prefer not to run mongod manually with this option because, if you accidentally run the mongod process as an elevated user (e.g., root), it ends up changing the permissions of all the files. Also, after running it as root, we've had intermittent problems running the process as the mongod user again.

A simpler option is to edit the /etc/init.d/mongod file. Look for this line:

OPTIONS=" -f $CONFIGFILE"

Replace with this line:

OPTIONS=" -f $CONFIGFILE --noIndexBuildRetry"
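If you need to make this edit on several nodes, it can be scripted. The following is a minimal sketch for Node.js, not part of MongoDB itself; the function name setNoIndexBuildRetry is hypothetical, and it assumes the init script contains a single-line OPTIONS="..." assignment like the one shown above. It adds or removes the flag without duplicating it, so it is safe to run more than once.

```javascript
var FLAG = "--noIndexBuildRetry";

// Add (enable = true) or remove (enable = false) the flag on every
// line of the form OPTIONS="..." in the given init script text.
function setNoIndexBuildRetry(scriptText, enable) {
  return scriptText
    .split("\n")
    .map(function (line) {
      var match = /^(\s*OPTIONS=")(.*)("\s*)$/.exec(line);
      if (!match) {
        return line; // not the OPTIONS line; leave it untouched
      }
      // Drop any existing copy of the flag so the edit is idempotent.
      var args = match[2].split(/\s+/).filter(function (arg) {
        return arg !== "" && arg !== FLAG;
      });
      if (enable) {
        args.push(FLAG);
      }
      return match[1] + " " + args.join(" ") + match[3];
    })
    .join("\n");
}

// Example use with Node's fs module (run as a user allowed to edit the file):
//   var fs = require("fs");
//   var path = "/etc/init.d/mongod";
//   var updated = setNoIndexBuildRetry(fs.readFileSync(path, "utf8"), true);
//   fs.writeFileSync(path, updated);
```

Calling the same function with false as the second argument restores the original line once the cleanup is finished.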

Detailed Steps

For the purposes of this discussion, we're providing instructions for CentOS/RedHat/Amazon Linux.

1. Configure --noIndexBuildRetry. Add the --noIndexBuildRetry option to all your data nodes, as explained above.
2. Restart all the nodes building the index. Look at the mongod log file for each data server and determine whether it's building the index. If it is, restart the server using service mongod restart.
3. Drop the incomplete index. Once all of the relevant nodes are restarted, look at the list of indexes and drop the incomplete index if you see it on the list.
4. Remove --noIndexBuildRetry. Edit the /etc/init.d/mongod file to remove the --noIndexBuildRetry option that you added in Step 1, so that you revert to the default behavior of resuming the index build.
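The "drop the incomplete index" step can be sketched in the mongo shell as follows. The helper findIndexName is hypothetical, as are the collection name orders and the key { customerId: 1 } in the usage comment; substitute the collection and key pattern of the index you were building. It only looks up the index name from the output of getIndexes(), and the actual drop uses the standard dropIndex() call.

```javascript
// Return the name of the index whose key pattern matches `keyPattern`,
// or null if no such index is listed.
// `indexes` is the array returned by db.<collection>.getIndexes().
function findIndexName(indexes, keyPattern) {
  var wanted = JSON.stringify(keyPattern);
  for (var i = 0; i < indexes.length; i++) {
    // Field order matters in an index key, so comparing the serialized
    // form is good enough for this sketch.
    if (JSON.stringify(indexes[i].key) === wanted) {
      return indexes[i].name;
    }
  }
  return null;
}

// In the mongo shell, with a hypothetical collection and key pattern:
//   var name = findIndexName(db.orders.getIndexes(), { customerId: 1 });
//   if (name !== null) {
//     db.orders.dropIndex(name);
//   }
```

If the lookup returns null, the incomplete index is not on the list and there is nothing to drop on that node.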

Published at DZone with permission of Dharshan Rangegowda, DZone MVB. See the original article here.
