Using MongoDB $IndexStats to Identify (and Remove) Unused Indexes
Unused indexes can wreak havoc on your database. Fortunately, MongoDB has come up with a way to find them so they can be dealt with.
Join the DZone community and get the full member experience.
Join For FreeNote: Available for Dedicated plans on mLab*
Proper indexing is critical to database performance. A single unindexed query is enough to cause significant performance degradation.
It is relatively easy to spot a missing index using mLab’s Slow Query Analyzer, but the tool doesn’t provide an obvious way to identify and remove indexes that aren’t actually being used.
Because unused indexes impact write performance and consume valuable resources, periodic index review and maintenance is recommended. MongoDB 3.2 introduced a new feature to help identify unused indexes:
* This feature will also be available on our Sandbox and Shared plans once the fix for SERVER-26734 is available for supported versions.
The $indexStats Operator
The $indexStats aggregation pipeline operator displays a running tally of index usage since the server was last started. The operator reports on the usage for all of the indexes within a specified collection, and (with additional aggregation operators) can be targeted at a particular index. Knowing if and how often indexes are being used allows an administrator to make informed decisions on which indexes are providing benefit.
How to Analyze Index Usage With $indexStats
It’s not necessarily obvious which collections might contain unused indexes. To obtain a comprehensive list of all index usage, you’ll need to run $indexStats on each collection.
Until SERVER-26734 is resolved, you will need to connect with an admin database user to run $indexStats.
Connecting With an Admin Database User
You will need to create an admin database user if you don’t already have one.
To run the $indexStats command, you will need to connect to your database with an admin database user. The following command will use your “admin” database user credentials to authenticate to the “admin” database, then connect to the target database (“myDatabase”):
> mongo ds123456-a0.mlab.com:12345/myDatabase -u <adminUser> -p <adminUserPassword> --authenticationDatabase admin
Running $indexStats on an Entire Collection
Now that you are connected to the target database, the $indexStats command can be run using the MongoDB shell:
> db.myColl.aggregate( { $indexStats: { } } )
{
"name" : "color_1",
"key" : {
"color" : 1
},
"host" : "examplehost.local:27017",
"accesses" : {
"ops" : NumberLong(50),
"since" : ISODate("2017-01-06T16:52:50.744Z")
}
}
{
"name" : "type_1",
"key" : {
"type" : 1,
},
"host" : "examplehost.local:27017",
"accesses" : {
"ops" : NumberLong(0),
"since" : ISODate("2017-01-06T16:52:48.362Z")
}
}
{
"name" : "name_1",
"key" : {
"name" : 1
},
"host" : "examplehost.local:27017",
"accesses" : {
"ops" : NumberLong(100),
"since" : ISODate("2017-01-06T16:32:44.609Z")
}
}
The return document includes the following fields:
Output Field | Description |
name | Index name |
key | Index key specification |
host | The hostname and port of the mongod process |
accesses.ops | The number of operations that used the index |
accesses.since | The time from which MongoDB started gathering the index usage statistics |
Source: https://docs.mongodb.com/manual/reference/operator/aggregation/indexStats/
Interpreting $indexStats
Every database query, update, or command that uses an index will be counted toward that index’s usage statistics.
- The “name”, “key”, and “host” output fields provide the metadata for each index.
- The “accesses.ops” value displays the number of operations that have used the index. Any indexes with zero access operations suggest a potentially unused index which can potentially be deleted.
- The “accesses.since” value is the point in time from which MongoDB began gathering the index statistics. This value is set either upon index creation or as of a mongod server restart.
Importantly, note that $indexStats shows index data as of the last database server process restart. Therefore, these running tallies are wiped out and restarted with each server restart. Keep in mind that a database server process restart can occur during maintenance events, plan changes, intentional deployment restarts, or as a part of unexpected failures requiring restart.
If you would like a fresh set of statistics you can choose to perform an intentional cluster restart. If you would prefer not to perform a cluster restart, you can use $indexStats to sample from two points in time and calculate the difference in access operations over time for each index.
Running $indexStats for a Particular Index
You can use the $match operator within the aggregation pipeline to specify a particular index. This allows you to match indexes based on index name or index key. You can run the following commands in the MongoDB shell:
Based on index name:
> db.myColl.aggregate([{$indexStats: {}}, {$match: {"name": "color_1"}}])
Based on index key:
> db.myColl.aggregate([{$indexStats: {}}, {$match: {"key": {"color": 1}}}])
The return document only displays the index statistics for the particular key.
{
"name" : "color_1",
"key" : {
"color" : 1
},
"host" : "examplehost.local:27017",
"accesses" : {
"ops" : NumberLong(50),
"since" : ISODate("2017-01-06T16:52:50.744Z")
}
}
Removing Unused Indexes
Proceed With Caution
As with all delete operations on the database, always err on the side of caution when removing an index.
- Do not drop an index if there is any uncertainty surrounding its use.
- Accidentally removing a necessary index can result in significant performance degradation.
- Closely monitor database performance immediately after making index changes.
In addition, here are some checks to perform before removing an unused index:
- Are there infrequent operations which require the index?
- Are there query patterns that are failing to use the index?
- Are there plans to use the index in the near future?
More information about indexing can be found in our documentation: http://docs.mlab.com/indexing/
Take a Backup First (Optional)
An additional precaution is to take a backup before dropping a series of unused indexes. We recommend a block storage snapshot (e.g., an EBS Snapshot) over a mongodump since this type of backup tends to be orders of magnitude faster to both take and restore.
Drop the Unused Index
After reviewing the considerations above, you can proceed with removing any unused indexes.
To drop the “color_1” index, perform the following command in the MongoDB shell:
> db.myColl.dropIndex( { "color": 1 } );
Thank you for reading! If you have questions on this exciting new feature or on MongoDB indexing/performance in general, please email our team at for help.
Published at DZone with permission of Nathan M. Park, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments