The aggregation framework is a vital cog in the mongodb infrastructure. It helps you analyze, summarize and aggregate the data stored in mongodb. Refer to this blog post for more details about the aggregation framework in MongoDB 2.6.
In the 2.6 release MongoDB made a subtle but significant change in the way the underlying aggregation pipelines execute in a sharded environment. When working with sharded collections MongoDB splits the pipeline into two stages. The first stage or the "$match" phase runs on each shard and selects the relevant documents. If the query planner determines that a shard is not relevant based on the shard keys then this phase is not executed on that shard.
The subsequent stages run only on the "primary" shard for the collection. This shard merges the data from the other shards and runs the rest of the pipeline. This results in considerable more load on the primary shard of the collection being aggregated. Here is an example from one of our customers running three shards and using primarily aggregation queries
As you see the load on the first shard is consistently 3-4 times the other reason. This is an extreme example since this in case the second and third shards were added later, hence the primary shard for all the collections is the first shard. So essentially the subsequent stages of all our aggregation jobs run only on Shard1. If you examine the logs on the primary shard you will see a number of "merge" commands retrieving data from the other shards.
Prior to 2.6 the subsequent stages of the aggregation pipeline used to run on your mongos servers and not on the primary shard.
So how do you handle this uneven load distribution? You have a couple of options 1. If you are running aggregations on multiple collections ensure that the "primary shards" of the collections are evenly spread across your shards. 2. If you have a high aggregation load on just one collection you might need to use slightly larger machines for your primary shard.
This article originally appeared on the mongodirector blog - https://scalegrid.io/blog/mongodb-shards-and-unbalanced-aggregation-loads/.