Mongo Aggregates and How to Explain Aggregate Queries

In this article, take a look at Mongo aggregates and see a tutorial on how to start an aggregate.

By Ranvir Singh · Updated Jan. 14, 20 · Tutorial

Today, we are going to talk about Mongo aggregates: one of the best things that happened to Mongo. Let’s start by pulling out a few differences between a traditional relational database and Mongo.

Mongo is one of the NoSQL databases that disrupted the internet a few years ago. Everyone in the industry was talking about them. Everyone wanted to move their stack to these flexible databases and was talking about how data needed to move in that direction.

As the hype began to settle, people started realizing that moving the stack helps only if it is implemented correctly, and for most of them, the shift wasn’t even necessary.

NoSQL is a broad term that covers a variety of database models. Mongo is one of them. Others include Cassandra and many more.

MongoDB is a document-based, distributed database. In production, people tend to run it as a replica set with three members: one primary and two secondaries. This provides redundancy and high data availability.

You can configure this behavior, but by default, reads and writes are handled by the primary, and each write is then replicated to the secondaries.

Because Mongo does not enforce a well-defined schema, it can be pretty hard to query the data.

Mongoose, a Node.js library, can help you with this. It provides a lot of benefits, such as making it easy to create hooks and indexes on collections.

By the way, the equivalent of a table in Mongo is known as a collection, and the equivalent of a row is known as a document.

Mongo Aggregates

MongoDB aggregates make it easier to query data from any collection. They involve things like matching, getting data from other collections, selecting fields, and much more.

Let’s discuss a few of these aggregate queries.

Starting out a Mongo Aggregate

You can start an aggregate using the following code:

db.collection.aggregate([aggregate pipeline commands], options)

Here, collection is the name of the collection on which the aggregate is applied, and db is the instance of the connected DB object.
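As a rough sketch of what the two arguments look like, the plain JavaScript below builds a pipeline array and an options object. The students collection and its field names are assumptions made for illustration; allowDiskUse is one of the options that aggregate accepts.

```javascript
// Hypothetical pipeline for a "students" collection (field names are assumptions).
const pipeline = [
  { $match: { class: 5 } },                   // stage 1: filter documents
  { $project: { student_name: 1, age: 1 } },  // stage 2: pick fields
];

// Options are passed as the second argument to aggregate().
const options = { allowDiskUse: true };

// In the mongo shell this would be run as:
//   db.students.aggregate(pipeline, options)

// Each stage is an object with exactly one key: the stage operator.
const stageNames = pipeline.map((stage) => Object.keys(stage)[0]);
console.log(stageNames);
```

The stages run in array order, so the output of one stage is the input of the next.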

The following are the commands that you can use in the aggregate pipeline.

Match in Mongo Aggregate

The $match stage is the equivalent of the WHERE clause in SQL. You tell the aggregate to return only the documents that satisfy the given condition. It is recommended to place the $match stage as early as possible in the pipeline.

This reduces the number of documents passed on to the rest of the pipeline. It is also a good idea to index the fields on which you run the match so that results are returned faster.

For example: In a student database, you can make this query as follows:

{ $match: { "roll_number": 901 }}

or

{ $match: { "class": 5 }}

This query will return all the documents that satisfy the given condition.
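To make the filtering behavior concrete, here is a minimal in-memory sketch in plain JavaScript. The matchStage helper is hypothetical and only emulates simple equality conditions; the real $match stage runs on the server and supports many more operators. The sample documents are made up.

```javascript
// Sample documents; names and values are made up for illustration.
const students = [
  { roll_number: 901, class: 5, student_name: "Asha", age: 10 },
  { roll_number: 902, class: 5, student_name: "Ravi", age: 11 },
  { roll_number: 903, class: 6, student_name: "Meena", age: 12 },
];

// Hypothetical helper: emulates { $match: condition } for equality checks only.
function matchStage(docs, condition) {
  return docs.filter((doc) =>
    Object.entries(condition).every(([field, value]) => doc[field] === value)
  );
}

const classFive = matchStage(students, { class: 5 });
console.log(classFive.map((doc) => doc.roll_number)); // [ 901, 902 ]
```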

Every further stage of the pipeline is added to the main pipeline array in the same way.

Note: Whenever you want to refer to the value of a field, prefix the name of the field with a $.

Use Match as Early as Possible

You should use $match as early as possible to make use of indexes.

$match operations use suitable indexes to scan only the matching documents in a collection. When possible, place $match stages at the beginning of the pipeline.

Always try to run explain on the $match part of your queries, and prefer to create compound indexes that fit the fields you filter on.
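The effect of stage order can be illustrated with a small in-memory simulation in plain JavaScript. This is only a sketch under assumptions: the match and project functions are hypothetical stand-ins for the real stages, the data is generated, and it says nothing about index usage, which happens on the server.

```javascript
// 1000 generated documents; 100 of them have class === 5.
const docs = Array.from({ length: 1000 }, (_, i) => ({ roll_number: i, class: i % 10 }));

let projected = 0; // counts how many documents the projection had to process

// Hypothetical stand-ins for the $match and $project stages.
const match = (input) => input.filter((d) => d.class === 5);
const project = (input) =>
  input.map((d) => {
    projected += 1;
    return { roll_number: d.roll_number };
  });

// Match first: the projection only sees the matching documents.
projected = 0;
const early = project(match(docs));
const workEarly = projected;

// Match last: the projection has to process every document.
projected = 0;
const late = project(docs).filter((d) => d.roll_number % 10 === 5);
const workLate = projected;

console.log(workEarly, workLate); // 100 1000
```

Both orders return the same 100 documents, but the second one pushes ten times as many documents through the later stage.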

Project in Mongo Aggregate

$project is the stage where you tell the query which fields to pick from each document.

{
    $project: {
        student_name: 1,
        student_age: '$age'
    }
}


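This will pick up only the fields listed in the stage: student_name is included as-is because its value is 1, and the value of age is returned under the new name student_age. It is important to reduce the data being transferred to the next part of the pipeline.

As the student_age field shows, $project can also be used to change the name of a field. This is the equivalent of SELECT in SQL.

The _id field is included in the result by default.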
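The sketch below emulates this projection on a single in-memory document in plain JavaScript. The projectStage helper is hypothetical and handles only the two forms used here (field: 1 to include, and field: "$other" to copy a value under a new name); the sample document is made up.

```javascript
// Made-up sample document.
const student = { _id: 1, student_name: "Asha", age: 10, class: 5 };

// Hypothetical helper: emulates a small subset of $project.
function projectStage(doc, spec) {
  // _id is kept unless excluded with _id: 0 (exclusion is not handled in this sketch).
  const out = { _id: doc._id };
  for (const [field, rule] of Object.entries(spec)) {
    if (rule === 1 && field in doc) {
      out[field] = doc[field]; // include the field as-is
    } else if (typeof rule === "string" && rule.startsWith("$")) {
      out[field] = doc[rule.slice(1)]; // copy another field's value under a new name
    }
  }
  return out;
}

const result = projectStage(student, { student_name: 1, student_age: "$age" });
console.log(result); // { _id: 1, student_name: 'Asha', student_age: 10 }
```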

Grouping in Mongo Aggregate

$group is used to group documents together by a key. For example, if you want to calculate the sum of the ages of the students in each class, the query will look something like this.

db.students.aggregate([
    { $group: { _id: "$class", total: { $sum: "$age" } } }
])


You can use other accumulators in the same way, such as $avg, $min, and $max.
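As an illustration of what grouping with a sum produces, the plain JavaScript sketch below computes the same totals on a made-up in-memory array. The groupSumByClass function is a hypothetical stand-in, not how the server implements $group.

```javascript
// Made-up sample documents.
const students = [
  { class: 5, age: 10 },
  { class: 5, age: 11 },
  { class: 6, age: 12 },
];

// Emulates { $group: { _id: "$class", total: { $sum: "$age" } } } in memory.
function groupSumByClass(docs) {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc.class, (totals.get(doc.class) || 0) + doc.age);
  }
  // One output document per distinct class, keyed by _id.
  return Array.from(totals, ([key, total]) => ({ _id: key, total }));
}

const grouped = groupSumByClass(students);
console.log(grouped); // [ { _id: 5, total: 21 }, { _id: 6, total: 12 } ]
```

Note that the real $group stage does not guarantee the order of its output documents; add a $sort stage if the order matters.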

Lookup Other Collections in Mongo Aggregate

$lookup is one of the most important aggregate stages you can use. It allows us to join different collections together and get data from the other collections as well.

$lookup can also be used with a pipeline of its own. With this type of lookup, you can apply conditions that decide which documents are picked from the other collection.

{
    $lookup: {
        from: "students",
        let: { studentId: "$_id", age: "$age" },
        pipeline: [
            { $match:
                { $expr:
                    { $eq: [ "$student_id", "$$studentId" ] }
                }
            },
            { $project: { student_id: 1, age: 1 } }
        ],
        as: "data"
    }
}


Use this form when you want to run $match on the data being picked from the other collection.

The simplest implementation of $lookup is as follows.

{
    $lookup: {
        from: <collection to join>,
        localField: <field from the input documents>,
        foreignField: <field from the documents of the "from" collection>,
        as: <output array field>
    }
}
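To show how the four parameters fit together, here is a filled-in stage and a minimal in-memory emulation of the join in plain JavaScript. The classes and students collections, their fields, and the lookup helper are all assumptions made for illustration.

```javascript
// Hypothetical collections: "classes" is the input, "students" is joined in.
const classes = [
  { _id: 5, teacher: "Mr. Rao" },
  { _id: 6, teacher: "Ms. Iyer" },
  { _id: 7, teacher: "Mr. Das" },
];
const students = [
  { student_name: "Asha", class: 5 },
  { student_name: "Ravi", class: 5 },
  { student_name: "Meena", class: 6 },
];

// The stage as it would appear in a pipeline run on "classes".
const lookupStage = {
  $lookup: { from: "students", localField: "_id", foreignField: "class", as: "enrolled" },
};

// Hypothetical helper: emulates the equality join in memory.
function lookup(inputDocs, fromDocs, { localField, foreignField, as }) {
  return inputDocs.map((doc) => ({
    ...doc,
    // Every input document is kept; documents without a match get an empty array.
    [as]: fromDocs.filter((other) => other[foreignField] === doc[localField]),
  }));
}

const joined = lookup(classes, students, lookupStage.$lookup);
console.log(joined.map((c) => [c._id, c.enrolled.length])); // [ [ 5, 2 ], [ 6, 1 ], [ 7, 0 ] ]
```

The result behaves like a left outer join: class 7 has no students but still appears, with an empty enrolled array.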


How to Explain Mongo Aggregate Queries

To find out how an aggregate query is run, call explain() on the collection before calling aggregate().

db.getCollection("author").explain().aggregate([
    {
        $match: {
            "email" : "hello@heaven.god"
        }
    }
])


This will generate a simple output like this.

{
    "stages" : [
        {
            "$cursor" : {
                "query" : {
                    "email" : "hello@heaven.god"
                },
                "queryPlanner" : {
                    "plannerVersion" : 1.0,
                    "namespace" : "db_name.author",
                    "indexFilterSet" : false,
                    "parsedQuery" : {
                        "email" : {
                            "$eq" : "hello@heaven.god"
                        }
                    },
                    "winningPlan" : {
                        "stage" : "COLLSCAN",
                        "filter" : {
                            "email" : {
                                "$eq" : "hello@heaven.god"
                            }
                        },
                        "direction" : "forward"
                    },
                    "rejectedPlans" : [ ]
                }
            }
        }
    ],
    "ok" : 1.0
}


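winningPlan contains an object that tells us more about the plan that was chosen to run the query, and rejectedPlans contains an array of the other plans that were tried. Mongo chooses the best plan and uses it for running the query. Here the winning plan is a COLLSCAN, which means the whole collection is scanned because no suitable index exists on email.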
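If you want to check the plan from a script, a small helper can walk the winning plan and list its stages. The sketch below is plain JavaScript over a trimmed copy of the output shown above; the planStages helper is hypothetical, and the exact shape of the explain output differs between server versions, so treat the access path as an assumption.

```javascript
// Trimmed copy of the explain output shown above.
const explainOutput = {
  stages: [
    {
      $cursor: {
        queryPlanner: {
          winningPlan: { stage: "COLLSCAN", direction: "forward" },
          rejectedPlans: [],
        },
      },
    },
  ],
  ok: 1,
};

// Hypothetical helper: follows the inputStage chain of a plan and collects stage names.
function planStages(plan) {
  const names = [];
  for (let node = plan; node; node = node.inputStage) {
    names.push(node.stage);
  }
  return names;
}

const planner = explainOutput.stages[0].$cursor.queryPlanner;
const stages = planStages(planner.winningPlan);
console.log(stages);                       // [ 'COLLSCAN' ]
console.log(stages.includes("COLLSCAN")); // true
```

With an index on the matched field, the chain would typically contain an IXSCAN stage instead of COLLSCAN.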

If you use explain with executionStats, it will also give you figures such as nReturned (documents returned) and totalDocsExamined (documents examined), which can be helpful in finding the most suitable index for your collection.

db.getCollection("author").explain("executionStats").aggregate([
    {
        $match: {
            "email" : "hello@heaven.god"
        }
    }
])
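One way to read these figures is to compare how many documents were examined with how many were returned. The plain JavaScript sketch below does that on made-up numbers; the docsExaminedPerReturned helper is hypothetical, while nReturned, totalKeysExamined, totalDocsExamined, and executionTimeMillis are field names that appear in the executionStats section.

```javascript
// Made-up numbers in the shape of an executionStats section.
const executionStats = {
  nReturned: 1,
  totalKeysExamined: 0,
  totalDocsExamined: 50000,
  executionTimeMillis: 42,
};

// Hypothetical helper: documents examined for each document returned.
function docsExaminedPerReturned(stats) {
  if (stats.nReturned === 0) {
    return stats.totalDocsExamined; // avoid dividing by zero
  }
  return stats.totalDocsExamined / stats.nReturned;
}

const ratio = docsExaminedPerReturned(executionStats);
console.log(ratio); // 50000
```

A ratio close to 1 suggests that the query is well served by an index. A large ratio together with totalKeysExamined at 0, as in this made-up case, points to a collection scan that an index on the matched field could avoid.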


Conclusion

There are a few more stages and options to choose from. Aggregates are one of the most important parts of the Mongo database.

If you deal with this database daily, it is useful to know a little about them.

Thanks for stopping by, and let me know if you want to know about something else as well. I love to write about tech topics.

Also, let me know if I have made any mistakes in this post.

Note: The internal optimizer of the Mongo aggregate also optimizes pipelines on its own to make them as fast as possible.

Published at DZone with permission of Ranvir Singh. See the original article here.
