Learn MongoDB With Me
In this introductory post of a series on MongoDB, learn how using indexes can reduce your execution time to almost nothing.
This is going to be a series of articles on MongoDB. We are going to do some exercises with MongoDB, talk about the mongo shell, learn how to configure MongoDB, learn what indexes are in MongoDB, and so on. We all know what an index is; you might have already created one in a relational database such as SQL Server or MySQL. Have you ever done indexing in MongoDB? If your answer is "no," no worries; here we are going to look at indexes in MongoDB. If it is a "yes," please read this post and correct me if I am wrong anywhere. Let's begin.
Prerequisites
I hope you already know some basic information about MongoDB. If not, I strongly recommend that you read these posts. I am assuming that you have already set up the environment for MongoDB development. Let's recall what you might have done so far.
- Install MongoDB.
- Set the environment variable for MongoDB.
- Start the MongoDB services.
To set the environment variable for MongoDB, you may have to add a new entry to the system variable Path with the value C:\Program Files\MongoDB\Server\3.4\bin. Please note that the version number will vary according to your MongoDB version. Once you are done with the above steps, you should be able to start both the mongo server and the mongo shell from the command line interface.
Setting Up MongoDB Using CLI
Now, let's open our command line interface, create the data directory for mongo, and start the server. Please run the commands below.
md \data
md \data\db
mongod
Now let's open a new CLI window and run the command mongo. Don't worry about the warnings you get, as we are not working with production data and do not need to secure and optimize the instance.
Exploring MongoDB
Once you are connected to MongoDB, by default, you are connected to a database named test. You can check that by running the db command:
MongoDB Enterprise > db
Playing With Mongo Shell
Let's use a new database now.
MongoDB Enterprise > use mongoindex
switched to db mongoindex
MongoDB Enterprise >
Please note that the database mongoindex doesn't exist yet, as we haven't created it. Still, mongo switched our context to the new database. You can confirm this by running the show dbs command, which will not list mongoindex yet.
The database will be created once we insert a document into it. Now, we are going to create a new collection called users; once we insert the first document into this collection, the database will also be created automatically. Let's do that.
MongoDB Enterprise > db.users.insert({"name":"sibees venu"})
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise >
Now, if you run the show dbs command again, the database mongoindex will show up. If you ever need to see the collections you have in the database, you just need to run the show collections command.
MongoDB Enterprise > show collections
users
MongoDB Enterprise >
MongoDB is very friendly when it comes to data; it doesn't require a schema to get started. The learning is so easy, am I right?
The other benefit of MongoDB is its JavaScript-based shell, where we can type JavaScript code and run it. To test it out, let's create a variable and use it.
MongoDB Enterprise > var name = "sibeesh venu"
MongoDB Enterprise > name
sibeesh venu
MongoDB Enterprise >
This way, we can interact with the database using JavaScript. Now, let's go ahead and create a collection called numbers and insert 26,001 documents into it (the numbers 0 through 26,000). We are going to do that with a for loop. The mongo shell gives us that kind of flexibility. Let's see it in action.
MongoDB Enterprise > for(i=0;i<=26000;i++){
... db.numbers.insert({
... "number":i
... })
... }
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise >
So, we have done that. Note that we are able to break the command across multiple lines. This lets us write complex code in a more readable format in the shell itself. Sound good?
Even though we have inserted 26,001 documents, the shell shows "nInserted" : 1 because each insert call writes a single document and the shell prints only the result of the last call in the loop. Let's confirm the total by checking the count now.
MongoDB Enterprise > db.Numbers.count()
0
MongoDB Enterprise > db.numbers.count()
26001
MongoDB Enterprise >
Please note that collection names are case-sensitive: Numbers and numbers are different collections.
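The loop above sends one insert per document. As a hedged alternative, the same documents can be built in memory first and sent in a single call. The sketch below is plain JavaScript; the variable name docs is only illustrative, and the commented-out insertMany line assumes a mongo shell of version 3.2 or later that is connected to the mongoindex database.

```javascript
// Build the same documents the loop inserts: { number: 0 } ... { number: 26000 }.
const docs = [];
for (let i = 0; i <= 26000; i++) {
  docs.push({ number: i });
}

// The loop bounds are inclusive on both ends, so there are 26,001 documents.
console.log(docs.length);

// Assumption: run inside the mongo shell (3.2+) to insert them in a single call.
// db.numbers.insertMany(docs);
```

Counting the array before inserting is a quick way to check for an off-by-one in the loop bounds.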
Indexes in MongoDB
Now, if you need to see any particular document, you can always write the query in the shell as follows.
MongoDB Enterprise > db.numbers.find(
... {"number":24000}
... )
{ "_id" : ObjectId("5a8d3be2020a0071d115cf62"), "number" : 24000 }
MongoDB Enterprise >
So, in the query, we are using the function find with the filter number: 24000 so that mongo returns the document with a number value of 24,000. Now that we have the output we needed, would you like to see what just happened in the background? Use the function explain().
MongoDB Enterprise > db.numbers.find( {"number":24000} ).explain()
{
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "mongoindex.numbers",
    "indexFilterSet" : false,
    "parsedQuery" : {
      "number" : {
        "$eq" : 24000
      }
    },
    "winningPlan" : {
      "stage" : "COLLSCAN",
      "filter" : {
        "number" : {
          "$eq" : 24000
        }
      },
      "direction" : "forward"
    },
    "rejectedPlans" : [ ]
  },
  "serverInfo" : {
    "host" : "pc292716",
    "port" : 27017,
    "version" : "3.4.9",
    "gitVersion" : "876ebee8c7dd0e2d992f36a848ff4dc50ee6603e"
  },
  "ok" : 1
}
MongoDB Enterprise >
And if you need more information about the execution, you can pass the parameter executionStats to the explain function.
The parameter is case-sensitive; you will get an error as shown below if you write it wrong. So, please make sure you are passing executionStats, not executionstats.
MongoDB Enterprise > db.numbers.find( {"number":24000} ).explain("executionstats")
2018-02-21T15:12:34.197+0530 E QUERY [thread1] Error: explain verbosity must be one of {'queryPlanner','executionStats','allPlansExecution'} :
parseVerbosity@src/mongo/shell/explainable.js:22:1
constructor@src/mongo/shell/explain_query.js:83:27
DBQuery.prototype.explain@src/mongo/shell/query.js:520:24
@(shell):1:1
MongoDB Enterprise > db.numbers.find( {"number":24000} ).explain("executionStats")
{
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "mongoindex.numbers",
    "indexFilterSet" : false,
    "parsedQuery" : {
      "number" : {
        "$eq" : 24000
      }
    },
    "winningPlan" : {
      "stage" : "COLLSCAN",
      "filter" : {
        "number" : {
          "$eq" : 24000
        }
      },
      "direction" : "forward"
    },
    "rejectedPlans" : [ ]
  },
  "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 1,
    "executionTimeMillis" : 13,
    "totalKeysExamined" : 0,
    "totalDocsExamined" : 26001,
    "executionStages" : {
      "stage" : "COLLSCAN",
      "filter" : {
        "number" : {
          "$eq" : 24000
        }
      },
      "nReturned" : 1,
      "executionTimeMillisEstimate" : 11,
      "works" : 26003,
      "advanced" : 1,
      "needTime" : 26001,
      "needYield" : 0,
      "saveState" : 203,
      "restoreState" : 203,
      "isEOF" : 1,
      "invalidates" : 0,
      "direction" : "forward",
      "docsExamined" : 26001
    }
  },
  "serverInfo" : {
    "host" : "pc292716",
    "port" : 27017,
    "version" : "3.4.9",
    "gitVersion" : "876ebee8c7dd0e2d992f36a848ff4dc50ee6603e"
  },
  "ok" : 1
}
MongoDB Enterprise >
Now, you can see more information on the execution, such as how long it took and how many documents it examined. As you can see, it examined all 26,001 documents and took 13 milliseconds. That's just one case, where we had only a small number of documents in the collection. What if we had millions of documents in it? Examining all of them would be a bad idea. What should we do? What would be a permanent solution for this? This is where indexes come into play.
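When reading output like this, a handful of fields tell most of the story. The following is a minimal sketch in plain JavaScript that pulls them out of an explain document. The helper name summarizeExplain and the docsPerResult ratio are hypothetical, introduced only for illustration; the sample object is a trimmed copy of the collection-scan output above.

```javascript
// A trimmed copy of the explain("executionStats") output for the collection scan.
const explainResult = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN", direction: "forward" } },
  executionStats: {
    nReturned: 1,
    executionTimeMillis: 13,
    totalKeysExamined: 0,
    totalDocsExamined: 26001
  }
};

// Hypothetical helper: collect the numbers worth checking first.
function summarizeExplain(result) {
  const stats = result.executionStats;
  return {
    stage: result.queryPlanner.winningPlan.stage,
    returned: stats.nReturned,
    keysExamined: stats.totalKeysExamined,
    docsExamined: stats.totalDocsExamined,
    millis: stats.executionTimeMillis,
    // Documents read per document returned; a value close to 1 is the goal.
    docsPerResult: stats.nReturned > 0 ? stats.totalDocsExamined / stats.nReturned : null
  };
}

const summary = summarizeExplain(explainResult);
console.log(summary.stage, summary.docsPerResult);
```

In the mongo shell, the object returned by explain("executionStats") could be passed to the same function instead of the literal used here.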
Let's create an index on the number field that we are going to search.
MongoDB Enterprise > db.numbers.createIndex({number:1})
{
  "createdCollectionAutomatically" : false,
  "numIndexesBefore" : 1,
  "numIndexesAfter" : 2,
  "ok" : 1
}
MongoDB Enterprise >
Here, number is the name of the field to index, written as an unquoted object key rather than a string, and the value 1 requests ascending order. As you can see, we created the index. The value of createdCollectionAutomatically is false because the collection already existed and did not have to be created.
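To make the key pattern less mysterious, here is a small sketch in plain JavaScript. It shows that quoted and unquoted keys describe the same object, and it imitates how MongoDB derives a default index name by joining each field name and its value with underscores. The helper defaultIndexName is hypothetical and only mirrors that naming rule; it is not part of the shell.

```javascript
// Unquoted and quoted keys produce the same object in JavaScript.
const unquoted = { number: 1 };
const quoted = { "number": 1 };
console.log(JSON.stringify(unquoted) === JSON.stringify(quoted)); // true

// Hypothetical helper mirroring the default index naming rule:
// each field name and its value, joined with underscores.
function defaultIndexName(keyPattern) {
  return Object.keys(keyPattern)
    .map((field) => field + "_" + keyPattern[field])
    .join("_");
}

console.log(defaultIndexName({ number: 1 }));           // number_1
console.log(defaultIndexName({ number: 1, name: -1 })); // number_1_name_-1
```

The value in a key pattern is normally 1 (ascending) or -1 (descending).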
Let's run our find query again.
MongoDB Enterprise > db.numbers.find( {"number":24000} ).explain("executionStats")
{
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "mongoindex.numbers",
    "indexFilterSet" : false,
    "parsedQuery" : {
      "number" : {
        "$eq" : 24000
      }
    },
    "winningPlan" : {
      "stage" : "FETCH",
      "inputStage" : {
        "stage" : "IXSCAN",
        "keyPattern" : {
          "number" : 1
        },
        "indexName" : "number_1",
        "isMultiKey" : false,
        "multiKeyPaths" : {
          "number" : [ ]
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 2,
        "direction" : "forward",
        "indexBounds" : {
          "number" : [
            "[24000.0, 24000.0]"
          ]
        }
      }
    },
    "rejectedPlans" : [
      {
        "stage" : "FETCH",
        "inputStage" : {
          "stage" : "IXSCAN",
          "keyPattern" : {
            "number" : 24000
          },
          "indexName" : "number_24000",
          "isMultiKey" : false,
          "multiKeyPaths" : {
            "number" : [ ]
          },
          "isUnique" : false,
          "isSparse" : false,
          "isPartial" : false,
          "indexVersion" : 2,
          "direction" : "forward",
          "indexBounds" : {
            "number" : [
              "[24000.0, 24000.0]"
            ]
          }
        }
      }
    ]
  },
  "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 1,
    "executionTimeMillis" : 36,
    "totalKeysExamined" : 1,
    "totalDocsExamined" : 1,
    "executionStages" : {
      "stage" : "FETCH",
      "nReturned" : 1,
      "executionTimeMillisEstimate" : 0,
      "works" : 3,
      "advanced" : 1,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 1,
      "restoreState" : 1,
      "isEOF" : 1,
      "invalidates" : 0,
      "docsExamined" : 1,
      "alreadyHasObj" : 0,
      "inputStage" : {
        "stage" : "IXSCAN",
        "nReturned" : 1,
        "executionTimeMillisEstimate" : 0,
        "works" : 2,
        "advanced" : 1,
        "needTime" : 0,
        "needYield" : 0,
        "saveState" : 1,
        "restoreState" : 1,
        "isEOF" : 1,
        "invalidates" : 0,
        "keyPattern" : {
          "number" : 1
        },
        "indexName" : "number_1",
        "isMultiKey" : false,
        "multiKeyPaths" : {
          "number" : [ ]
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 2,
        "direction" : "forward",
        "indexBounds" : {
          "number" : [
            "[24000.0, 24000.0]"
          ]
        },
        "keysExamined" : 1,
        "seeks" : 1,
        "dupsTested" : 0,
        "dupsDropped" : 0,
        "seenInvalidated" : 0
      }
    }
  },
  "serverInfo" : {
    "host" : "pc292716",
    "port" : 27017,
    "version" : "3.4.9",
    "gitVersion" : "876ebee8c7dd0e2d992f36a848ff4dc50ee6603e"
  },
  "ok" : 1
}
MongoDB Enterprise >
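Because the query can now use the index on number, it examines only the matching document when we run the query. That's why the value of the property totalDocsExamined is 1. On a collection this small, the overall executionTimeMillis is not a reliable guide (this run reports 36, a figure that also includes the time spent choosing between candidate plans), but the per-stage executionTimeMillisEstimate drops from 11 to 0. Indexing will not have much impact on a collection that has few documents in it, but it has a massive effect on very large datasets that have millions of documents. Using indexes can reduce the execution time to almost nothing.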
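The difference between the two plans can be imitated in plain JavaScript. This is only a toy model under stated assumptions: a sorted array searched with binary search stands in for MongoDB's B-tree index, and the function names collectionScan and indexScan are made up for the illustration.

```javascript
// Simulated collection: 26,001 documents, as in the numbers collection.
const collection = [];
for (let i = 0; i <= 26000; i++) {
  collection.push({ number: i });
}

// Collection scan: every document is examined, even after a match is found.
function collectionScan(docs, target) {
  let docsExamined = 0;
  const results = [];
  for (const doc of docs) {
    docsExamined++;
    if (doc.number === target) results.push(doc);
  }
  return { results, docsExamined };
}

// Toy index: [key, position] pairs kept sorted by key.
const index = collection
  .map((doc, position) => [doc.number, position])
  .sort((a, b) => a[0] - b[0]);

// Index lookup: binary search the keys, then fetch only the matching document.
function indexScan(docs, idx, target) {
  let lo = 0;
  let hi = idx.length - 1;
  let comparisons = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    comparisons++;
    if (idx[mid][0] === target) {
      return { results: [docs[idx[mid][1]]], docsExamined: 1, comparisons };
    }
    if (idx[mid][0] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return { results: [], docsExamined: 0, comparisons };
}

const scan = collectionScan(collection, 24000);
const lookup = indexScan(collection, index, 24000);
console.log(scan.docsExamined);   // 26001
console.log(lookup.docsExamined); // 1
```

The scan touches every document, while the lookup needs at most 15 key comparisons for 26,001 keys, and that gap widens as the collection grows.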
With that, we are done with this post. I will be posting the next part of this series very soon.
Thanks a lot for reading. Did I miss anything that you think is needed? Did you find this post useful? Please share your valuable suggestions and feedback.
Published at DZone with permission of Sibeesh Venu, DZone MVB. See the original article here.