Essential Reference for MongoDB


The following is a quick tutorial / reference to help you start using the MongoDB database.

We’ll start with a quick definition, and then go into the following topics, using a simple example: installing, inserting, querying, updating, deleting, indexes and explain, MapReducing, drivers, and distributing.

MongoDB is a document-oriented database built to handle very large amounts of data with good performance, so that it can keep up with the growing data volumes of modern applications, particularly those running “in the cloud”. MongoDB also aims to keep some of the strengths of relational databases, such as the ability to execute dynamic queries, so that you don't lose that flexibility.



Installing:

Installing MongoDB for our tests is really easy: just go to http://www.mongodb.org/downloads and download the version for your platform (this reference uses Linux and Mac OS X).

After downloading, gunzip and untar the file. That’s it: MongoDB is installed.

To start the mongo server, go to the untarred directory, change into the bin directory, and execute ./mongod. (This assumes you have a /data/db directory on your system. If you don’t, create one.)



Inserting:

MongoDB works with Documents. In Mongo, a Document is simply a JSON object stored in a binary representation called BSON. You can think of a Document as a JSON object; the 'binary' part is just the way Mongo represents this object internally.

In Mongo, every Document must belong to a Collection. (A Collection can be thought of as a table in an RDBMS, but only as a temporary aid to understanding. MongoDB Collections and RDBMS tables are very different things.) Every Collection belongs to a Database.

So let’s say we want to insert a Car Document into a Cars Collection that belongs to the Concesionary Database. We would do the following:

  • From the bin directory, and with the server started, we execute ./mongo to open the interactive shell. The Mongo interactive shell lets us talk to the database server using JavaScript.
  • Next, we switch to our concesionary Database (even though the database doesn’t exist yet, this command will work):
use concesionary
  • Next, we insert our new Car into the Cars Collection (again, the collection doesn’t exist yet, but it, along with the concesionary Database, is created when you insert the first element):
db.cars.insert({maker: "ferrari", model: "f50", acceleration: {speed100: 3, speed200: 9}, colors: ["white", "black"]})



As you can see, we are inserting a new Car, which is basically a JSON object (including simple types, subdocument types, and arrays).
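As a rough illustration of those three kinds of values, here is the same Car as a plain JavaScript object. This is only a sketch that runs in Node.js without a MongoDB server; the field names are taken from the Car documents shown in this article.

```javascript
// Plain JavaScript (Node.js); no MongoDB server is involved here.
const car = {
  maker: "ferrari",                           // simple type (string)
  model: "f50",                               // simple type (string)
  acceleration: { speed100: 3, speed200: 9 }, // subdocument
  colors: ["white", "black"],                 // array
};

// A document like this survives a JSON round trip unchanged.
const roundTripped = JSON.parse(JSON.stringify(car));

console.log(typeof car.acceleration);            // object
console.log(Array.isArray(car.colors));          // true
console.log(roundTripped.acceleration.speed200); // 9
```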

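Let’s insert another Car to use in the next section (on querying):

db.cars.insert({maker: "fiat", model: "500", acceleration: {speed100: 10, speed200: "NEVER"}, colors: ["blue", "red"]})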




Querying:

MongoDB gives you a lot of flexibility in querying, very close to what you can do with SQL. You can use many kinds of filters, comparisons, and so on. Here we run just a couple of basic queries, to get your feet wet.

In general, you query MongoDB by calling the find method on the collection and passing a JSON document containing the fields and values you want to match:


{ "_id" : ObjectId("4dde4d1c6eb878af72075592"), "maker" : "ferrari", "model" : "f50", "acceleration" : { "speed100" : 3, "speed200" : 9 }, "colors" : [ "white", "black" ] }

{ "_id" : ObjectId("4dde4f3d6eb878af72075593"), "maker" : "fiat", "model" : "500", "acceleration" : { "speed100" : 10, "speed200" : "NEVER" }, "colors" : [ "blue", "red" ] }


{ "_id" : ObjectId("4dde4d1c6eb878af72075592"), "maker" : "ferrari", "model" : "f50", "acceleration" : { "speed100" : 3, "speed200" : 9 }, "colors" : [ "white", "black" ] }


{ "_id" : ObjectId("4dde4f3d6eb878af72075593"), "maker" : "fiat", "model" : "500", "acceleration" : { "speed100" : 10, "speed200" : "NEVER" }, "colors" : [ "blue", "red" ] }


{ "_id" : ObjectId("4dde4d1c6eb878af72075592"), "maker" : "ferrari", "model" : "f50", "acceleration" : { "speed100" : 3, "speed200" : 9 }, "colors" : [ "white", "black" ] }
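To make the idea of a filter document concrete, here is a minimal sketch in plain JavaScript of how equality filters can be matched against documents. The helpers getPath, matches, and find are hypothetical names written for this illustration; they are not MongoDB's implementation, and they cover only exact-value matches, dotted paths into subdocuments, and "array contains value" matches.

```javascript
const cars = [
  { maker: "ferrari", model: "f50", acceleration: { speed100: 3, speed200: 9 }, colors: ["white", "black"] },
  { maker: "fiat", model: "500", acceleration: { speed100: 10, speed200: "NEVER" }, colors: ["blue", "red"] },
];

// Hypothetical helper: resolves a dotted path such as "acceleration.speed100".
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Hypothetical helper: true when every field in the filter equals the
// document's value; an array field matches if it contains the value.
function matches(doc, filter) {
  return Object.entries(filter).every(([path, expected]) => {
    const actual = getPath(doc, path);
    return Array.isArray(actual) ? actual.includes(expected) : actual === expected;
  });
}

// Hypothetical stand-in for a collection's find: an empty filter matches all.
function find(filter) {
  return cars.filter((doc) => matches(doc, filter));
}

console.log(find({}).length);                                 // 2
console.log(find({ maker: "ferrari" })[0].model);             // f50
console.log(find({ "acceleration.speed100": 10 })[0].maker);  // fiat
console.log(find({ colors: "red" })[0].maker);                // fiat
```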



Updating:

Basic updating is pretty straightforward. It takes a filter document, as find does, and a second parameter describing how to modify the matching document:



{ "_id" : ObjectId("4dde54b56eb878af72075594"), "maker" : "ferrari", "model" : "f40", "acceleration" : { "speed100" : 3, "speed200" : 9 }, "colors" : [ "white", "black" ] }
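The two-parameter shape of an update can be sketched in plain JavaScript as follows. This assumes a $set-style modifier, which is one of MongoDB's update operators; the helpers matches and updateFirst are hypothetical, handle only top-level equality filters, and change only the first matching document.

```javascript
const cars = [
  { maker: "ferrari", model: "f50" },
  { maker: "fiat", model: "500" },
];

// Hypothetical helper: top-level equality match only.
function matches(doc, filter) {
  return Object.keys(filter).every((key) => doc[key] === filter[key]);
}

// Hypothetical helper: applies a { $set: {...} } modifier to the first
// document that matches the filter and returns how many were modified.
function updateFirst(docs, filter, modifier) {
  const target = docs.find((doc) => matches(doc, filter));
  if (target === undefined) return 0;
  Object.assign(target, modifier.$set);
  return 1;
}

const modified = updateFirst(cars, { maker: "ferrari" }, { $set: { model: "f40" } });
console.log(modified, cars[0].model); // 1 f40
```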



Deleting:

Deleting is even more straightforward than updating. It just requires the document filter (or nothing, if you want to delete all the documents in the collection):


{ "_id" : ObjectId("4dde4f3d6eb878af72075593"), "maker" : "fiat", "model" : "500", "acceleration" : { "speed100" : 10, "speed200" : "NEVER" }, "colors" : [ "blue", "red" ] }
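The effect of the filter on a delete can be sketched the same way in plain JavaScript. The helpers matches and removeMatching are hypothetical and only illustrate the semantics: documents matching the filter go away, and an empty filter matches everything.

```javascript
const cars = [
  { maker: "ferrari", model: "f50" },
  { maker: "fiat", model: "500" },
];

// Hypothetical helper: top-level equality match; an empty filter matches all.
function matches(doc, filter) {
  return Object.keys(filter).every((key) => doc[key] === filter[key]);
}

// Hypothetical helper: returns the documents left after the delete.
function removeMatching(docs, filter) {
  return docs.filter((doc) => !matches(doc, filter));
}

console.log(removeMatching(cars, { maker: "ferrari" }).map((c) => c.maker)); // [ 'fiat' ]
console.log(removeMatching(cars, {}).length);                                // 0
```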


Creating indexes, and query explain:

Indexes are as important in MongoDB as in any other database, and just as important to get right. They work as you would expect: applied correctly, they can speed up your queries dramatically. You can create compound indexes as well. Here, once again, we only touch on the basics.

Let’s insert our two cars again:



MongoDB automatically creates an index on the _id property of its documents. We can list the existing indexes like this:


{ "name" : "_id_", "ns" : "concesionary.cars", "key" : { "_id" : 1 }, "v" : 0 }

Our application will probably run a lot of queries by car maker, so we add an index on the maker property like this:

db.cars.ensureIndex({maker: 1})

Now when we list the existing indexes, we see our new index:

{ "name" : "_id_", "ns" : "concesionary.cars", "key" : { "_id" : 1 }, "v" : 0 }
{ "_id" : ObjectId("4dde57fe6eb878af72075597"), "ns" : "concesionary.cars", "key" : { "maker" : 1 }, "name" : "maker_1", "v" : 0 }

So how can we see if some query is using our index? Simple enough: we use the explain method to do so. But before doing that, let’s remove the index and run explain without it.

db.runCommand({deleteIndexes: "cars", index: "maker_1"})

{ "name" : "_id_", "ns" : "concesionary.cars", "key" : { "_id" : 1 }, "v" : 0 }

We removed the index; now let’s see explain in action:


{
"cursor" : "BasicCursor",
"nscanned" : 2,
"nscannedObjects" : 2,
"n" : 1,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {

}
}

The main things to look at when running explain (for the purposes of our discussion) are the type of cursor and the nscanned and n attributes.

The cursor “BasicCursor” is simply a cursor that scans the whole collection to get the query results. nscanned is the total number of documents scanned, and n is the total number of documents returned. Ideally, n and nscanned are the same, which means no document was scanned needlessly.

Now let’s create the index again and rerun the explain for the query:

db.cars.ensureIndex({maker: 1})
"cursor" : "BtreeCursor maker_1",
"nscanned" : 1,
"nscannedObjects" : 1,
"n" : 1,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
"maker" : [

We can see the difference in the results. The query now uses the index (as indicated by the cursor property), and the nscanned and n properties have the same value: we scan only the elements we return.
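The difference between the two cursors can be sketched in plain JavaScript. This is only an analogy, not how MongoDB is implemented: a Map keyed by maker stands in for the B-tree index, and the hypothetical functions fullScan, buildIndex, and indexScan report the same nscanned and n counters that explain shows.

```javascript
const cars = [
  { maker: "ferrari", model: "f50" },
  { maker: "fiat", model: "500" },
];

// Full scan: every document is examined, as with "BasicCursor".
function fullScan(docs, maker) {
  let nscanned = 0;
  let n = 0;
  for (const doc of docs) {
    nscanned += 1;
    if (doc.maker === maker) n += 1;
  }
  return { nscanned, n };
}

// A Map from maker to documents stands in for the index on maker.
function buildIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    if (!index.has(doc.maker)) index.set(doc.maker, []);
    index.get(doc.maker).push(doc);
  }
  return index;
}

// Index lookup: only the documents under the requested key are touched.
function indexScan(index, maker) {
  const hits = index.get(maker) || [];
  return { nscanned: hits.length, n: hits.length };
}

console.log(fullScan(cars, "ferrari"));              // { nscanned: 2, n: 1 }
console.log(indexScan(buildIndex(cars), "ferrari")); // { nscanned: 1, n: 1 }
```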

Map Reducing:

Apart from the common grouping operations supported by MongoDB (like sum, max, etc.), we can use MapReduce for more fine-grained and customized grouping requirements, and it is built into MongoDB. (For an explanation of MapReduce see: http://cscarioni.blogspot.com/2010/11/hadoop-basics.html). Here we show an extremely simple MapReduce:

Let's say we want to simply count all the cars per maker.

First, we insert a new Fiat into the collection (do that yourself).

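Then we define and run our MapReduce like this:

db.cars.mapReduce(
  function () { emit(this.maker, {number: 1}); },
  function (key, values) {
    var total = 0;
    values.forEach(function (value) { total += value.number; });
    return {total: total};
  }, "result")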

When we run it, we get this:

{
"result" : "result",
"timeMillis" : 3,
"counts" : {
"input" : 3,
"emit" : 3,
"output" : 2
},
"ok" : 1,
}

The counts are stored in the new result collection:


{ "_id" : "ferrari", "value" : { "number" : 1 } }
{ "_id" : "fiat", "value" : { "total" : 2 } }

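As we can see, the mapReduce method receives a map function, a reduce function, and normally the name of the collection in which to store the results. Note that the two values above have different shapes: MongoDB does not call the reduce function for a key with only one emitted value, so ferrari keeps the emitted {number: 1}, while fiat gets the reduced {total: 2}. In real code, the reduce function should return a value with the same shape as the emitted one.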
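The map, group, and reduce phases can be sketched in plain JavaScript. The mapReduce function below is a hypothetical stand-in written for this illustration, not MongoDB's implementation: emit is passed to map as an argument instead of being a global, and the model of the second Fiat is made up. Unlike the one-liner above, the reduce function here returns the same {number: ...} shape that map emits, so every result looks alike.

```javascript
const cars = [
  { maker: "ferrari", model: "f50" },
  { maker: "fiat", model: "500" },
  { maker: "fiat", model: "punto" }, // hypothetical second Fiat
];

// Hypothetical driver: runs map on each document, groups emitted values
// by key, then reduces each group.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => map.call(doc, emit));

  const result = {};
  for (const [key, values] of groups) {
    // reduce is skipped for keys with a single emitted value.
    result[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return result;
}

const counts = mapReduce(
  cars,
  function (emit) { emit(this.maker, { number: 1 }); },
  (key, values) => ({ number: values.reduce((sum, v) => sum + v.number, 0) })
);

console.log(counts); // { ferrari: { number: 1 }, fiat: { number: 2 } }
```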



Drivers:

We aren’t going to say a lot in this section. Simply put: MongoDB drivers already exist for the most common programming languages. They all work in much the same way (taking into account the advantages and limitations of each language), and they are pretty easy to start experimenting with.



Distributing:

One of the most important characteristics of MongoDB is its support for distribution, from creating Replica Sets to Sharding. I’ll cover that soon, in another article. For now, let's just say that the distribution model is really powerful: it allows for transparent failover, and for transparent sharding and distribution of data chunks across the sharded cluster.




Published at DZone with permission of Carlo Scarioni, DZone MVB. See the original article here.
