
So Many Ways to Start Your Mongo


Starting up a vanilla MongoDB instance is super easy: it just needs a port it can listen on and a directory where it can save your info. By default, Mongo listens on port 27017, which should work fine (it’s not a very commonly used port). We’ll create a new directory for database files:

$ mkdir -p ~/dbs/mydb # -p creates parent directories if they don't exist

And then start up our database:

$ cd 
$ bin/mongod --dbpath ~/dbs/mydb

…and you’ll see a bunch of output:

$ bin/mongod --dbpath ~/dbs/mydb
Fri Apr 23 11:59:07 Mongo DB : starting : pid = 9831 port = 27017 dbpath = /data/db/ master = 0 slave = 0  32-bit 
** NOTE: when using MongoDB 32 bit, you are limited to about 2 gigabytes of data
**       see http://blog.mongodb.org/post/137788967/32-bit-limitations for more
Fri Apr 23 11:59:07 db version v1.5.1-pre-, pdfile version 4.5
Fri Apr 23 11:59:07 git version: f86d93fd949777d5fbe00bf9784ec0947d6e75b9
Fri Apr 23 11:59:07 sys info: Linux ubuntu 2.6.31-15-generic #50-Ubuntu SMP Tue Nov 10 14:54:29 UTC 2009 i686 BOOST_LIB_VERSION=1_38
Fri Apr 23 11:59:07 waiting for connections on port 27017
Fri Apr 23 11:59:07 web admin interface listening on port 28017

Now, Mongo will “freeze” like this, which confuses some people. Don’t worry, it’s just waiting for requests. You’re all set to go.
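
If the tied-up terminal is a nuisance, one option is to let mongod detach and send its output to a file instead. This is only a sketch: it assumes your build of mongod supports the --fork and --logpath options (forking needs a log file to write to) and that you run it from the MongoDB directory. The start_mongod_bg name and the DBPATH and LOGFILE variables are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start mongod in the background instead of tying up a terminal.
# DBPATH and LOGFILE are illustrative names, not anything mongod itself defines.
DBPATH="${DBPATH:-$HOME/dbs/mydb}"
LOGFILE="${LOGFILE:-$HOME/dbs/mydb.log}"

start_mongod_bg() {
    # Make sure the data directory exists before mongod looks for it.
    mkdir -p "$DBPATH" || return 1
    # Assumption: --fork detaches the server and --logpath says where its output goes.
    bin/mongod --dbpath "$DBPATH" --fork --logpath "$LOGFILE"
}
```

Call start_mongod_bg from the MongoDB directory, then run tail -f on the log file to watch the same output you would otherwise have seen in the terminal.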

Set up a slave, Maeve

As we’re running master and slave on the same machine, they’ll need separate ports. We’ll use port 10000 for the master and 20000 for the slave. We also need separate directories for data, so we’ll create those:

$ mkdir ~/dbs/master ~/dbs/slave

Now we start the master database:

$ bin/mongod --master --port 10000 --dbpath ~/dbs/master

And then the slave, in a different terminal:

$ bin/mongod --slave --port 20000 --dbpath ~/dbs/slave --source localhost:10000

The “source” option tells the slave which master it should replicate data from.

Now, if we want to add another slave, we need to go through the herculean effort of choosing a port and creating a new directory:

$ mkdir ~/dbs/slave2
$ bin/mongod --slave --port 20001 --dbpath ~/dbs/slave2 --source localhost:10000

Tada! Two slaves, one master. For more information on master-slave, see the core docs on it and my previous post.
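
If you find yourself adding slaves often, the two steps can be wrapped up. The function below is a rough sketch, not an official tool: start_slave, DBROOT and MASTER are names invented here, and it assumes you number slaves from 2 upward so that the ports (20001, 20002, …) and directories (slave2, slave3, …) follow the pattern above.

```shell
#!/usr/bin/env bash
# Hypothetical helper for adding slave number N (2, 3, ...), following the
# pattern above: port 20000 + (N - 1) and a data directory named slaveN.
DBROOT="${DBROOT:-$HOME/dbs}"
MASTER="${MASTER:-localhost:10000}"

start_slave() {
    local n="$1"
    local port=$((20000 + n - 1))
    local dir="$DBROOT/slave$n"
    # Create the data directory for this slave if it is not there yet.
    mkdir -p "$dir" || return 1
    # Runs in the foreground, so give each slave its own terminal.
    bin/mongod --slave --port "$port" --dbpath "$dir" --source "$MASTER"
}
```

Running start_slave 3 from the MongoDB directory would create ~/dbs/slave3 and start a slave on port 20002.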

This example puts the master server and slave server on the same machine, but people generally have a master on one machine and a slave on another. It works fine to put them on a single machine; it just defeats the point a bit.

Get auto-failover… Rover

Okay, so there aren’t many people named Rover, but you come up with a rhyme for “auto-failover” (I tried “replica”, too).

Replica pairs are cool because it’s like master-slave, but you get automatic failover: if the master becomes unavailable, the slave will become a master. So, it’s basically the same as master-slave, but the servers know about each other and there is, optionally, an arbiter server that doesn’t do anything other than resolve “disputes” over who is master.

When could the arbiter come in handy? Suppose the master’s network cable is pulled. The server still thinks it’s master, but no one else knows it’s there. The slave becomes master and the rest of the world goes along happily. When the master’s network cable gets plugged back in, now both servers think they’re master! In this case, the arbiter steps in and gently informs the master who’s behind the times that he is now a slave.

You don’t have to set up an arbiter, but we will since it’s good practice:

$ mkdir ~/dbs/arbiter ~/dbs/replica1 ~/dbs/replica2
$ bin/mongod --port 50000 --dbpath ~/dbs/arbiter

Now, in separate terminals, you start each of the replicas:

$ bin/mongod --port 60000 --dbpath ~/dbs/replica1 --pairwith localhost:60001 --arbiter localhost:50000

And then the other one:

$ bin/mongod --port 60001 --dbpath ~/dbs/replica2 --pairwith localhost:60000 --arbiter localhost:50000

After they’ve been running for a bit, try killing (Ctrl-C) one, then restarting it, then killing the other one, back and forth.

For more information on replica pairs, see the core docs.

What’s this? Replica pairs are evolving! *voop* *voop* *voop*

Replica pairs have evolved into… replica sets! Well, okay, they haven’t yet, but they’re coming soon. Then you’ll be able to have an arbitrary number of servers in the auto-failover ring.

Make a new cluster, Buster

For the grand finale, sharding. Sharding is how you distribute data with Mongo. If you don’t know what sharding is, check out my previous post explaining how it works.

First of all, download the latest 1.5.x nightly build from the website. Sharding is changing rapidly, so you want the latest and greatest here, not stable.

We’re going to be creating a three-node cluster. So, same as ever, create your database directories. We want one directory for the cluster configuration and three directories for our shards (nodes):

$ mkdir ~/dbs/config ~/dbs/shard1 ~/dbs/shard2 ~/dbs/shard3
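
For a cluster with more shards, typing out every directory gets old. Here is a small sketch that does the same thing for any number of shards; make_cluster_dirs and DBROOT are names invented for this example, and the shard1, shard2, … naming simply follows the directories above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the config directory plus one directory per shard.
DBROOT="${DBROOT:-$HOME/dbs}"

make_cluster_dirs() {
    local shards="$1" i
    mkdir -p "$DBROOT/config" || return 1
    # shard1 .. shardN, matching the naming used in this post.
    for ((i = 1; i <= shards; i++)); do
        mkdir -p "$DBROOT/shard$i" || return 1
    done
}
```

make_cluster_dirs 3 gives you the same four directories as the mkdir command above.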

The config server keeps track of what’s where, so we need to start that up first:

$ bin/mongod --configsvr --port 30000 --dbpath ~/dbs/config

The mongos is just a request router that runs on top of the config server. It doesn’t even need a data directory; we just tell it where to look for the configuration:

$ bin/mongos --configdb localhost:30000

Note the “s”: the router is called “mongos”, not “mongod”. We haven’t specified a port for it, so it’ll listen on the default port (27017).

Okay! Now, we need to set up our shards. Start each of these up in a separate terminal:

$ bin/mongod --shardsvr --port 31000 --dbpath ~/dbs/shard1
$ bin/mongod --shardsvr --port 31001 --dbpath ~/dbs/shard2
$ bin/mongod --shardsvr --port 31002 --dbpath ~/dbs/shard3

mongos doesn’t actually know about the shards yet; you need to tell it to add these servers to the cluster. The easiest way is to fire up a mongo shell:

$ bin/mongo
MongoDB shell version: 1.5.1-pre-
url: test
connecting to: test
type "help" for help

Now, we add each shard to the cluster:

> db = connect("localhost:30000/admin");
connecting to: localhost:30000
> db.runCommand({addshard : "localhost:31000", allowLocal : true})
    "added" : "localhost:31000",
    "ok" : 1
> db.runCommand({addshard : "localhost:31001", allowLocal : true})
    "added" : "localhost:31001",
    "ok" : 1
> db.runCommand({addshard : "localhost:31002", allowLocal : true})
    "added" : "localhost:31002",
    "ok" : 1

mongos expects shards to be on remote machines and by default won’t allow you to add local shards (i.e., shards with “localhost” in the name). Since we’re just playing around, we specify “allowLocal” to override this behavior. (Note that “addshard” IS NOT camel-case, and allowLocal IS camel-case, because we’re consistent like that.)
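
Since the addshard commands differ only in the port, you could generate them rather than type them. The following is just a sketch: addshard_cmds is a made-up name, and it assumes your local shards listen on consecutive ports starting at whatever first port you pass in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one addshard command per local shard.
# Assumes the shards listen on consecutive ports starting at first_port.
addshard_cmds() {
    local first_port="$1" count="$2" i
    for ((i = 0; i < count; i++)); do
        printf 'db.runCommand({addshard : "localhost:%d", allowLocal : true})\n' "$((first_port + i))"
    done
}
```

Pass it the port of your first shard and the number of shards, then paste the printed lines into the mongo shell session.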

Congratulations, you’re running a distributed database!

What do you do now? Well, use it just like a normal database! Connect to “localhost:27017” and proceed normally (or, as normally as possible… please report any bugs to our bugtracker!). Try the tutorial (since you’ve already got the shell open) or connect through your favorite driver and play around.

Connecting to mongos should be an identical experience to connecting to a normal Mongo server. Behind the scenes, it splits up your requests/data across the shards so you can concentrate on making your application, not scaling it.

P.S. Obviously, this example setup is full of single points of failure, but that’s completely avoidable. I can go over how to set up distributed MongoDB with zero single points of failure in a follow-up post, if people are interested.



