MongoDB 1.6 Arrives With Auto-Sharding and Replica Sets
Companies like Foursquare, bit.ly, and BoxedIce have been using the beta version of MongoDB 1.6 in production for a few months and they have been very impressed with the new functionality in today's release. Two main new features in 1.6 are Auto-Sharding and Replica Sets. Auto-sharding allows the automation of horizontal scaling for MongoDB across multiple nodes, and Replica sets will automate failover and redundancy so that data existence on any number of servers across multiple datacenters is assured. MongoDB is a NoSQL (non-relational) document store.
How to Make Replica Sets
Replica sets are a new method for database replication in MongoDB 1.6. They support automatic failover, automatic recovery of nodes, and multi data center operations. A replica set will hold any number of members and the data will fully exist on each one. There is always a primary server which receives reads and writes while other members accept reads only. If your primary fails, another member will take over automatically. Once you have the replica set, you can begin sharding.
Setting up Replica Sets
How to use Auto-Sharding
Configuring the auto-sharding has three parts in MongoDB 1.6. First, you need to set up the shard servers with the --shardsvr parameter when starting each mongod instance. The config servers run with the --configsvr parameter to store metadata for the shard. The documentation states that “a production shard cluster will have three config server processes, each existing on a separate machine. Writes to config servers use a two-phase commit to ensure an atomic and replicated transaction of the shard cluster’s metadata.” Finally, the shard servers and config servers connect to the 'mongos', which are the processes that your clients connect to when routing queries to the appropriate shards. They are self-contained and they usually run on each of your app servers.
Foursquare moved all check-ins and user data to MongoDB from PostgreSQL once they no longer could use a single physical machine. “With foursquare's growing popularity, we were fast approaching the point where it would no longer be feasible to store user checkins on a single physical machine.,” said Harry Heymann, Engineering Lead at Foursquare. “MongoDB's auto-sharding capabilities enabled us to easily transition to a multi-node cluster that will enable us to continue to grow for the foreseeable future.”
Another MongoDB user is Bit.ly, a service with 50 million users and 12.5 billion 'shortens' per month. They store user history in MongoDB. Other major production users of MongoDB include the New York Times, Etsy, SourceForge, and Shutterfly. Downloads of the database exceeded 50,000 in the month of July.
For more info on how BoxedIce set up their auto-sharding, check out their recent blog entry.