
MongoDB Sharding


If I had to place a couple of safe bets on the near future, I would choose two: Hadoop and MongoDB.

There is a huge demand for both technologies, and many players treat them as the foundation for their future products.
Sharding was a major issue for large-scale MySQL installations, and the same is true for large MongoDB installations.
Back to Basics
MongoDB is pretty similar to a regular database, but it has two main advantages: 1) software engineers love it because it can easily be used for object persistence, and 2) it supports unstructured objects (documents), so a single collection can easily store different objects based on the same virtual class.
MongoDB terms

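  1. Database: a database, as in a relational system.
  2. Collections: very similar to tables.
  3. Documents: very similar to rows. Yet a document can be as flexible as a JSON document. For example, it may hold anything from one field to many fields within the document itself.
  4. mongod: the MongoDB server process. Each shard is a mongod instance (or a replica set of them).
  5. mongos: the query router that applications connect to in a sharded cluster.
  6. Chunk: a contiguous range of shard key values together with its documents, limited to 64MB by default.
  7. Config database: the directory that maps chunks to shards.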
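
To make the collection and document terms concrete, here is a minimal sketch in plain Python with no MongoDB driver involved. The `accounts` list stands in for a collection and each dict for a document; the field names (`account_id`, `plan`, `contacts`) are made up for illustration.

```python
# A collection modelled as a plain list; each dict plays the role of a document.
# All field names and values here are hypothetical.
accounts = [
    {"_id": 1, "account_id": 1001, "plan": "free"},
    {"_id": 2, "account_id": 1002, "plan": "pro",
     "contacts": [{"name": "Dana", "email": "dana@example.com"}]},
]


def fields_used(collection):
    """Return the sorted union of field names across all documents."""
    names = set()
    for doc in collection:
        names.update(doc)
    return sorted(names)


print(fields_used(accounts))
```

The second document carries an embedded list that the first one lacks, which is the kind of flexibility a fixed table schema does not offer.
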
Why use sharding?
  1. Support large datasets using commodity servers.
  2. Support high I/O requirements using commodity disks.
What are MongoDB's sharding features?

  1. Range-based Data Partitioning: a method very similar to MySQL partitioning. You choose one or more fields (the shard key) on which sharding is based. Choose the shard key according to the business logic, for example splitting by account id in a SaaS application.
  2. Automatic Data Volume Distribution: MongoDB takes care of balancing the shards by itself, according to the chosen shard key.
  3. Transparent Query Routing: MongoDB routes queries by itself. When a query does not include the shard key, the mongos router sends it to multiple shards and merges the results (very much like map-reduce in Hadoop).
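
The difference between a targeted query and a scattered one can be sketched in plain Python. This is only an illustration of the idea, not how the router is implemented: the chunk boundaries, the shard names, and the `shards_for_query` helper are all assumptions, and only equality matches on a numeric shard key are handled.

```python
from bisect import bisect_right

# Hypothetical chunk table: chunk i covers shard key values from
# CHUNK_LOWER_BOUNDS[i] up to (not including) the next lower bound.
# In a real cluster this mapping is kept in the config database.
CHUNK_LOWER_BOUNDS = [float("-inf"), 1000, 2000, 3000]
CHUNK_SHARDS = ["shard-a", "shard-b", "shard-a", "shard-c"]


def shards_for_query(query, shard_key="account_id"):
    """Return the shards a query has to visit.

    An equality match on the shard key is routed to the single shard that
    owns the covering chunk; any other query goes to every shard.
    """
    if shard_key in query:
        idx = bisect_right(CHUNK_LOWER_BOUNDS, query[shard_key]) - 1
        return [CHUNK_SHARDS[idx]]
    return sorted(set(CHUNK_SHARDS))


print(shards_for_query({"account_id": 1500}))   # targeted: one shard
print(shards_for_query({"status": "overdue"}))  # scattered: all shards
```
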
Key Recommendations for MongoDB Sharding
  1. Sufficient Cardinality: choose a shard key with enough distinct values that the data can be split into more chunks, and spread over more shards, as the database grows. A chunk that holds a single key value cannot be split even when it exceeds the chunk size.
  2. Uniform Distribution: choose a shard key whose values spread the data uniformly, to avoid an unbalanced design.
  3. Distribute Write Operations: if you have a billing system, prefer to shard by account id rather than by billing month. Otherwise, on any given day, probably only a single shard will receive writes.
  4. Query According to the Shard Key: if a query includes the shard key, it can be routed to a single shard. Otherwise, it generates N queries (one per shard).
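
The billing example in recommendation 3 can be simulated in a few lines of plain Python. This is a rough sketch under stated assumptions: three hypothetical shards, one chunk per block of 100 account ids or per calendar month, and chunks placed round-robin. A real cluster places chunks differently, but the skew is the same.

```python
from collections import Counter
from datetime import date

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names


def shard_by_account(invoice):
    # Assumed layout: each block of 100 account ids is one chunk.
    return SHARDS[(invoice["account_id"] // 100) % len(SHARDS)]


def shard_by_month(invoice):
    # Assumed layout: each billing month is one chunk.
    return SHARDS[(invoice["billed_on"].month - 1) % len(SHARDS)]


# One day of writes: 300 invoices, all billed on the same date.
today = date(2024, 3, 15)
invoices = [{"account_id": n, "billed_on": today} for n in range(300)]

by_account = Counter(shard_by_account(i) for i in invoices)
by_month = Counter(shard_by_month(i) for i in invoices)

print(dict(by_account))
print(dict(by_month))
```

With the account id as the key, the day's writes are spread evenly over the three shards; with the billing month as the key, all 300 land on one shard.
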
Technical Aspects of MongoDB Sharding
  1. Every sharded collection must have an index whose leading fields are the shard key (the shardCollection command creates it when the collection is empty).
  2. The default chunk size limit is 64MB.
  3. When a chunk reaches this limit, MongoDB splits it into two.
  4. If chunks are not distributed uniformly, MongoDB starts migrating chunks between shards.
  5. The cluster balancer takes care of this process.
  6. Balancing can cause performance issues, so it can be restricted to off-peak hours (nights and weekends, for example) using balancing windows.
  7. The mapping of chunks to shards is saved in the config database.
  8. Replication should be considered as well, as a complementary method.
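
The split, migrate, and balancing-window steps can be illustrated with a toy model in plain Python. It is a sketch under loud assumptions, not the actual balancer: a chunk is limited to four keys instead of 64MB, shard key values are unique, the balancer simply evens out chunk counts, and the window is a daily start/stop time. The names `insert`, `balance`, and `in_balancing_window` are invented for this example.

```python
from bisect import bisect_right, insort
from datetime import time

MAX_CHUNK_DOCS = 4  # hypothetical stand-in for the 64MB chunk size limit


def insert(chunks, key):
    """Add a key to the chunk covering it; split the chunk if it is too big."""
    idx = bisect_right([c["min"] for c in chunks], key) - 1
    chunk = chunks[idx]
    insort(chunk["keys"], key)
    if len(chunk["keys"]) > MAX_CHUNK_DOCS:
        mid = len(chunk["keys"]) // 2
        # The upper half becomes a new chunk on the same shard.
        upper = {"min": chunk["keys"][mid], "keys": chunk["keys"][mid:],
                 "shard": chunk["shard"]}
        chunk["keys"] = chunk["keys"][:mid]
        chunks.insert(idx + 1, upper)


def balance(chunks, shards):
    """Move chunks from the fullest to the emptiest shard until counts are even."""
    while True:
        counts = {s: 0 for s in shards}
        for c in chunks:
            counts[c["shard"]] += 1
        busiest = max(shards, key=counts.get)
        idlest = min(shards, key=counts.get)
        if counts[busiest] - counts[idlest] <= 1:
            return counts
        donor = next(c for c in chunks if c["shard"] == busiest)
        donor["shard"] = idlest  # "migrate" one chunk


def in_balancing_window(now, start, stop):
    """True if `now` is inside a daily window, which may wrap past midnight."""
    if start <= stop:
        return start <= now < stop
    return now >= start or now < stop


shards = ["shard-a", "shard-b"]
chunks = [{"min": float("-inf"), "keys": [], "shard": "shard-a"}]
for account_id in range(20):
    insert(chunks, account_id)

print(len(chunks))
if in_balancing_window(time(2, 0), time(23, 0), time(6, 0)):
    print(balance(chunks, shards))
```

All chunks start on a single shard because splits keep the new chunk where the old one lived; only the balancing step spreads them out, which is why restricting it to a quiet window is a trade-off between load and evenness.
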
Bottom Line
MongoDB brings to the table an out-of-the-box sharding solution that can scale your operations. Now you only need to analyze your needs and select the right solution for them.



Published at DZone with permission of Moshe Kaplan, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.