NoSQL has been a hot buzzword for quite a while (well, it's not only a buzz anymore).
However, when should we really use it?
Best Practices for MongoDB
NoSQL products (MongoDB among them) should be used to meet specific challenges. If you face one of the following challenges, you should consider MongoDB:
You Expect a High Write Load
By default, MongoDB prefers a high insert rate over transaction safety. If you need to load tons of data rows, each with a low business value, MongoDB should be a good fit. Don't do that when recording $1M transactions, or at least add extra safety in those cases (for example, a stricter write concern).
You Need High Availability in an Unreliable Environment (Cloud and Real Life)
Setting up a replica set (a set of servers that act as master and slaves) is easy and fast. Moreover, recovery from a node (or data center) failure is fast, safe and automatic.
You Need to Grow Big (and Shard Your Data)
Scaling databases is hard (the performance of a single MySQL table tends to degrade once the table crosses 5-10GB). If you need to partition and shard your database, MongoDB has an easy built-in solution for that.
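To make the idea of sharding concrete, here is a minimal sketch of hash-based routing in plain Python. It is not MongoDB's own algorithm (MongoDB handles routing and rebalancing for you), and the function name, key format and shard count are all made up for illustration.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a shard key to a shard index using a stable hash.

    Illustration only: a hypothetical helper, not part of any MongoDB driver.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

# Hypothetical subscriber IDs used as the shard key.
subscribers = [f"subscriber-{n}" for n in range(1000)]

# Count how many keys land on each of three assumed shards.
counts = [0, 0, 0]
for sid in subscribers:
    counts[shard_for(sid, 3)] += 1

print(counts)  # a roughly even split across the three shards
```

The same key always maps to the same shard, so a read knows where to look without asking every server.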
Your Data is Location Based
MongoDB has built-in geospatial functions, so finding relevant data for specific locations is fast and accurate.
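As a rough picture of what a "find documents near this point" query computes, the sketch below filters and sorts documents by great-circle distance in plain Python. It is only an illustration: MongoDB does this server-side with a geospatial index, and the tower documents, field names and helper functions here are hypothetical. Coordinates follow the GeoJSON order of longitude first, then latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def near(docs, lon, lat, max_km):
    """Return docs within max_km of (lon, lat), nearest first."""
    hits = [(haversine_km(lon, lat, *d["location"]["coordinates"]), d) for d in docs]
    return [d for dist, d in sorted(hits, key=lambda h: h[0]) if dist <= max_km]

# Hypothetical cell towers with approximate city coordinates.
towers = [
    {"name": "haifa", "location": {"type": "Point", "coordinates": [34.99, 32.79]}},
    {"name": "jerusalem", "location": {"type": "Point", "coordinates": [35.21, 31.77]}},
    {"name": "tel-aviv", "location": {"type": "Point", "coordinates": [34.78, 32.08]}},
]

nearby = [t["name"] for t in near(towers, 34.78, 32.08, max_km=60)]
print(nearby)  # ['tel-aviv', 'jerusalem']
```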
Your Data Set is Going to be Big (starting from 1GB) and Schema is Not Stable
Adding new columns in an RDBMS can lock the entire database in some products, or create major load and performance degradation in others. This usually happens when a table is larger than 1GB (and can be a major pain for a system like BillRun, described below, which has several TB in a single table). As MongoDB is schema-less, adding a new field does not affect old rows (or documents) and is instant. Another plus is that you do not need a DBA to modify your schema when the application changes.
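A small sketch of what "schema-less" means in practice, using plain Python dictionaries in place of stored documents. The record fields are hypothetical; the point is that old documents stay untouched and the reading code supplies a default for the field they lack.

```python
# Two hypothetical call records stored side by side. The newer one carries
# a field ("roaming") that did not exist when the first one was written.
records = [
    {"type": "voice", "duration_sec": 120},
    {"type": "voice", "duration_sec": 45, "roaming": True},
]

def is_roaming(record: dict) -> bool:
    # Old documents simply lack the field, so the reader supplies a default.
    return record.get("roaming", False)

roaming_flags = [is_roaming(r) for r in records]
print(roaming_flags)  # [False, True]
```

The trade-off is that the application, not the database, is now responsible for handling documents of different shapes.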
You Don't have a DBA
If you don't have a DBA, and you don't want to normalize your data and do joins, you should consider MongoDB. MongoDB is great for class persistence, as classes can be serialized to JSON and stored as is in MongoDB. Note: if you expect to go big, be aware that you will need to follow some best practices to avoid pitfalls.
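Class persistence can be sketched with nothing but the standard library: an object becomes a dictionary, the dictionary becomes JSON, and the reverse path rebuilds the object. The Subscriber class and its fields are hypothetical, and no database is involved here. With a real driver you would typically hand the dictionary to an insert call instead of serializing it yourself.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class Subscriber:
    # Hypothetical class, used only to illustrate the round trip.
    name: str
    plan: str
    numbers: list = field(default_factory=list)

sub = Subscriber(name="Dana", plan="prepaid", numbers=["050-0000000"])

document = asdict(sub)            # object -> plain dict (document shape)
payload = json.dumps(document)    # dict -> JSON text
restored = Subscriber(**json.loads(payload))  # JSON text -> object

print(restored == sub)  # True
```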
Real World Case Study: Billing
At the last ILMUG, Ofer Cohen presented BillRun, a next-generation open source billing solution that uses MongoDB as its data store. This billing system runs in production at the fastest growing cellular operator in Israel, where it processes over 500M CDRs (call data records) each month. In his presentation, Ofer showed how the system takes advantage of MongoDB:
- Schema-less design enables rapid introduction of new CDR types to the system. It lets BillRun keep the data store generic.
- Scale: the BillRun production site already manages several TB in a single table, without being limited by adding new fields or by growth.
- Rapid replica set setup enables meeting regulatory requirements with an easy-to-set-up multi-data-center DRP and HA solution.
- Sharding enables linear, scale-out growth without running out of budget.
- With over 2,000 CDR inserts per second, MongoDB's architecture is great for a system that must support a high insert load. Yet you can still guarantee transactions with findAndModify (which is slower) and two-phase commit (at the application level).
- Developer-oriented queries enable developers to write elegant queries.
- Location-based capabilities are used to analyze user usage and determine where to invest in cellular infrastructure.
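To put the load figures above in perspective, here is a back-of-the-envelope calculation. It assumes a 30-day month and treats the quoted 2,000 inserts per second as a sustained rate the system must handle at busy times. Both are assumptions, not statements from the presentation.

```python
CDRS_PER_MONTH = 500_000_000        # "over 500M CDRs each month" (from the case study)
QUOTED_INSERTS_PER_SEC = 2_000      # "over 2,000/s CDR inserts" (from the case study)
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month

# Average insert rate if the monthly volume were spread evenly.
average_per_sec = CDRS_PER_MONTH / SECONDS_PER_MONTH

# How far the quoted rate sits above that even-spread average.
ratio = QUOTED_INSERTS_PER_SEC / average_per_sec

print(f"average: {average_per_sec:.0f} inserts/s")      # average: 193 inserts/s
print(f"quoted rate is {ratio:.1f}x the average")       # quoted rate is 10.4x the average
```

Under these assumptions, the system must handle bursts roughly ten times above its monthly average, which is why a high insert rate matters more here than the raw monthly total suggests.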
MongoDB is a great tool that should be used in the right scenarios to gain an unfair advantage in your market. BillRun is a fine example of that.