High Performance Benchmarking: MongoDB and NoSQL Systems
[This article was written by Sam Bhat]
Recently we (U.S.A) published a comprehensive independent database comparison, measuring performance across multiple dimensions using the Yahoo! Cloud Serving Benchmark (YCSB). In these tests we observed that MongoDB overwhelmingly outperformed key-value stores in both throughput and latency across a number of configurations.
We tested three different configurations to understand how each of the three systems (MongoDB, Cassandra, and Couchbase) trades off durability to maximize performance.
Testing Maximum Performance
We started by testing for maximum performance. In this configuration, each system can potentially lose data when a node fails, which may be acceptable for certain niche applications. When all three databases are configured the same way, MongoDB provides 20% greater throughput than Cassandra, and 50% greater throughput than Couchbase.
Testing Performance with No Possible Data Loss
We then tested a configuration that prevents any possible data loss. In this configuration, MongoDB outperforms Cassandra and Couchbase by more than 25x, with latency that is more than 95% better than Cassandra, and more than 99.5% better than Couchbase.
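The results mix two kinds of figures: multiples ("25x") and percentages ("95% better"). To compare them on one scale, the short sketch below converts percentages into multiples. It assumes that "X% better latency" means latency that is X% lower, which the article does not state explicitly. The function names are illustrative and are not part of YCSB or any database driver.

```python
def latency_reduction_to_speedup(percent_lower: float) -> float:
    """Convert 'X% lower latency' into a 'times lower' factor.

    Assumption: 'X% better' means the new latency is (100 - X)% of the old one.
    """
    if not 0 <= percent_lower < 100:
        raise ValueError("percent_lower must be in the range [0, 100)")
    return 1.0 / (1.0 - percent_lower / 100.0)


def throughput_gain_to_multiple(percent_higher: float) -> float:
    """Convert 'X% greater throughput' into a multiple of the baseline."""
    if percent_higher < 0:
        raise ValueError("percent_higher must be non-negative")
    return 1.0 + percent_higher / 100.0


if __name__ == "__main__":
    # Latency figures quoted for the no-data-loss configuration.
    for pct in (95.0, 99.5):
        factor = latency_reduction_to_speedup(pct)
        print(f"{pct}% lower latency is about {factor:.0f}x lower")

    # Throughput figures quoted for the maximum-performance configuration.
    for pct in (20.0, 50.0):
        multiple = throughput_gain_to_multiple(pct)
        print(f"{pct}% greater throughput is {multiple:.1f}x the baseline")
```

Under that reading, 95% lower latency is roughly a 20x reduction and 99.5% is roughly a 200x reduction, while 20% and 50% greater throughput correspond to 1.2x and 1.5x the baseline.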
Testing High Performance with Minimal Possible Data Loss
Finally, we tested a configuration that provides excellent performance with minimal possible data loss in the event of a node failure. In this configuration, MongoDB provides 3x greater throughput than Cassandra in read-intensive workloads and 70% higher throughput in write-intensive workloads, while providing 80% lower latency. Couchbase does not provide an equivalent balanced configuration.
We were surprised to find that, even with its more extensive feature set, MongoDB outperformed key-value stores at what they do best. However, because YCSB tests only a small subset of the requirements of any real application, organizations should carefully test all of their own requirements before choosing a database technology. We have posted the tests on GitHub so others can reproduce our findings.
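For readers who want to test their own requirements, the following is a minimal sketch of the kind of measurement a benchmark reports: overall throughput plus average and percentile latency. It is not YCSB and not the harness used in these tests. The `operation` callable is a hypothetical stand-in for a single database read or write, and the single-threaded loop is a simplification.

```python
import math
import time
from typing import Callable, Dict


def run_benchmark(operation: Callable[[], None], num_ops: int) -> Dict[str, float]:
    """Time `operation` num_ops times and summarize throughput and latency.

    `operation` is a hypothetical placeholder for one database call.
    """
    if num_ops <= 0:
        raise ValueError("num_ops must be positive")

    latencies = []
    start = time.perf_counter()
    for _ in range(num_ops):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()

    def percentile(p: float) -> float:
        # Nearest-rank percentile over the sorted per-operation latencies.
        rank = max(1, math.ceil(p / 100.0 * len(latencies)))
        return latencies[rank - 1]

    return {
        "throughput_ops_per_sec": num_ops / elapsed,
        "avg_latency_ms": 1000.0 * sum(latencies) / len(latencies),
        "p95_latency_ms": 1000.0 * percentile(95),
        "p99_latency_ms": 1000.0 * percentile(99),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a real read or write against your database.
    results = run_benchmark(lambda: sum(range(1000)), num_ops=10_000)
    for name, value in results.items():
        print(f"{name}: {value:.3f}")
```

A realistic test would also run many client threads, warm up the system first, and use the application's own data model and durability settings, since those choices drive the differences described above.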
About the Author - Sam Bhat
Sam Bhat has over 25 years of experience in building and managing successful businesses in a global environment. Sam is a serial entrepreneur and has several successful ventures in the technology industry to his credit. His current focus is on building a Real Time Analytics Platform (RTAP), targeted at industry-specific use cases where the real-time availability of data from a very large number of sources has opened up new and exciting possibilities for decision making. Sam has an MBA from Bombay University and a Bachelor’s Degree in Mechanical Engineering from Manipal Institute of Technology.
Published at DZone with permission of Francesca Krihely, DZone MVB. See the original article here.