Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Dissecting the NoSQL Benchmark

DZone's Guide to

Dissecting the NoSQL Benchmark

· Performance Zone
Free Resource

Evolve your approach to Application Performance Monitoring by adopting five best practices that are outlined and explored in this e-book, brought to you in partnership with BMC.

Let's dissect the Fall 2014 NoSQL Benchmark.

Apache Cassandra / DataStax Enterprise. MongoDB. Couchbase Server.

Go.

Hardware


Network: 96Gb/s (1Gb/s Switch with 48 Ports)

Wait, 96Gb/s? It's a full duplex matrix. (link)

Why not benchmark with dozens of servers?

I prefer benchmarks performd on bare-metal server, on-premise servers. It's not cheap.

Data

Why not benchmark with terabytes of data?

The goal is to identify how well database perform and scale when the working sets in memory. The working set could have been 32GB, or it could have been 384GB. It wouldn't have mattered. It fit in memory. The working set could have been 10% of the data, or it could have been 90% of the data. It wouldn't have mattered. It fit in memory.

Consider customers streaming movies from Netflix. There are many, many customers. However, only a fraction of them are streaming movies at any given point in time. That's a working set. Consider web and mobile application users. I play Words with Friends with my wife. We play two to three games at a time. Every now and then I'll launch the app. I'll spend a few, if not several minutes, playing the right words to crush her. We're all about Amazon Prime. However, we only shop once in a while. When we do, we're part of a working set. Customers who are currently shopping. Not all customers. Then there is real time data. Consider click stream analysis and sensor data. The working set is users currently visiting the web site. Not all users. The working set is the last minute, the last hour, or the last day of sensor data. Not all sensor data.

While an in-memory working set improves read performance, it does not improve write performance.

All data was persistent.

Configuration

Consistency

Apache Cassandra, eventually consistent (ONE). MongoDB and Couchbase Server, consistent.

Durability / Persistence

Apache Cassandra and MongoDB and Couchbase Server, asynchronous persistence.

Note: The commit log was disable in Apache Cassandra.

Replication

There are two copies.

Topology

Apache Cassandra and Couchbase Server, one node per server. MongoDB, two nodes per server.

Note: MongoDB relies on a master / slave topology. To ensure every server responded to client read / write requests, one master and one slave (from different shards) were installed on every server.

Workloads

Read Heavy: 95% Reads / 5% Writes
Balanced: 50% Reads / 50% Writes

Database Nodes & Client Threads

The benchmarks were performed on deployments of four, six, and eight servers.

The benchmarks were performed with an increasing number of client threads.

Results

Read Heavy

Apache Cassandra

Throughput peaks between 248 and 496 client threads: 99K ops/s.
Adding nodes increases throughput.

Latency is never less than 1ms.
Adding nodes decreases latency.

MongoDB

Thoughput peaks between 132 and 198 client threads: 227K ops/s.
Adding nodes increases throughput.

Latency is less than 1ms up to 231 client threads: 200K ops/s.
Adding nodes decreases latency.

Couchbase Server

Throughput peaks at 2,970 client threads: 1.62M ops/s.
Adding nodes increases throughput.

Latency is less than 1ms up to 1,320 client threads: 1.44M ops/s.
Adding nodes decreases latency.

MongoDB v Couchbase Server

Let's compare the throughput and latency of MongoDB and Couchbase server under equivalent load.

Couchbase Server throughput increases under increasing load. MongoDB, not so much.

Couchbase Server latency is less than 0.6ms up to 264 client threads: 642K ops/s.
MongoDB is less than 1ms up to 231 client threads: 200K ops/s

Explanation

  • MongoDB read latency benefits from memory mapped files. However, memory mapped files do not perform as well as direct memory. That, and MongoDB relies on an archaic, complex topology. In fact, the benchmark compared MongoDB with 16 nodes to Couchbase Server with 8 nodes.

Apache Cassandra v Couchbase Server

Couchbase Server throughput increases under increasing load. Apache Cassandra, not so much.

Couchbase Server latency is less than 0.7ms up to 792 client threads: 1.22M ops/s.
Apache Cassandra latency is never less than 1ms.

Explanation

Apache Cassandra relies on a poor, disabled by default row cache (JIRA).

Balanced Read / Write

Apache Cassandra

Throughput peaks between 272 and 544 client threads: 89K ops/s.
Adding nodes increases throughput.

Latency is less than 1ms up to 680 client threads: 79K ops/s.
Adding nodes decreases latency.

MongoDB

Throughput peaks at 256 client threads: 85K ops/s.
Adding nodes increases throughput.

Latency is never less than 1ms.
Adding nodes decreases latency.

Couchbase Server

Throughput peaks at 3,000 client threads: 1.18M ops/s.
Adding nodes increases throughput.

Latency is less than 1ms up to 480 client threads: 499K ops/s.
Adding nodes decreases latency.

MongoDB v Couchbase Server

Let's compare the throughput and latency of MongoDB and Couchbase Server under equivalent load.

Couchbase Server throughput increases under increasing load. MongoDB, not so much.

Couchbase Server latency is less than 0.8ms up to 360 client threads: 457K ops/s.
MongoDB latency is never less than 1ms.

Explanation

Well, a single lock doesn't help (manual | JIRA). Document level locking has been requested since 2010.

Apache Cassandra v Couchbase Server

Couchbase Server throughput increases under increasing load. Apache Cassandra, not so much.

Couchbase Server latency is less than 1ms up to 480 client threads: 499K ops/s.
Apache Cassandra latency is less than 1ms up to 680 client threads: 79K ops/s.

Explanation

Apache Cassandra write latency benefits from the memtable, but Couchbas Server was engineered for concurrency.

Conclusion

Questions?

Comment or join the discussion over on Reddit or HN.

You can download the white paper here.

Learn tips and best practices for optimizing your capacity management strategy with the Market Guide for Capacity Management, brought to you in partnership with BMC.

Topics:

Published at DZone with permission of Shane Johnson, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}