Defending the Cassandra Benchmark: What it Means to Compare NoSQL Performance

You may have heard about Jonathan Ellis criticizing Thumbtack Technology's NoSQL benchmarks on the DataStax blog - in short, he suggested that the benchmarks were improperly configured and failed to give Cassandra's performance the recognition it deserved. Well, Ben Engber at Thumbtack Technology heard about the criticism, too, and according to his recent response, Ellis is way off the mark.

Engber acknowledges that some of Ellis' basic points are valid - not every test was configured identically for each database - but different configurations, Engber argues, are necessary:

. . . Cassandra, Couchbase, Aerospike, and MongoDB are architected very differently.  This is a pretty complex discussion, and we discussed the durability question explicitly in the second part of our study.  Fundamentally, these databases work in different ways and are optimized for different things.  The trick was to create a baseline that compares them in a useful way.

From there, Engber explains the set-up of the benchmarks and what, exactly, they aimed to measure. More interesting, though, is his higher-level discussion of benchmarks and their purpose:

Let’s take a step back to why we would want to run a benchmark in the first place.  A benchmark is a synthetic thing, in a controlled environment, using a specialized and artificially designed workload.  The only reason to do such a thing is if we hope to learn something about the real world.

And according to Engber, the benchmarks did produce useful data relevant to the real world; they were not invalid. He acknowledges that Aerospike commissioned the tests, and even concedes a couple of the flaws Ellis pointed out - one, he says, is mostly a failure of documentation, while the other is a valid but negligible omission. His overall point stands, though: the differing configurations were not a fudging of the rules or a sign of bias, but a necessary concession to create a meaningful baseline for comparison.

It comes down, as it so often does, to the complexity of these technologies: another benchmark configured a different way (aiming for a different baseline) may have totally different results, depending on the strengths and weaknesses of each database. That's been a common sentiment for a while now: your database is a tool, so you need to understand its purpose, and use it only when it is appropriate.
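The sensitivity Engber describes - where a single configuration knob, such as a durability setting, can dominate the results - can be illustrated with a toy sketch. The Python snippet below is not any of the databases discussed; it just times the same synthetic write workload against two hypothetical "configurations" of a plain log file, one buffered and one forcing an fsync per write:

```python
import os
import tempfile
import time

def run_workload(write, n_ops=500):
    """Time a synthetic write-only workload; return throughput in ops/sec."""
    start = time.perf_counter()
    for i in range(n_ops):
        write(f"key{i},value{i}\n".encode())
    elapsed = time.perf_counter() - start
    return n_ops / elapsed

log = tempfile.NamedTemporaryFile(delete=False)

def buffered_write(record):
    # "Asynchronous durability": the record sits in library/OS buffers.
    log.write(record)

def synced_write(record):
    # "Synchronous durability": force every record to disk before acking.
    log.write(record)
    log.flush()
    os.fsync(log.fileno())

buffered = run_workload(buffered_write)
synced = run_workload(synced_write)
print(f"buffered: {buffered:,.0f} ops/sec, fsync-per-write: {synced:,.0f} ops/sec")
```

On most machines the fsync-per-write run is dramatically slower, which is exactly why a benchmark that leaves one database in its asynchronous default while configuring another for synchronous durability tells you little on its own.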

In other words, it may be wise to take every benchmark with a grain of salt - or, more accurately, to be precise in your understanding of what it is actually measuring.
