
Benchmarking Cassandra: The Right & Wrong Way to Do it

By Alec Noller · Jun. 24, 14 · Interview

Everybody loves comparing databases. Not everybody agrees on how to do it, though. If you ask Jonathan Ellis at the DataStax Developer Blog, a prime example of how not to do it is Thumbtack Technology's benchmarks comparing Cassandra, Couchbase, MongoDB, and Aerospike. The problem, Ellis says, is that the benchmarks give Cassandra a raw deal.

According to Ellis, the benchmarks were basically set up correctly but neglected some major points of benchmark hygiene:

Our problems start with benchmark hygiene: the read runs were run one after the other rather than properly isolating them by dropping the page cache and warming up each workload separately.  It also looks like no effort was made to isolate the effects of Cassandra compaction; compaction from the read/write workload could have continued into the read-heavy section. 
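The hygiene steps Ellis describes can be sketched as a small harness. This is a minimal illustration, not the setup used by Ellis or Thumbtack: the names `run_isolated`, `isolate`, `warmup_ops`, and `measured_ops` are hypothetical, and `drop_page_cache` assumes a Linux host and root privileges. The idea is simply that every workload gets its own clean start and its own untimed warm-up before any measurement is taken.

```python
import os
import time
from typing import Callable, Dict


def drop_page_cache() -> None:
    """Flush dirty pages, then ask the Linux kernel to drop its page cache.

    Assumption: Linux only, and the process must run as root.
    """
    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")


def run_isolated(
    workloads: Dict[str, Callable[[], None]],
    isolate: Callable[[], None],
    warmup_ops: int,
    measured_ops: int,
) -> Dict[str, float]:
    """Return mean seconds per operation for each workload.

    Each workload is isolated from the previous one (hypothetical `isolate`
    hook) and warmed up separately before the timed run begins.
    """
    results: Dict[str, float] = {}
    for name, op in workloads.items():
        # Reset shared state, e.g. drop the page cache and wait for
        # background work such as compaction to finish.
        isolate()
        # Warm up this workload only; these operations are not timed.
        for _ in range(warmup_ops):
            op()
        # Timed section.
        start = time.perf_counter()
        for _ in range(measured_ops):
            op()
        results[name] = (time.perf_counter() - start) / measured_ops
    return results
```

In a real run, `isolate` might call `drop_page_cache()` and then block until the database reports no pending compactions, while each entry in `workloads` would issue one read or write against the cluster.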

And those aren't even the biggest problems with the benchmarks, Ellis says. By Thumbtack Technology's numbers, Aerospike comes out on top or on par with Couchbase, Cassandra trails behind, and MongoDB lags even further. Ellis goes through each part of the benchmark in detail to explain which aspect of Cassandra was misunderstood or ignored.

To really nail down the argument, though, Ellis runs his own benchmarks. Due to changes in Aerospike's API, he couldn't include Aerospike in his new benchmarks, but instead substituted HBase as another representative of the top NoSQL solutions. His results came out like this:

[Benchmark results chart] (Source: Jonathan Ellis at DataStax)

It's an interesting look at the various factors one must consider when making performance comparisons, or any comparisons, given the complexity of these technologies.

The cynical might observe that benchmarks coming at the request of Aerospike (as Ellis notes) show Aerospike's excellent performance, while benchmarks coming from DataStax show Cassandra's excellent performance. The even-more-cynical might observe that both show MongoDB far below all the others - but hey, MongoDB's always being mistreated.

Check out the full article from Jonathan Ellis for all the details, and if you're looking for more in the way of Cassandra's performance, you might find something interesting here:

  • Tuning the JVM to Improve Performance in Cassandra 
  • Netflix Benchmarks on AWS Show Cassandra NoSQL Still Has the Goods

And more from Jonathan Ellis:

  • HBase vs. Cassandra



Opinions expressed by DZone contributors are their own.
