Testing YCSB Workload E for Query Performance

This set of benchmarks pits Couchbase 4.5 against MongoDB 3.2 to see which one can handle range scans more efficiently.


There are many YCSB benchmark reports out there, but only a few run Workload E with good hygiene. Most that I see customize the workload away from its original intent (95% queries and 5% inserts) to some other distribution. Recently, Avalon published a report comparing MongoDB 3.2 and Couchbase Server 4.5. One important note on this benchmark: it uses the most popular YCSB repo, so the client-side code is publicly available and reproducible, and the original benchmark definitions have not been modified in any way.

The benchmark environment is nine r3.8xlarge nodes (32 cores and 60 GB of RAM with SSDs) with 150 million items in the database. Workload E simulates threaded conversations in communication apps: each query runs a "range scan" with an "order by" and a "limit 50."
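
For reference, the stock Workload E definition in the YCSB repo looks roughly like this (these are the upstream defaults for the core workload; the exact parameter file used in the report may differ, e.g. in record counts and scan length):

```properties
# YCSB Workload E: short range scans (95%) plus inserts (5%)
workload=com.yahoo.ycsb.workloads.CoreWorkload
readproportion=0
updateproportion=0
scanproportion=0.95
insertproportion=0.05
requestdistribution=zipfian
maxscanlength=100
scanlengthdistribution=uniform
```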

[Figure: Workload E query throughput, MongoDB 3.2 vs. Couchbase Server 4.5]

The results show that MongoDB provides more than 8,000 queries/sec, while Couchbase Server delivers more than 30,000 queries/sec on the same nine-node AWS deployment.

There are a few reasons for the difference on the query workload, but I think global vs. local indexes account for the majority of it. Here is what it boils down to: with its global indexes, Couchbase Server's query execution does not have to scatter-gather, while MongoDB has to reach across the entire cluster because of its local indexes.

It is really the laws of physics, or rather of networks: when a query like the one in Workload E runs, MongoDB has to send it to every node in the cluster and have each node compute its own "top 50." For nine nodes, that means 450 items (9 × 50) travelling back to a single node, which then does a merge-sort to find the final "top 50."
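
The scatter-gather plan can be sketched in a few lines. This is a toy model, not MongoDB's actual merge code; the per-node contents are made-up stand-ins for sorted index keys:

```python
import heapq

NODES, LIMIT = 9, 50

# Each node holds a shard of the keyspace and computes its own local
# "top 50" for the range scan (keys here are illustrative integers).
per_node_results = [list(range(n, 1000, NODES))[:LIMIT] for n in range(NODES)]

# Total items shipped across the network to the coordinating node:
shipped = sum(len(r) for r in per_node_results)  # 9 * 50 = 450

# The coordinator merge-sorts the partial results and keeps the final top 50.
final_top_50 = list(heapq.merge(*per_node_results))[:LIMIT]

print(shipped)            # 450 items on the wire for one query
print(len(final_top_50))  # only 50 of them survive the merge
```

The point of the sketch: 400 of the 450 items crossed the network only to be discarded by the final merge.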

In the case of query execution with a global index in Couchbase Server, there is only a single hop to the node that holds the index. The range scan with "order by" and "limit 50" gets pushed down to the index, so only 50 items are handed back to the N1QL query engine to deliver the results.

The issue with local indexes and scatter-gather does not end here. The more nodes you add beyond the nine in this test, the worse things get. With the Workload E query (or any other range scan), each added node has to compute its own "top 50," so the network gets saturated with node-count × 50 results travelling for every query. At a query throughput in the tens of thousands per second, life gets worse the more nodes are in the mix. You can see the impact of network saturation in the results comparing throughput against the number of client threads generating the load.
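
The scaling argument above is simple arithmetic. Here is a back-of-the-envelope model (my own framing, not a measurement from the report), assuming each node returns exactly the 50-row limit and one coordinator gathers the results:

```python
LIMIT = 50

def items_on_wire_local(nodes, limit=LIMIT):
    """Local (per-shard) indexes: every node ships its own top-`limit`."""
    return nodes * limit

def items_on_wire_global(nodes, limit=LIMIT):
    """Global index: a single index node ships exactly `limit` rows."""
    return limit

for n in (9, 18, 36):
    print(n, items_on_wire_local(n), items_on_wire_global(n))
# At 9 nodes the scatter-gather plan ships 450 rows per query; at 36
# nodes it ships 1800, while the global-index plan stays constant at 50.
```

Per-query network cost grows linearly with cluster size under scatter-gather, and stays flat with a global index.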

[Figure: Query throughput vs. number of client threads]

I'd encourage everyone to dig deep into the numbers and draw their own conclusions. You can find the full disclosure report here.
