Last week, I attended a talk at the Washington DC MongoDB User Group given by John Schulz, a chief architect at AOL. In the talk, "TokuMX VS MongoDB Bake Off Based on a Primary AOL Use case", he described his experiments comparing TokuMX with MongoDB for his use case.
The experiments show TokuMX in a very favorable light. What I found interesting was why. His application went from an I/O-bound workload under MongoDB (page 18) to a CPU-bound workload under TokuMX (page 19). CPU cycles are cheaper than disk I/O, hence the better performance. But why were IOPS drastically reduced?
At a high level, databases really do only two things: write data and read data.
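TokuMX’s Fractal Tree indexes help reduce I/O for writes, but don’t help with reads. It turns out that a big benefit here came from using TokuMX’s clustering indexes. A clustering index stores the full document alongside the index key, so a query that uses it does not need a separate lookup into the main collection for each matching document. That reduced the IOPS needed for reads, which is what turned the workload into a CPU-bound one under TokuMX.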
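To make the read-side saving concrete, here is a minimal sketch in Python. It is a toy page-count model, not TokuMX's or MongoDB's actual storage engine: the page size, the function names, and the assumption that secondary keys are uncorrelated with storage order are all mine, chosen only for illustration. It counts how many data pages a range query must touch when matching documents are scattered through the main collection (non-clustering index) versus stored contiguously in index order (clustering index).

```python
import random

DOCS_PER_PAGE = 50  # hypothetical page capacity, chosen for illustration


def build_collection(n_docs, seed=0):
    """Return secondary-key values in primary storage order.

    List position = where the document lives in the main collection.
    Keys are shuffled, so secondary-key order is unrelated to storage order.
    """
    rng = random.Random(seed)
    keys = list(range(n_docs))
    rng.shuffle(keys)
    return keys


def data_pages_nonclustering(collection, lo, hi):
    """Pages fetched when the index holds only pointers to documents.

    Each match lo <= key < hi needs the main-collection page holding it.
    """
    pages = {pos // DOCS_PER_PAGE
             for pos, key in enumerate(collection) if lo <= key < hi}
    return len(pages)


def data_pages_clustering(collection, lo, hi):
    """Pages fetched when full documents are stored in index-key order."""
    ordered = sorted(collection)
    positions = [pos for pos, key in enumerate(ordered) if lo <= key < hi]
    if not positions:
        return 0
    first_page = positions[0] // DOCS_PER_PAGE
    last_page = positions[-1] // DOCS_PER_PAGE
    return last_page - first_page + 1


if __name__ == "__main__":
    docs = build_collection(100_000)
    print("non-clustering pages:", data_pages_nonclustering(docs, 1000, 2000))
    print("clustering pages:    ", data_pages_clustering(docs, 1000, 2000))
```

In this model the clustering index reads one contiguous run of pages, while the non-clustering index touches close to one page per matching document. The price is that the clustering index holds a second copy of every document, which is the extra space and write cost discussed next.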
In my post introducing clustering indexes, I mentioned that clustering indexes have a basic tradeoff: they use extra space and have a higher write cost in exchange for gains in read performance. I reasoned that these downsides are mitigated by the strengths of Fractal Tree indexes: write performance and compression. John's experiment seems like a nice example of this. Whether MongoDB or TokuMX is a better fit for other applications will of course depend on the specific use case.