In this blog post, I’ll discuss changes I’ve made to the tpcc-mysql benchmark tool. These changes make its generated data less random and add support for multiple schemas.
This post might only be interesting to performance researchers. The tpcc-mysql benchmark tool is what I use to test different hardware (as an example, see my previous post).
The first change is support for multiple schemas, rather than just one schema. Running against a single schema creates too much internal locking in MySQL on the same rows or the same index. That locking is fine if we want to compare different MySQL server versions, but it gets in the way when comparing different hardware or Linux kernels. In that case, we want to push MySQL as hard as possible so that the load reaches the underlying components. One solution is to partition several tables, but since MySQL still does not support foreign keys on partitioned tables, we would need to remove the foreign keys as well. A better solution is to use multiple schemas (which is sort of like artificial partitioning). I’ve implemented this in the latest code of tpcc-mysql.
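To picture why multiple schemas act like artificial partitioning, here is a minimal Python sketch that spreads warehouses across several schemas so that concurrent transactions touch different tables and indexes. It is only an illustration: the `tpcc_<n>` naming, the round-robin mapping, and both function names are my assumptions for this example, not how tpcc-mysql itself lays out or addresses its schemas.

```python
def schema_for_warehouse(warehouse_id: int, num_schemas: int, prefix: str = "tpcc") -> str:
    """Map a 1-based warehouse id to a schema name, round-robin.

    Hypothetical helper: the naming scheme and mapping are assumptions
    made for illustration, not tpcc-mysql's actual behavior.
    """
    if warehouse_id < 1:
        raise ValueError("warehouse_id must be >= 1")
    if num_schemas < 1:
        raise ValueError("num_schemas must be >= 1")
    return f"{prefix}_{(warehouse_id - 1) % num_schemas}"


def qualified_table(warehouse_id: int, num_schemas: int, table: str) -> str:
    """Return a schema-qualified table name for the given warehouse."""
    return f"{schema_for_warehouse(warehouse_id, num_schemas)}.{table}"


if __name__ == "__main__":
    # With 4 schemas, warehouses 1..8 are spread evenly across tpcc_0..tpcc_3.
    for w in range(1, 9):
        print(w, qualified_table(w, 4, "stock"))
```

Because every schema has its own copy of each table, foreign keys can stay in place, which is the advantage over native partitioning described above.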
The second change I proposed is replacing fully random text fields with generated text, something similar to what is used in the TPC-H benchmark. The problem with fully random strings is that they take up the majority of the space in tpcc-mysql schemas, but they aren’t at all compressible. This makes it hard to use tpcc-mysql to compare compression methods in InnoDB (as well as different compression algorithms). This implementation is available in a different branch for now.
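The effect is easy to see in a small experiment. The Python sketch below compares how well zlib compresses a fully random alphanumeric string versus text assembled from a small word list. The word list, the function names, and the use of zlib are all assumptions chosen for illustration; this is not the generator used in tpcc-mysql or in TPC-H, only a rough stand-in for the idea.

```python
import random
import string
import zlib

# Hypothetical vocabulary; a real generator would use a richer grammar.
WORDS = [
    "account", "blithely", "carefully", "deposit", "express", "final",
    "furious", "ironic", "package", "pending", "quickly", "regular",
    "request", "silent", "special", "unusual",
]


def random_text(rng: random.Random, length: int) -> str:
    """Fully random alphanumeric string of exactly `length` characters."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def generated_text(rng: random.Random, length: int) -> str:
    """Space-separated words from WORDS, truncated to exactly `length` characters."""
    parts = []
    size = 0
    while size <= length:
        word = rng.choice(WORDS)
        parts.append(word)
        size += len(word) + 1  # +1 for the separating space
    return " ".join(parts)[:length]


def compressed_ratio(text: str) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    raw = text.encode("ascii")
    return len(zlib.compress(raw)) / len(raw)


rng = random.Random(42)  # fixed seed so the comparison is repeatable
random_ratio = compressed_ratio(random_text(rng, 50_000))
generated_ratio = compressed_ratio(generated_text(rng, 50_000))

print(f"random text:    {random_ratio:.2f} of original size")
print(f"generated text: {generated_ratio:.2f} of original size")
```

The random string shrinks only a little, because an alphanumeric alphabet leaves few redundant bits per byte, while the word-based text compresses far better. That gap is what makes generated text more useful when the goal is to compare compression settings.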
If you are using tpcc-mysql, please test these changes.