
InnoDB Page Compression: The Good, the Bad, and the Ugly


In this blog post, we'll look at some of the facets of InnoDB page compression.

Somebody recently asked me about the best way to handle JSON data compression in MySQL. I took a quick look at InnoDB page compression and wanted to share my findings.

There is also some great material on this topic that was prepared and presented by Yura Sorokin at Percona Live Europe 2017. Yura also implemented Compressed Columns in Percona Server.

First, the Good Part

InnoDB page compression is actually really easy to use and provides a decent compression ratio. To use it, I just ran CREATE TABLE commententry (...) COMPRESSION="zlib"; and that's all. By the way, for my experiment, I used the subset of Reddit comments stored in JSON (described here).
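For reference, here is a minimal sketch of the whole statement as you might run it from the shell. The column list is assumed for illustration; the post doesn't show the actual schema of the Reddit-comment table:

# Minimal sketch: the schema is assumed for illustration, not the actual table
# used in the test. COMPRESSION="zlib" enables InnoDB page compression and
# requires a file-per-table tablespace on a file system that supports hole punching.
mysql -e 'CREATE TABLE test.commententry (
            id  BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
            doc JSON NOT NULL
          ) ENGINE=InnoDB COMPRESSION="zlib";'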

This method got me a compressed table of 3.9GB. Compare that to 8.4GB for the uncompressed table, and it's about a 2.15x compression ratio.

Now, the Bad Part

Because InnoDB page compression uses "hole punching," the standard Linux utilities do not always handle files created this way properly. In fact, to see the real size of 3.9GB, I had to use du --block-size=1 tablespace_name.ibd, as the standard ls -l tablespace_name.ibd shows the wrong size (8.4GB). There is a similar limitation on copying files. The standard cp old_file new_file may not always work, and to be sure, I had to use cp --sparse=always old_file new_file.
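Put together, here is how to check the real on-disk size and copy the file safely (file names as in the example above):

# ls shows the apparent (logical) size of the file and ignores the punched holes:
ls -l tablespace_name.ibd

# du with --block-size=1 shows the bytes actually allocated on disk:
du --block-size=1 tablespace_name.ibd

# A plain cp may not preserve the holes; force a sparse-aware copy to be safe:
cp --sparse=always old_file new_file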

Speaking of Copying, Here's the Ugly Part

The actual time to copy the sparse file was really bad.

On a fairly fast device (a Samsung SM863), copying the sparse file mentioned above (3.9GB compressed) took 52 minutes! That's shocking, so let me repeat it: 52 minutes to copy a 3.9GB file on an enterprise SATA SSD.

By comparison, copying the regular 8.4GB file takes 9 seconds. That's 9 seconds versus 52 minutes.

To be fair, the NVMe device (an Intel® SSD DC D3600) handles sparse files much better. It took only 12 seconds to copy the same sparse file on this device.
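If you want to reproduce the comparison on your own hardware, here is a rough sketch (file names and the destination path are placeholders):

# Time copying the hole-punched (page-compressed) tablespace vs. a regular file.
time cp --sparse=always compressed_table.ibd /path/to/destination/
time cp regular_table.ibd /path/to/destination/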

Having considered all this, it is hard to recommend that you use InnoDB page compression for serious production. Well, unless you power your database servers with NVMe storage.

For JSON data, Compressed Columns in Percona Server for MySQL should work quite well, using a compression dictionary to store common JSON keys. Give it a try!
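As a rough sketch of what that looks like (syntax from memory, so double-check the Percona Server documentation; the dictionary contents and column list are illustrative):

# Create a compression dictionary seeded with common JSON keys, then reference it
# from a compressed column. Requires Percona Server for MySQL with Compressed Columns.
mysql -e "CREATE COMPRESSION_DICTIONARY json_keys ('author body subreddit created_utc score');"
mysql -e 'CREATE TABLE test.commententry_cc (
            id  BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
            doc JSON COLUMN_FORMAT COMPRESSED WITH COMPRESSION_DICTIONARY json_keys
          ) ENGINE=InnoDB;'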

