Over a million developers have joined DZone.

ALTER TABLE Performance with Fast Index Creation

DZone's Guide to

ALTER TABLE Performance with Fast Index Creation

· Performance Zone ·
Free Resource

SignalFx is the only real-time cloud monitoring platform for infrastructure, microservices, and applications. The platform collects metrics and traces across every component in your cloud environment, replacing traditional point tools with a single integrated solution that works across the stack.

Today I was looking at the ALTER TABLE performance with fast index creation and without it with different buffer pool sizes. Results are pretty interesting. I used modified Sysbench table for these tests because original table as initially created only has index on column K which initially contains only zeros, which means index is very small and also very fast to build by insertion as insertions happen in the “end” of index tree. I’ve updated column to have bunch of long random strings update sbtest set c=concat(sha1(rand()),’-',sha1(rand()),’-',sha1(rand()),’-',sha1(rand()),’-',sha1(rand())); and added key on column C: alter table sbtest add key c(c); The box I’m using for test is rather old box with 8GB of RAM and 4*SATA disks in RAID10. I used 10mil row table which would look as following in terms of data and index size:

mysql> show table status like "sbtest" \G
*************************** 1. row ***************************
           Name: sbtest
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 10000060
 Avg_row_length: 224
    Data_length: 2247098368
Max_data_length: 0
   Index_length: 1460322304
      Data_free: 7340032
 Auto_increment: 10000001
    Create_time: 2012-06-27 13:04:56
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
1 row in set (0.00 sec)

So we have about 3.7GB total size out of which 1.4GB is the Index.

First Lets look at results of Adding and Dropping column for 128M buffer pool, sized to illustrate the case of index being significantly larger than amount of memory.

Blue Columns in this case is the default configuration, second is with expand_fast_index_creation=1 which builds table with primary key only when creating a table and when builds all indexes by sort. The speedup we can observe is over 20 times. Note this benefit only applies to the case when you’re adding/dropping/changing columns as if you just add index it will be done by sort even without this additional Percona Server feature.

Lets now see with 6GB buffer pool which is large enough to fit complete table with index (but is a bit too small to keep 2 copies) In this case we’ll see the following picture:

Comparing these two graphs we can see few different observations:

Buffer Pool Size Matters There is 10x difference between the case when index fits in memory to when it does not. The difference can be even larger in the real world. Working with indexing, such as inserting data in random order can become very slow when index does not fit to memory any more.

Creating Index by sort is great Creating index by sort can provide over 20x performance improvement to the speed of ALTER TABLE, as this is not only step which ALTER TABLE does the improvement to the index creation speed itself has to be even larger. It is also offers better performance for in memory case, even though in this case the gain is just about 2x for this workload. Still substantial.

Sorting needs OS Cache It is not clearly seen from the graphs but with index built by sort enabled the performance of ALTER TABLE with 128MB was better than with 6GB. It looks like this does not make sense but in fact it does if you think about building index stage. On this box with 8GB of memory allocating 6GB to buffer pool made very little memory available for OS cache, so index build process required more IO than if only 128MB were used for buffer pool. The difference was rather small – less than 10% though, which is because disks are not that bad with sequential IO, which is what file merge process uses. This is important illustration as in many cases when people see performance reduction with large buffer pool they forget what bottleneck might be somewhere else where memory resources might be just needed more.

Take Away: If you’re having large tables and need to run ALTER TABLE which rebuilds the table or OPTIMIZE TABLE do not forget to enable expand_fast_index_creation it can make a whole process a lot faster.

P.S The tests were done with Percona Server 5.5.24-26.0



SignalFx is built on a massively scalable streaming architecture that applies advanced predictive analytics for real-time problem detection. With its NoSample™ distributed tracing capabilities, SignalFx reliably monitors all transactions across microservices, accurately identifying all anomalies. And through data-science-powered directed troubleshooting SignalFx guides the operator to find the root cause of issues in seconds.


Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}