Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

ANALYZE TABLE Is No Longer a Blocking Operation

DZone's Guide to

ANALYZE TABLE Is No Longer a Blocking Operation

In this post, I'll discuss the fix for lp:1704195 (migrated to PS-2503), which prevents ANALYZE TABLE from blocking all subsequent queries on the same table.

· Database Zone ·
Free Resource

Compliant Database DevOps and the role of DevSecOps DevOps is becoming the new normal in application development, and DevSecOps is now entering the picture. By balancing the desire to release code faster with the need for the same code to be secure, it addresses increasing demands for data privacy. But what about the database? How can databases be included in both DevOps and DevSecOps? What additional measures should be considered to achieve truly compliant database DevOps? This whitepaper provides a valuable insight. Get the whitepaper

In November 2017, Percona released a fix for lp:1704195 (migrated to PS-2503) created by Laurynas Biveinis. The fix, included with Percona Server for MySQL since versions 5.6.38-83.0 and 5.7.20-18, stops ANALYZE TABLE from invalidating query and table definition cache content for supported storage engines (InnoDB, TokuDB, and MyRocks).

Why Is This Important?

In short, it is now safe to run ANALYZE TABLE in production environments because it won't trigger a situation where all queries on the same table stack are in the state "Waiting for table flush." Check this post for details on how this situation can happen.

Why Do We Need to Run ANALYZE TABLE?

When Optimizer decides which index to choose to resolve the query, it uses statistics stored for this table by storage engine. If the statistics are not up to date, Optimizer might choose the wrong index when it creates the query execution plan. This can cause performance to suffer.

To prevent this, storage engines support automatic and manual statistics updates. While automatic statistics updates usually work fine, there are cases when they do not do their job properly.

For example, InnoDB uses 20 sample 16K pages when it updates persistent statistics, and eight 16K pages when it updates transient statistics. If your data distribution is even, it does not matter how big your table is — even for 1T tables, using a sample of 320K is enough. But if your data isn't even, statistics might get wrongly created. The solution for this issue is to increase either the innodb_stats_transient_sample_pages or innodb_stats_persistent_sample_pagesvariable. But increasing the number of pages to examine while collecting statistics leads to longer update runs, and thus higher IO activity, which is probably not what you want to happen often.

To control this, you can disable automatic statistics updates for such tables, and schedule a job that periodically runs ANALYZE TABLE.

Will it be safe before the fix for lp:1704195 (migrated to PS-2503)?

Theoretically, yes, but we could easily hit a situation as described in this post by Miguel Angel Nieto. The article describes what if some long-running query starts and doesn't finish before ANALYZE TABLE. All the queries on the analyzing table get stuck in the state "Waiting for table flush" at some time.

This happens because before the fix, worked as follows:

  1. Updates table statistics: Concurrent DML operations (INSERT/UPDATE/DELETE/SELECT) are allowed.
  2. Update finished.
  3. Invalidates table entry in the table definition cache: Concurrent DML operations are forbidden.
    1. What happens here is that ANALYZE TABLE marks the currently open table share instances as invalid. This does not affect running queries — they will complete as usual. But all incoming queries will not start until they can re-open table share instance. And this will not happen until all currently running queries complete.
  4. Invalidates query cache: Concurrent DML operations are forbidden.

The last two operations are usually fast, but they cannot finish if another query touched either the table share instance or acquired query cache mutex. And, in its turn, it cannot allow for incoming queries to start.

However, ANALYZE TABLE modifies table statistics, not table definition!

Practically, it cannot affect already running queries in any way. If a query started before ANALYZE TABLE finished updating statistics, it uses old statistics. ANALYZE TABLE does not affect data in the table. Thus, old entries in the query cache will still be correct. It hasn't changed the definition of the table. Therefore, there is no need to remove it from the table definition cache. As a result, we avoid Operations 4 and 5 above.

The fix for lp:1704195 (migrated to PS-2503) removes these additional updates and locks required for them, and makes ANALYZE TABLE always safe to run in busy production environments.

Compliant Database DevOps and the role of DevSecOps DevOps is becoming the new normal in application development, and DevSecOps is now entering the picture. By balancing the desire to release code faster with the need for the same code to be secure, it addresses increasing demands for data privacy. But what about the database? How can databases be included in both DevOps and DevSecOps? What additional measures should be considered to achieve truly compliant database DevOps? This whitepaper provides a valuable insight. Get the whitepaper

Topics:
database ,analyze table ,queries ,percona ,table flush ,optimization

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}