Flexible MariaDB Server Query Cache
In this article, explore the MariaDB Server query cache.
The MariaDB Server Query Cache
The Server Query Cache (QC) is a well-known feature of MariaDB Server-it caches SQL statements and the corresponding result sets. If a subsequent query then matches a cached SQL statement, the query can reuse the result set. The Query Cache also knows to invalidate the Query Cache if a table corresponding to a cached query is modified, for example by an UPDATE. For more information, see Query Cache in the MariaDB Knowledge Base.
In previous decades, amounts of RAM were limited, disks were slow, and CPUs had one core and one thread. If you wanted more cores, the cost was restrictive-up to several years' salary for two CPU machines. In those days, the Query Cache was a good idea. The likelihood of data being in limited-size disk caches was small. Getting data from disk was a bad idea as it meant dealing with the slow performance of magnetic media. Today we have faster, solid-state disks and more RAM, so the Query Cache is less important. Reducing reliance on cache alleviates its negative effects, such as mutex blocking.
A mutual exclusion object (mutex) is a programming object that allows multiple program threads to share a resource (such as a folder) but not simultaneously. Mutex is set to unlock when the data is no longer needed or when a routine is finished. Mutex creates a bottleneck effect. The blocking means only one query can look at the Query Cache at a time and other queries must wait. A query that must wait to look in the cache only to find it isn't in the cache will be slowed instead of being accelerated.
You might also be interested in: Improve MariaDB Performance Using Query Profiling
Query Cache Flags
The Query Cache has three modes:
OFF mode nothing will be cached. In either
DEMAND modes, flags can be used to determine which statements are cached.
DEMAND mode, only specified statements are cached. To specify that a statement should be cached, the
SQL_CACHE flag is set in a
SELECT statement caches only the SQL text and corresponding result of set for the specified statements. Three issues emerge with this assumption:
- The application needs modification to any applicable
SELECTto use this flag.
- When the
SQL_CACHEis missing in the
SELECTstatement, it will not be eligible for inclusion in the cache and doesn't check for any previous cached statements; therefore, it will never use the Query Cache.
- The DML, such as
DELETEstatements, need to be inspected for invalidation, which will incur in the blocking effects of the Query Cache mutex.
If the Query Cache is
ON, you can avoid a statement from being cached or looked up in the cache, by including the
Query Cache Concerns
Remember, in the previous decades there were only one or two simultaneous threads. When you had a single CPU with a single thread, Query Cache mutex wasn't a big deal because there was little chance another thread would run concurrently. Query Cache mutex wasn't regarded as a bad thing. You just noticed that when Query Cache was switched on, your website was suddenly running ten times faster.
Today if you run modern hardware with 16 threads, even without Query Cache, your website runs hundreds of times faster. If Query Cache were switched on, the mutex slows it down to a rate comparable to previous decades. Ugh.
The MaxScale Query Filter
MariaDB MaxScale database proxy provides another Query Cache as part of the cache filter. See Cache in the MariaDB Knowledge Base. The cache filter overcomes the limitations of the Query Cache inside MariaDB Server, but it adds a few obstacles of its own.
MaxScale cache filter eliminates the mutex issues; and it's more configurable compared to the MariaDB Server Query Cache. You can specify which tables, databases, columns, etc. are affected with a flexible set of rules.
Having the option to specify tables is useful if we don't want all the tables cached, but it makes the cache filter more difficult to configure. With complexity such as this, how does invalidation work in the query filter?
We could check all DML processed by MariaDB MaxScale and then invalidate the relevant tables. Using this approach will not work because any processing inside the MariaDB Server that modifies a table will not be seen by MaxScale, which includes stored procedures, functions, and triggers.
When using the cache filter each statement has time to live (TTL). When the statement times out, the cache needs to be refreshed. The tuning is difficult because we want to cache as long as possible, so we don't have to revisit the server to refresh, but that means the data being returned may be outdated.
The query filter has an advanced runtime configuration to deal with tuning issues. MariaDB MaxScale is useful, if you want to make this effort, but it's more difficult to set up and configure than using Query Cache in MariaDB Server.
Mutex Work Around
Attempts have been made to find a way to work around mutex in the Query Cache. Unfortunately, because mutex is central to internal processing, it's still not solved. What if we could at least specify which tables and SQL statements cached in the MariaDB Server Query Cache? There would still be an issue with cache invalidation statements, but at least the cache focuses on tables where it makes sense to use it, such as tables with mostly static content and where the SELECTS are frequent (for example products on a website).
Selecting from the products table would be speedy because we can cache the result for many queries. We would not see much cache invalidation, as the set of products rarely changes. Can we fix that with what is available right now? Yes, let's have a look.
Fixing the Query Cache
When the Query Cache type is set to
SELECT statements will use the Query Cache when explicitly specified by the
SQL_CACHE flag. This is done in the MariaDB Server Configuration File, /etc/my.cnf.d/server.cnf in the [mysqld] section, where we add:
Now, we don't want to rewrite all our applications to selectively use the
SQL_CACHE flag-we'll use MariaDB MaxScale to do that for us.
The regex filter is used this time, as it's not a query filter. The regex filter is intuitive and allows a search and replace of SQL statements being processed. In this instance,
SQL_CACHE is added to the selected statements. The examples below show a simple website table with two tables ( products and customers) that are hit all the time.
1. We can configure the MaxScale regex filter to insert SQL_CACHE into any statement, such as:
2. This configuration for the regex filter uses the PCRE2 syntax for matching:
PCRE2 regular expression syntax is intricate, yet powerful. Below is an element by element explanation of the above regular expression:
3. The following configuration settings tell the MariaDB MaxScale router to use the filter created:
4. With configuration in place, restart MariaDB Server and MariaDB MaxScale. We're now ready to test.
Testing the Fixed Query Cache
To test the configured Query Cache, connect to MariaDB through the MariaDB MaxScale port of your choice (this example uses port 4004).
1. Connect to the database.
2. Review what is currently in the Query Cache.
3. Note the tables order and order_line should not be cached unlike tables products and customers.
Did the tables hit the Query Cache? Review the following and notice nothing was inserted into the Query Cache, and there were no hits.
Next, let's check a query on the products table.
We can see the products within the table. Let's have a look at the status of the Query Cache again.
An insert was made into the Query Cache. When we run the same query again will we have the same result?
It worked the way it was supposed to. There is one insert into the Query Cache and one hit.
Next, let's try to invalidate the cache.
The number of queries in the cache is 0 because the cache was invalidated. Let's try one more time to invalidate the cache.
The statement was inserted into the cache as expected (
Qcache_inserts is now 2). The same statement was inserted twice, and the result set was invalidated in between the two executions.