Load Management Techniques for MySQL
The first thing you need to know is that this is not a MySQL problem. It might not be a problem with your MySQL configuration, queries, or hardware either, even though fixing these does help in many cases. However powerful and well tuned your system is, if you put a concurrent load on it that is too heavy, response times will increase and user experience will suffer.
So what can you do to prevent this problem from happening? The answer is easy: throttle the side load so it does not consume too many system resources. Here are some specific techniques to use.
Do not push concurrency too high:
Many developers will
test scripts with multiple levels of concurrency and find that doing the work
from 32 processes is faster than from just one process. This is true
if you have the system completely at your disposal. However, if you need the
system to serve other users too, you typically need to reduce concurrency
to a level that does not overload the system. Unless it is a really time
critical process, I would not use more than 4 parallel processes heavily
writing to the database.
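
As a minimal sketch of capping writer concurrency, the batch work can be pushed through a fixed-size worker pool. The names run_batches and write_batch are hypothetical, and the default cap of 4 simply mirrors the rule of thumb above; it is not a tuned value.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: cap taken from the "no more than 4 writers" rule of thumb.
MAX_WRITERS = 4


def run_batches(batches, write_batch, max_writers=MAX_WRITERS):
    """Call write_batch(batch) for every batch, at most max_writers at a time.

    write_batch is a caller-supplied (hypothetical) function that performs
    the actual database writes for one batch.
    """
    with ThreadPoolExecutor(max_workers=max_writers) as pool:
        # list() consumes the results so worker exceptions surface here.
        return list(pool.map(write_batch, batches))
```

Lowering max_writers is then a one-argument change when the database has to serve other users at the same time.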
Introduce Throttling:
Sometimes even a single process
overloads the system. In this case, throttling by keeping queries relatively
short and introducing “sleeps” between them can be a good idea.
It also often helps to avoid monopolizing replication threads. For example,
if I need to delete old data, instead of DELETE FROM TBL WHERE ts<"2010-01-01" I’ll run DELETE FROM TBL WHERE ts<"2010-01-01" LIMIT 1000
in a loop until no more rows need to be deleted. I may inject a
“sleep” between iterations that is as long as the query execution time, so
the longer the queries run and the more the system is loaded, the more “rest”
it will get. Alternatively, you can look at the “threads_running”
status variable, which is a very good, simple indicator of the current load, and
sleep based on its value. For example, you may choose to pause the
script if the load is too high and wait for threads_running to go
below a certain value.
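
The chunked delete with proportional sleeps and a threads_running check might look roughly like the sketch below. It assumes a Python DB-API connection whose driver uses %s placeholders (as the common MySQL drivers do) and reuses the TBL table and ts column from the example above. The function names, the threshold of 10 and the one-second poll interval are illustrative assumptions, not recommendations.

```python
import time


def threads_running(cursor):
    """Read the Threads_running status counter through a DB-API cursor."""
    cursor.execute("SHOW GLOBAL STATUS LIKE 'Threads_running'")
    row = cursor.fetchone()  # (Variable_name, Value)
    return int(row[1])


def throttled_delete(conn, cutoff="2010-01-01", chunk=1000,
                     max_threads_running=10, poll_seconds=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Delete rows older than cutoff in small chunks, resting in between.

    sleep and clock are injectable only so the loop can be exercised
    without a real server; the defaults are the standard library calls.
    Returns the total number of rows deleted.
    """
    cursor = conn.cursor()
    total = 0
    while True:
        # Pause while the server is busy (threshold is an assumption).
        while threads_running(cursor) > max_threads_running:
            sleep(poll_seconds)

        started = clock()
        cursor.execute("DELETE FROM TBL WHERE ts < %s LIMIT %s",
                       (cutoff, chunk))
        deleted = cursor.rowcount
        conn.commit()
        elapsed = clock() - started

        total += deleted
        if deleted == 0:
            return total  # nothing left to delete

        # Rest for as long as the chunk took: slower server, longer rest.
        sleep(elapsed)
```

Because each statement touches at most chunk rows, no single DELETE runs for long, and the rest period grows automatically when the server slows down.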
Tuning Cron:
It also often helps to look into cron
or whatever other scheduling system you’re using. Frequently, way too many
scripts are started at once, or so close to each other that they
start to overlap and produce the overload. Solutions include
spacing them out and introducing some “job control” to ensure scripts do
not run in parallel if they should not (so you don't get
many copies of the same script running at once). One simple solution:
instead of having a bunch of scripts scheduled to start at midnight, 1AM, 2AM,
I can put them into nightly.sh one after another and schedule that
to run at midnight. This way the scripts run one after another at
their own pace.
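
One possible form of such “job control” is a lock file that lets only one copy of the nightly runner execute at a time, combined with running the jobs strictly in sequence. The sketch below assumes a Unix-like system (it relies on flock), and the function names and the default lock path are hypothetical.

```python
import fcntl
import subprocess
import sys


def acquire_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # Non-blocking: fail immediately if another run holds the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_nightly(jobs, lock_path="/tmp/nightly.lock"):
    """Run each command in order; skip the run if one is already in progress.

    jobs is a list of argument lists, as accepted by subprocess.run.
    Returns the list of exit codes, or None when the run was skipped.
    """
    lock = acquire_lock(lock_path)
    if lock is None:
        return None
    try:
        # Each job starts only after the previous one has finished.
        return [subprocess.run(cmd).returncode for cmd in jobs]
    finally:
        lock.close()  # closing the file releases the lock


if __name__ == "__main__":
    # Hypothetical usage: pass script paths on the command line.
    print(run_nightly([[script] for script in sys.argv[1:]]))
```

If the previous night's run is still going when cron fires again, the new invocation returns without starting anything instead of piling more load onto the server.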
Dedicated Slave:
I remember listening to one of Cary
Millsap’s talks, where he recommended moving the load in time and space
as an optimization technique. We spoke about moving load in time above,
but we can also move it in space by putting it on a different system,
which in the MySQL world is most commonly a dedicated slave. In a lot of
environments, especially those with a low level of operational/development
discipline to enforce the previous solutions, it can be a life saver. Of
course, it only works for read jobs, which is an important limitation.
Getting slave(s) for batch jobs can help in other ways too, for example
by reducing competition for the buffer pool between different kinds of
workloads.
innodb_old_blocks_time:
Surprisingly simple but effective, setting innodb_old_blocks_time=1000
can often be very helpful in preventing batch jobs from washing away buffer
pool contents and so making normal user queries a lot more disk bound
and slower. I wrote about it in more detail a few months ago.
Finally, let's touch upon the discovery question. To deal with load management you need to understand when the problem is happening in your environment (we want to catch it before users complain, right?) and, if it does happen, exactly which jobs cause the overload. In complex environments this might be a hard question. pt-stalk is a great tool for this purpose. Getting it running can help you collect the state of your system at the moment it was overloaded with side load. The wealth of data it generates will most likely contain the answers you’re looking for.
Published at DZone with permission of Peter Zaitsev, DZone MVB. See the original article here.