Top Redis Headaches for DevOps: Replication Buffer
Originally written by Yaron Dolev
Redis provides a wide variety of tools for keeping in-memory database usage efficient. Its data types and commands let a database serve application requests without additional processing at the application level, but misconfiguration, or more often simply running with the out-of-the-box configuration, can (and does) lead to operational challenges and performance issues. These setbacks have caused quite a few headaches, yet solutions do exist and may be simpler than anticipated.
This series of installments highlights some of the most irritating issues that come up when using Redis, along with tips on how to solve them. The tips are based on our real-life experience of running thousands of Redis database instances.
The Replication Buffer Limit
Replication buffers are memory buffers that hold data while a slave Redis server synchronizes with the master server. In a full master-slave synchronization, the master holds in the replication buffer any changes made to the data during the initial phase of the synchronization. Once the initial phase completes, the contents of the buffer are sent to the slave. The size of this buffer is limited, and when the maximum is reached, replication restarts from the beginning, as mentioned in our post on endless Redis replication loops. To prevent this, the buffer must be configured up front according to the amount and type of changes expected during the replication process. For example, a low volume of changes and/or small changes can get by with a smaller buffer, whereas many changes and/or large changes need a larger one. A more comprehensive solution is to set the buffer limits very high, so that even a lengthy or heavy replication process does not exhaust the buffer. Ultimately, this solution requires fine-tuning for the specific database at hand.
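The sizing reasoning above can be turned into a rough back-of-the-envelope calculation. The following is a minimal sketch, not a Redis API: the function name, the `safety_factor` parameter, and the example write rate and sync duration are all assumptions chosen for illustration.

```python
def estimate_replication_buffer_bytes(write_bytes_per_sec, sync_seconds, safety_factor=2.0):
    """Rough estimate of the bytes a master may need to buffer during a full sync.

    Assumption: changes accumulate at a steady rate for the whole initial
    phase. safety_factor is a hypothetical margin for bursts, not a Redis setting.
    """
    if write_bytes_per_sec < 0 or sync_seconds < 0 or safety_factor <= 0:
        raise ValueError("rates and durations must be non-negative, factor positive")
    return int(write_bytes_per_sec * sync_seconds * safety_factor)


# Example: 2 MB/s of changes while a 120-second initial phase runs.
needed = estimate_replication_buffer_bytes(2 * 1024 * 1024, 120)
print(needed // (1024 * 1024), "MB")  # prints "480 MB"
```

Real workloads are rarely steady, so a figure like this is only a starting point for the fine-tuning described above.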
Redis default setting:
> config get client-output-buffer-limit
1) "client-output-buffer-limit"
2) "normal 1073741824 536870912 30 slave 268435456 67108864 60 pubsub 33554432 8388608 60"
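The value returned by this command is a flat string of four-token groups: client class, hard limit in bytes, soft limit in bytes, and soft-limit seconds. As a minimal sketch (the `BufferLimit` type and the function name are hypothetical helpers, not part of Redis or any client library), it can be split into readable per-class limits like this:

```python
from typing import Dict, NamedTuple


class BufferLimit(NamedTuple):
    hard_bytes: int
    soft_bytes: int
    soft_seconds: int


def parse_client_output_buffer_limit(value: str) -> Dict[str, BufferLimit]:
    """Split '<class> <hard> <soft> <seconds> ...' into per-class limits."""
    tokens = value.split()
    if len(tokens) % 4 != 0:
        raise ValueError("expected groups of four tokens")
    limits = {}
    for i in range(0, len(tokens), 4):
        name, hard, soft, seconds = tokens[i:i + 4]
        limits[name] = BufferLimit(int(hard), int(soft), int(seconds))
    return limits


# The string shown in the output above.
raw = ("normal 1073741824 536870912 30 "
       "slave 268435456 67108864 60 "
       "pubsub 33554432 8388608 60")
slave = parse_client_output_buffer_limit(raw)["slave"]
print(slave.hard_bytes // 2**20, slave.soft_bytes // 2**20, slave.soft_seconds)  # prints "256 64 60"
```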
As explained here, with this default configuration the replication link is broken (causing the synchronization to start from the beginning) once the 256MB hard limit is reached, or if the 64MB soft limit (67,108,864 bytes) is reached and held for 60 continuous seconds. In many cases, especially with a high write load and insufficient bandwidth to the slave server, the replication process never finishes. This can lead to an infinite loop in which the master Redis is constantly forking and snapshotting the entire dataset to disk, which can cause up to triple the amount of extra memory to be used, together with a high rate of I/O operations. In addition, the slave is never able to catch up and fully synchronize with the master Redis server.
A simple solution that offers an immediate improvement is to increase the size of the slave output buffer by setting both the hard and soft limits to 512MB:
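To make the hard and soft limit rules concrete, here is a simplified model of when the link would be dropped. It is a sketch under stated assumptions: the function name and the once-per-second samples are hypothetical, and the real server's checks may differ in detail.

```python
MB = 1024 * 1024


def first_disconnect_time(samples, hard_bytes, soft_bytes, soft_seconds):
    """Return the first timestamp at which the link would be dropped, or None.

    samples is an iterable of (timestamp_seconds, buffer_bytes) pairs.
    A limit of 0 is treated as disabled.
    """
    soft_since = None  # when the buffer first crossed the soft limit
    for t, used in samples:
        if hard_bytes and used >= hard_bytes:
            return t  # hard limit: dropped immediately
        if soft_bytes and used >= soft_bytes:
            if soft_since is None:
                soft_since = t
            elif t - soft_since > soft_seconds:
                return t  # stayed over the soft limit for too long
        else:
            soft_since = None  # dropped back below the soft limit
    return None


# Assumed workload: the buffer grows by 1 MB every second.
slow = [(t, t * MB) for t in range(300)]
print(first_disconnect_time(slow, 256 * MB, 64 * MB, 60))  # prints 125 (soft limit)

# Assumed workload: the buffer grows by 10 MB every second.
fast = [(t, 10 * t * MB) for t in range(300)]
print(first_disconnect_time(fast, 256 * MB, 64 * MB, 60))  # prints 26 (hard limit)
```

In the slow case the soft limit, not the hard limit, is what breaks the link, which is why both values matter when tuning.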
> config set client-output-buffer-limit "slave 536870912 536870912 0"
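The byte values in this command are easy to mistype. A small helper, shown here as a hypothetical sketch rather than part of any Redis client, can build the argument string from megabyte values:

```python
MB = 1024 * 1024


def build_limit_argument(client_class, hard_mb, soft_mb, soft_seconds):
    """Build the value string for client-output-buffer-limit.

    Only the three class names that appear in this article's output are accepted.
    """
    if client_class not in ("normal", "slave", "pubsub"):
        raise ValueError(f"unknown client class: {client_class}")
    if min(hard_mb, soft_mb, soft_seconds) < 0:
        raise ValueError("limits must be non-negative")
    return f"{client_class} {hard_mb * MB} {soft_mb * MB} {soft_seconds}"


print(build_limit_argument("slave", 512, 512, 0))  # prints "slave 536870912 536870912 0"
```

The resulting string matches the quoted argument passed to config set above.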
As with many reconfigurations, it is important to understand that:
- Before increasing the size of the replication buffers, you must make sure you have enough memory on your machine.
- The Redis memory usage calculation does not take the replication buffer size into account.
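Both caveats can be folded into a quick headroom check before raising the limit. This is only a sketch: the function name, the one-buffer-per-slave worst case, and the `reserve_bytes` allowance for everything else on the machine are assumptions for illustration, not Redis behavior.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def has_memory_headroom(total_ram_bytes, dataset_bytes, hard_limit_bytes,
                        slaves=1, reserve_bytes=0):
    """Check whether the machine can hold the dataset plus full slave buffers.

    Assumes the worst case in which every slave's output buffer grows to the
    hard limit at the same time. reserve_bytes is a hypothetical allowance for
    the OS and other processes.
    """
    worst_case = dataset_bytes + slaves * hard_limit_bytes + reserve_bytes
    return worst_case <= total_ram_bytes


# Assumed machine: 8 GB RAM, 6 GB dataset, 1 GB reserved, 512MB hard limit.
print(has_memory_headroom(8 * GB, 6 * GB, 512 * MB, slaves=2, reserve_bytes=1 * GB))  # prints True
print(has_memory_headroom(8 * GB, 6 * GB, 512 * MB, slaves=3, reserve_bytes=1 * GB))  # prints False
```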
That brings us to the end of our first installment of the top Redis operational headaches. As pointed out above, when it comes to replication buffer limits, proper configuration can go a long way. Be sure to keep an eye out for the next post in this compilation, which covers replication timeouts and how to handle them.
Continue reading the next post in this series: Replication Timeouts
Published at DZone with permission of Itamar Haber, DZone MVB. See the original article here.