Redis and Redis clustering work very differently from other data stores and data store clusters. The differences are not always obvious and may only surface later, while you are already using Redis, as happened in our case. We run a Redis cluster and, fortunately, have not faced many issues with it so far. That does not mean we never will, so we need to be prepared.
Recently we were working on getting a Redis cluster up and running with Docker Compose, and we discovered some of these differences, which later led to some disillusionment on my part. I thought there should be a 'document of facts' on Redis and Redis cluster that people, myself included, can refer to. So I decided to create one. Enjoy:
Redis is great as a single server.
In a Redis cluster, all your masters are active at the same time: each one serves reads and writes for its own share of the hash slots.
Every master in a cluster knows every other master/node in the cluster.
There is no single master looking over the orchestration job.
During clustering (sharding), the masters agree upon the division of load: who owns which of the 16384 hash slots.
Each master speaks only for itself. If you ask a master for a key and the key's hash slot happens to live on that master, it returns the value. Otherwise it returns a ‘redirection’ (a MOVED error) pointing to the master that owns the slot for this key.
It is then the client’s job to resend the request to this new master based on the redirection.
Clients try to keep their own copy of the mapping of hash slots to masters in sync with the cluster, so that most requests go straight to the right master and retrieval is faster.
Every master knows other masters by IP and IP only. It is not possible to use a hostname.
The knowledge about the other nodes in the cluster is stored in a file called nodes.conf. Although the extension gives the impression of a user-modifiable configuration file, it is not a file for humans to modify.
Every master must know the other masters by an actual IP that clients can reach; it is not possible to use a loopback address (like 127.0.0.1). If you do, you end up with a max-redirection error. Here is how it happens: when a client asks for a key and the server responds with a redirection, the ‘smart’ client is expected to follow the redirection and get the value from the other node. Now, the ‘dumb’ server responds with the only IP it knows the other node by, which is the loopback address on the Redis server. But the ‘smart’ client (Jedis) is not smart enough to understand that the loopback address belongs to the server node, and it apparently starts looking for a Redis node on its own host! Whatever. Just avoid doing that.
When two nodes meet to form a cluster, one of them has to forgo its data. One of the two must be empty.
Replicas do not live inside masters, or inside any other nodes for that matter. Unlike what we know about clustering in Elasticsearch or Kafka-like services, replicas in Redis are independent nodes. So if you want a replication factor of two (two replicas per master) and have three masters, you effectively need 3 * 2 + 3 = 9 nodes in the cluster.
If a master drops off, it is not possible to bring it back into the cluster with its data intact. This is an implication of point 12.
If you need to perform any updates to any of the nodes/servers, take points 12 and 14 into consideration. Take the master out, upgrade it, flush its data, and reconnect it as a replica (slave). That is how it works.
Converting a single server to a cluster is not officially supported. There is one blog post by a smart person showing a workaround for such a migration. The inverse of this, cluster to single server, is likely to be equally painful.
Redis/Redis Clustering is not officially supported on a Windows machine. There are unofficial ways to achieve something of the sort, such as MSOpenTech’s Redis implementation, which now also supports clusters.
The Java client, Jedis, has two different classes: one for connecting to a single standalone server (Jedis) and another for connecting to a cluster (JedisCluster). So if you decide to use a cluster in production, you cannot simply switch to a single server during development, and the implication is unnecessary load on your laptops. This can be managed with environment-aware wiring. We worked around it by creating a jar with a class that, upon post-construct, replaces the cluster-client reference of our internal cache utility class with a single-server Jedis client. Just placing this jar on the classpath during development solves it for us.
Running a Redis cluster in Docker has its own pain points; more on that later. (There should be a document like this one for Docker too.)
Extending point 11 (every master must know other masters via IP): if you have two network interfaces on the nodes and two isolated networks for two services that use this Redis cluster, how will that work out? Such a setup is not unusual in Docker Compose, where we isolate services into different networks. We will need to see how Redis behaves in such a setup.
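To make the points about hash slots and redirections more concrete, here is a minimal Java sketch of what a cluster-aware client has to do. It assumes the scheme described in the Redis Cluster specification: the slot is the CRC16 (XMODEM variant) of the key modulo 16384, a non-empty `{...}` hash tag restricts hashing to the tag, and a redirection arrives as an error of the form `MOVED <slot> <host>:<port>`. The class and method names (`ClusterRouting`, `slotFor`, `parseMoved`) are made up for this illustration and are not part of Jedis.

```java
import java.nio.charset.StandardCharsets;

final class ClusterRouting {
    static final int SLOT_COUNT = 16384;

    // CRC16 XMODEM variant: polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // Hash tags: if the key has a non-empty "{...}" section, only that section is hashed.
    static String hashedPart(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                return key.substring(open + 1, close);
            }
        }
        return key;
    }

    static int slotFor(String key) {
        return crc16(hashedPart(key).getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }

    // Target of a redirection, as reported by the node that does not own the slot.
    static final class Redirect {
        final int slot;
        final String host;
        final int port;

        Redirect(int slot, String host, int port) {
            this.slot = slot;
            this.host = host;
            this.port = port;
        }
    }

    // Parses an error such as "MOVED 3999 127.0.0.1:6381".
    static Redirect parseMoved(String error) {
        String[] parts = error.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].equals("MOVED")) {
            throw new IllegalArgumentException("not a MOVED redirection: " + error);
        }
        int colon = parts[2].lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing port in: " + error);
        }
        return new Redirect(Integer.parseInt(parts[1]),
                parts[2].substring(0, colon),
                Integer.parseInt(parts[2].substring(colon + 1)));
    }

    public static void main(String[] args) {
        System.out.println(slotFor("foo")); // 12182
        Redirect r = parseMoved("MOVED 3999 127.0.0.1:6381");
        System.out.println(r.host + ":" + r.port); // 127.0.0.1:6381
    }
}
```

Note that the host in the redirection is whatever address the server knows the other node by. That is why a loopback address there sends the client looking for Redis on its own machine.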
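The environment-aware wiring mentioned for the Jedis client classes could look roughly like the sketch below. This is not our actual jar: `CacheClient`, `InMemoryCacheClient`, and `CacheWiring` are hypothetical names, and the in-memory class is only a stand-in so the example runs without a Redis server or the Jedis jar. In a real setup the two suppliers would return thin adapters around a standalone Jedis client and a cluster client respectively.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// The only cache operations the application code is allowed to see (hypothetical interface).
interface CacheClient {
    String get(String key);
    void set(String key, String value);
}

// Stand-in implementation; a real adapter would delegate to a Jedis or cluster client.
final class InMemoryCacheClient implements CacheClient {
    private final Map<String, String> store = new HashMap<>();
    private final String label;

    InMemoryCacheClient(String label) {
        this.label = label;
    }

    String label() {
        return label;
    }

    @Override
    public String get(String key) {
        return store.get(key);
    }

    @Override
    public void set(String key, String value) {
        store.put(key, value);
    }
}

final class CacheWiring {
    // Builds the cluster-backed client only in production; everything else gets the standalone one.
    static CacheClient select(String environment,
                              Supplier<CacheClient> standalone,
                              Supplier<CacheClient> cluster) {
        return "production".equals(environment) ? cluster.get() : standalone.get();
    }

    public static void main(String[] args) {
        CacheClient cache = select("development",
                () -> new InMemoryCacheClient("standalone"),
                () -> new InMemoryCacheClient("cluster"));
        cache.set("greeting", "hello");
        System.out.println(cache.get("greeting")); // hello
    }
}
```

Using suppliers means the cluster client is never constructed on a developer laptop, so no cluster has to be running there.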
Although it was not the intention, while reading what I wrote I realized that the points above do look like a rant. In spite of them, Redis is a solid, fast cache store, and I love it for that. These are merely a few nuisances and related implications that we learnt about and experienced in our use of Redis cluster. Please use them only as points to ponder when designing your application. Also, these nuisances are based on the state of Redis and Redis cluster at the time of writing, which will change over time.