If you are running in a VM or a container, you get the following types of storage:
Network-attached durable storage. Even if your VM or container moves from one physical host to another, your drive is guaranteed to follow without losing committed data. This is typically what databases use for their on-disk data. The downside is that, because the storage is network-attached, a regular disk write becomes a network hop plus a disk write.
Local ephemeral storage. This drive is local, but it is "reset" every time the VM or container moves to another physical host. Thus, beyond some limited uses, most disk-based databases let this space go to waste. Redis with Redis Enterprise Flash has a unique way of utilizing the ephemeral drive instead of wasting it. Let's explore the architecture.
Redis and Redis Enterprise Architecture
Redis is a high-speed, in-memory database with a "structures"-based data model that lets you express your problem to the database with great flexibility (read more at Redis data types). Redis Enterprise is a distributed, highly available, and scalable database platform that is built on Redis. Redis Enterprise is 100% compatible with Redis, so just change your connection string and you are good to go.
Redis Enterprise scales Redis in a few ways. You can find those details here. I'll focus on how Redis Enterprise uses the local ephemeral drive to extend RAM to scale data size.
With Redis Enterprise Flash, a Redis database allocates a memory quota that spans RAM plus the local ephemeral drive. Values stored in the database are spread over RAM and the ephemeral drive. Redis Enterprise Flash smartly places frequently accessed values in RAM and less frequently accessed values on the ephemeral drive. To get as close to RAM speeds as possible, it is best to use instances with SSD-based ephemeral drives.
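To make the placement idea concrete, here is a toy Python sketch of hot/cold tiering under an LRU policy. It is purely illustrative and is not how Redis Enterprise Flash is actually implemented; the `TieredStore` class and its eviction rule are my own inventions for this sketch.

```python
from collections import OrderedDict

class TieredStore:
    """Toy model of RAM-plus-flash value placement (illustrative only).
    Hot values live in RAM; when the RAM quota is exceeded, the
    least-recently-used value is demoted to a 'flash' tier."""

    def __init__(self, ram_slots):
        self.ram_slots = ram_slots
        self.ram = OrderedDict()   # hot values, most-recently-used last
        self.flash = {}            # cold values

    def set(self, key, value):
        self._place(key, value)

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)   # RAM hit: refresh recency
            return self.ram[key]
        value = self.flash.pop(key)     # RAM miss: promote from flash
        self._place(key, value)
        return value

    def _place(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        if len(self.ram) > self.ram_slots:           # quota exceeded:
            cold_key, cold_val = self.ram.popitem(last=False)
            self.flash[cold_key] = cold_val          # demote LRU value

store = TieredStore(ram_slots=2)
for i in range(4):
    store.set(f"k{i}", i)   # k0 and k1 spill to flash; k2, k3 stay hot
store.get("k0")             # touching k0 promotes it back to RAM
```

The design point to notice: RAM capacity is a tunable quota, and values migrate between tiers based on access recency, so frequently used values gravitate to RAM on their own.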
Redis Enterprise Flash is built for the ephemeral drive. If the VM or container moves and the ephemeral drive is reset, all is well: Redis Enterprise maintains a durable copy on disk and a replicated copy on another node, and it can simply re-populate from these sources.
It is easy to set up a Redis Enterprise cluster. You can find the instructions using Docker on macOS or Windows. At the last step, instead of creating a regular database, you can create a Flash-based database.
In my case, I am using a MacBook to run a three-node Redis Enterprise Pack cluster. I configured a simple 10GB quota: 1GB of RAM plus 9GB on the ephemeral drive, meaning I consume only 1GB of RAM to store 10GB of data in Redis.
As I populate data (in this case, with 1KB values), data lands first in RAM. As I run out of the first GB, additional data gets pushed to the ephemeral drive. In the picture below, the left-side stat shows the number of values in RAM and the right-side stat shows the total count of values in flash over a minute.
For transparency, I use the memtier benchmark tool to run the data load with the following arguments:
./memtier_benchmark --pipeline=100 -n allkeys --ratio=1:0 --data-size=1024 --key-prefix A --key-minimum=1 --key-maximum=3000000 --key-pattern P:P -c 2 -t 2 -h 10.0.0.2 -p 12000
Many workloads we look at here at Redis Labs have a pattern in which not all keys and values are accessed with the same frequency. Most exhibit a "hot working set": the more frequently accessed portion of the data. As it turns out, keeping the RAM-to-Flash ratio such that your hot working set fits in RAM provides the best latency characteristics when using Flash. To help find this ratio, Redis Enterprise Flash provides another stat: the RAM hit ratio, the percentage of accesses for which the value was found in RAM. This stat is similar to the buffer cache hit ratio you may be familiar with from disk-based databases. Keeping this value high keeps latencies low. Over time, however, the working set may change.
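As a back-of-the-envelope illustration of why sizing RAM to the hot working set matters, here is a toy Python sketch that replays a skewed access trace against an LRU-managed RAM tier and reports the hit ratio. The key counts, the 90/10 skew, and the `ram_hit_ratio` helper are assumptions for this sketch, not Redis Enterprise internals.

```python
import random
from collections import OrderedDict

def ram_hit_ratio(ram_slots, accesses):
    """Replay an access trace against an LRU-managed RAM tier of
    `ram_slots` values and return the fraction served from RAM."""
    ram, hits = OrderedDict(), 0
    for key in accesses:
        if key in ram:
            hits += 1
            ram.move_to_end(key)        # RAM hit: refresh recency
        else:
            ram[key] = True             # miss: promote from flash
            if len(ram) > ram_slots:
                ram.popitem(last=False)  # demote the LRU value
    return hits / len(accesses)

random.seed(7)
# 10,000 accesses over 1,000 keys, skewed so a small "hot working set"
# of 100 keys receives 90% of the traffic.
trace = [random.randrange(100) if random.random() < 0.9
         else random.randrange(100, 1000) for _ in range(10_000)]

small = ram_hit_ratio(ram_slots=50, accesses=trace)
large = ram_hit_ratio(ram_slots=150, accesses=trace)  # hot set fits
print(f"RAM hit ratio: {small:.2f} vs {large:.2f}")
```

Once the RAM tier is large enough to hold the hot working set, the hit ratio jumps, which is exactly the sizing signal the RAM hit ratio stat gives you.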
With the following graph, you can see the RAM hit ratio and latency. Please ignore the absolute latency values: the test was run on a laptop with an overbooked CPU, running all three nodes plus the load generator under heavy paging. The general idea is there, however. The graph shows how easy it is to adjust the RAM-to-Flash ratio, so you can get to lower latencies by simply allocating more RAM to your database.
The memtier benchmark options looked like this:
./memtier_benchmark --pipeline=100 -n allkeys --ratio=2:8 --data-size=1024 --key-prefix A --key-minimum=1000000 --key-pattern G:G --key-maximum=2000000 --key-stddev=180000 --distinct-client-seed --randomize -c 2 -t 2 -x 10
In the picture above, I first ran a steady workload over a set of keys that achieved roughly an 85% RAM-hit ratio. However, I wanted lower latencies. On the left graph, you see the rise in the RAM-hit ratio; that is the point where I changed my RAM size from 1GB to 2GB. With the additional RAM, more values were moved into RAM over time. To change the ratio, I simply moved the database slider from 10% to 20% on the database configuration page in the UI. It takes a few seconds to settle, but the trend is easy to see: latencies fall as the RAM-hit ratio increases, with no downtime required!
There are a few other important reasons why the RAM-plus-ephemeral-drive approach works well. The ephemeral drive contains only values that do not fit in RAM. That means hot keys and values that receive repeated writes cause no repeated IO to the ephemeral drive, which saves the drive's IO bandwidth for real RAM faults. To make writes durable, disk-based databases maintain write-ahead logs (WAL) or redo logs to protect against data loss. This causes write amplification: each value write produces several additional physical writes as structures like the WAL are maintained. Redis Enterprise Flash does not need to maintain a WAL on the ephemeral drive and thus does not suffer from this type of write amplification.
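A toy comparison makes the write-amplification point concrete. The sketch below is my own simplified model, not Redis Enterprise code: a WAL-style engine pays at least two physical writes per update (log append plus data page), while repeated updates to a RAM-resident hot key cost zero writes to the ephemeral drive.

```python
def flash_writes_wal_style(updates):
    """Toy model: a WAL-based engine appends every committed update to
    its log and later writes the data page, so each logical write costs
    at least two physical writes (illustrative only)."""
    return 2 * len(updates)

def flash_writes_ram_first(updates, hot_keys):
    """Toy model of the RAM-plus-flash approach described above:
    updates to RAM-resident values touch the ephemeral drive zero
    times; only cold values are written to flash (illustrative only)."""
    return sum(1 for key in updates if key not in hot_keys)

# 1,000 repeated updates to one hot key, plus two cold-key updates.
updates = ["counter"] * 1_000 + ["archive:1", "archive:2"]
wal_io = flash_writes_wal_style(updates)
tier_io = flash_writes_ram_first(updates, hot_keys={"counter"})
```

Under this model, the WAL-style engine issues 2,004 physical writes for the same workload that costs the tiered approach only 2 writes to the ephemeral drive, which is why the drive's bandwidth stays available for genuine RAM faults.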
Obviously, letting your ephemeral drive sit idle is wasted money. In fact, it is 80% cheaper in infrastructure costs to use Redis Enterprise Flash (detailed cost comparisons can be found here).