Beyond Keys and Values: Structuring Data in Redis
In this guide, we focus on designing and choosing the right data structures for Redis to build an efficient, scalable and maintainable caching layer.
Redis is a well-known, open source, in-memory data store. By design, it prioritizes speed, which makes reads exceptionally fast.
Most of us are familiar with various caching techniques such as Cache-Aside, Write-Through, Write-Behind, and Read-Through.
Designing a caching layer is no piece of cake. It takes several iterations to get things working correctly.
Regardless of which strategy we choose, a well designed data structure is key to building a reliable and scalable architecture.
So, where do we spend most of our time when designing a caching layer?
In other words, what does the bigger picture look like?
- What data should be cached?
- How much data should be cached?
- How long should the data remain cached?
- How can we maintain data integrity between the database and cache?
- How should we gracefully handle cache misses?
What are we going to focus on today? Let’s take a closer look at the finer details.
We will focus on designing Redis keys and choosing the right data structures to build an efficient, scalable, and maintainable caching layer.
We will dive into specific design considerations, including:
- Choosing the right Redis data structure (String, Hash, List, Set, RedisJSON, etc.)
- How to structure Redis keys effectively
- Evaluating memory efficiency, update frequency, and atomicity
- Supporting advanced operations like sorting, searching, and filtering
- Trade-offs between simplicity, flexibility, and performance
Choosing the Right Redis Data Structure
The choice of Redis data structure directly impacts the efficiency of the cache, its scalability, and how well the system adapts to changes in the application.
As engineers, we know we can make things work. But what sets a good engineer apart is how effective and efficient the solution is.
Redis supports a variety of data structures, including:
- Strings
- Hashes
- Lists
- Sets
- Sorted Sets
- RedisJSON, etc.
Factors Influencing the Choice of Data Structure
| Factor | Questions to Consider |
|---|---|
| Source Data | What is the structure of the raw data? |
| Destination Format | What does the API response look like? Is the cache directly serving this response? |
| Variable Components | How often does the data change? Which parts of the data change? Do we support partial or atomic updates? |
| Required Operations | What operations are needed? Sorting? Searching? Filtering? |
How to Choose the Data Structure That Works
Where Does Your Cache Sit In the Architecture?
If the cache sits close to the UI layer (e.g., acting as a direct source for the API response), it makes sense to store the data in a format that closely matches the expected API response. This minimises transformation overhead and improves response time.
For example, consider an API that lists all the hotels in the Barnfield area, with the number of vacant rooms in each hotel:
{
  "Hotels": [
    {
      "Id": 101,
      "Name": "Holiday Inn",
      "Rooms": 15,
      "Occupied": 10,
      "Vacant": 5
    },
    {
      "Id": 102,
      "Name": "Seaside View Villa",
      "Rooms": 10,
      "Occupied": 9,
      "Vacant": 1
    },
    {
      "Id": 103,
      "Name": "Greenwood Inn",
      "Rooms": 20,
      "Occupied": 9,
      "Vacant": 11
    }
  ]
}
Depending on the use case, Redis provides multiple ways to structure and store this data:
Option 1: Cache the entire response as a single Key-Value Pair
Key -> “barnfield_hotels”
Value -> A list of hotel Objects
Pros: Simple to retrieve
Cons: Entire value must be updated even if one hotel value changes
Option 2: Cache each hotel value as a separate Key-Value Pair (String-JsonString)
Key -> “103”
Value -> '{"Id": 103, "Name": "Greenwood Inn", "Rooms": 20, "Occupied": 9, "Vacant": 11}'
Pros: Each hotel value can be updated individually, better control
Cons: No built-in support for partial updates
Option 3: Cache each hotel value as RedisJSON
Key -> “103”
Value ->
{
  "Id": 103,
  "Name": "Greenwood Inn",
  "Rooms": 20,
  "Occupied": 9,
  "Vacant": 11
}
Pros: Finer-grained control; supports partial updates, nested data structures, and querying
Cons: Requires the RedisJSON module and consumes a little more memory
Option 4: Use Redis Hashes
Key -> “103”
Value -> Name "Greenwood Inn" Rooms 20 Occupied 9 Vacant 11
Pros: Memory efficient, allows field-level updates
Cons: Limited to flat data structures
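To make the difference between Option 2 and Option 4 concrete, here is a minimal Python sketch of how the same hotel record could be prepared for each. The helper names `as_json_string` and `as_hash_fields` are hypothetical, and leaving `Id` out of the hash fields is an assumption based on the Option 4 example above, where the ID lives in the key.

```python
import json

hotel = {"Id": 103, "Name": "Greenwood Inn", "Rooms": 20, "Occupied": 9, "Vacant": 11}


def as_json_string(record):
    """Option 2: serialize the whole record into one opaque string value."""
    return json.dumps(record)


def as_hash_fields(record):
    """Option 4: flat field/value pairs; the Id is assumed to live in the key."""
    fields = {}
    for name, value in record.items():
        if isinstance(value, (dict, list)):
            # A hash has no nesting, so nested values need another structure.
            raise ValueError(f"{name} is nested; a hash only holds flat fields")
        if name != "Id":
            fields[name] = value
    return fields


print(as_json_string(hotel))
print(as_hash_fields(hotel))
```

The `ValueError` branch is where the "limited to flat data structures" drawback shows up: a record with a nested address or amenities list would need RedisJSON or a separate key.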
Each data structure in Redis has its own advantages:
- Storing as a string probably takes the least memory. It is easy and simple for small payloads.
- Storing the whole response as a single list value makes lookup easy, especially if the API has few sorting or filtering requirements. A single operation returns the entire list.
- Storing as JSON provides more flexibility. It allows partial updates, supports nested data structures, and offers a powerful query language.
How Often Does the Data Get Updated? Is It Atomic? Or Partial?
Understanding how frequently the data changes and whether those updates are partial or atomic plays a key role in choosing the right data structure.
In this example, the properties “Occupied” and “Vacant” are likely to change frequently.
Storing the entire response as a single list value would require us to:
- Get the complete list
- Iterate through it to find the specific hotel
- Modify the relevant fields
- Store the entire list back
This is inefficient and error-prone, especially for high-frequency updates, because two concurrent writers can overwrite each other's changes.
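The four steps above can be sketched in Python as follows. This is only an illustration: `client` is assumed to be any object exposing redis-py style `get` and `set` methods, `update_hotel_in_list` is a hypothetical helper, and deriving `Vacant` as `Rooms - Occupied` is an assumption that happens to match the sample data.

```python
import json


def update_hotel_in_list(client, list_key, hotel_id, occupied):
    """Read-modify-write of the whole cached response for one hotel change."""
    raw = client.get(list_key)                    # 1. get the complete list
    if raw is None:
        return False                              # cache miss, nothing to update
    payload = json.loads(raw)
    for hotel in payload["Hotels"]:               # 2. find the specific hotel
        if hotel["Id"] == hotel_id:
            hotel["Occupied"] = occupied          # 3. modify the relevant fields
            hotel["Vacant"] = hotel["Rooms"] - occupied  # assumed relationship
            client.set(list_key, json.dumps(payload))    # 4. store the list back
            return True
    return False
```

Every call moves the full payload over the network twice, and nothing here stops another writer from storing its own copy between the `get` and the `set`.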
Better Alternatives are:
1. Storing it as a Hash:
HSET hotel:101 Occupied 12 Vacant 3
Allows direct update of individual fields
Fast, memory efficient, ideal for flat structures
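From application code, the same field-level update might look like the sketch below. It assumes a redis-py style client (`hset` with a `mapping` argument, and `hgetall`); the helper names and the `NUMERIC_FIELDS` set are hypothetical.

```python
# Fields that should be converted back to integers when read (an assumption
# based on the hotel example in this article).
NUMERIC_FIELDS = {"Rooms", "Occupied", "Vacant"}


def update_occupancy(client, hotel_id, occupied, vacant):
    """Write only the two changing fields; Name and Rooms are left untouched."""
    client.hset(f"hotel:{hotel_id}", mapping={"Occupied": occupied, "Vacant": vacant})


def read_hotel(client, hotel_id):
    """Read the hash back, converting numeric fields from strings or bytes."""
    raw = client.hgetall(f"hotel:{hotel_id}")
    hotel = {}
    for field, value in raw.items():
        field = field.decode() if isinstance(field, bytes) else field
        value = value.decode() if isinstance(value, bytes) else value
        hotel[field] = int(value) if field in NUMERIC_FIELDS else value
    return hotel
```

One thing to keep in mind with hashes is that every value is stored as a string, so the reader is responsible for restoring types such as integers.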
2. Using Redis JSON:
JSON.SET hotel:101 $.Occupied 12
JSON.SET hotel:101 $.Vacant 3
Supports partial updates
Powerful, flexible, great for nested or complex structures
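As a rough Python equivalent of the two `JSON.SET` commands, the sketch below assumes a redis-py style client whose `json()` accessor exposes `set(name, path, obj)`, and a server with the RedisJSON module loaded. The helper name `update_occupancy_json` is hypothetical.

```python
def update_occupancy_json(client, hotel_id, occupied, vacant):
    """Update two paths inside the JSON document without rewriting the rest."""
    key = f"hotel:{hotel_id}"
    doc = client.json()
    doc.set(key, "$.Occupied", occupied)  # JSON.SET hotel:<id> $.Occupied <n>
    doc.set(key, "$.Vacant", vacant)      # JSON.SET hotel:<id> $.Vacant <n>
```

Note that these are two separate commands, so a reader could briefly observe the new `Occupied` value with the old `Vacant` value. If that matters, the pair would need to be sent as one transaction.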
Think About the Memory Usage
In the above example, both RedisJSON and Hashes are good options, but for simple, flat data like this, Hashes are the better choice.
They are more memory efficient
They offer faster read/write operations
Operations on Data
Do you need to sort and filter the data frequently?
Option 1: Use Sorted Sets for Sorting
- Store the main data as a Hash
- Maintain a Sorted Set for each sortable field.
For example:
Create a sorted set of hotel IDs based on room count
ZADD hotels_by_rooms 15 "101" 10 "102" 20 "103"
Get the list of hotel IDs (keys), sorted by room count
ZRANGE hotels_by_rooms 0 -1 WITHSCORES
Do a look up using the IDs and get the actual data from the corresponding Hash
This is efficient for simple filters and sorting.
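Put together, the Hash plus Sorted Set pattern could be sketched like this in Python. It assumes a redis-py style client (`hset`, `hgetall`, `zadd` with a member-to-score mapping, and `zrange`); the helper names and the `hotel:<id>` key format are illustrative assumptions.

```python
def index_hotel(client, hotel_id, fields):
    """Store the hotel as a Hash and index its ID by room count."""
    client.hset(f"hotel:{hotel_id}", mapping=fields)
    client.zadd("hotels_by_rooms", {str(hotel_id): fields["Rooms"]})


def hotels_sorted_by_rooms(client):
    """Return (id, hash) pairs ordered by ascending room count."""
    hotels = []
    for hotel_id in client.zrange("hotels_by_rooms", 0, -1):
        if isinstance(hotel_id, bytes):
            hotel_id = hotel_id.decode()
        # One lookup per ID; the Sorted Set only holds the ordering.
        hotels.append((hotel_id, client.hgetall(f"hotel:{hotel_id}")))
    return hotels
```

The trade-off is that the Sorted Set is a second copy of the sort field, so any write that changes `Rooms` has to update both the Hash and the index.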
Option 2: Use RediSearch for advanced Queries
RediSearch is a powerful, high-performance module for querying, indexing, and full-text search.
Use RediSearch when the application requires:
- Complex filtering and sorting logic
- Full Text Search
What’s Next?
Next, we will take a closer look at:
- Redis Key Design and Cache Invalidation: best practices for naming, organizing, and managing keys at scale
- Sorting, Filtering, and Searching: how to efficiently support these operations using Sorted Sets, RediSearch, and more
Stay tuned!