Working With Sorted Sets in Redis
Using sorted sets allows you to easily increment views or comments, which are then stored in a sorted format so that it's cheap and quick to get the data when we need it.
I work with a bunch of datastores and I probably shouldn't have favorites — but if I did, Redis would be one of them! I used the sorted sets for something I was building and remembered how much I liked the feature so here's a quick primer on what it is and when it can be handy.
Use Cases for Redis Sorted Sets
I'm a web developer, and the sorted sets feature helps so much with those "most viewed," "most popular," and "most commented" lists that are so ubiquitous in sidebars on "social" sites. The problem is that the query to get all the items, work out how many views or comments they each had, and rank them can be rather expensive, especially if it runs on every page. Using sorted sets allows you to easily increment the count of views or comments, and these are then stored in an already-sorted format so that it's very cheap and quick to get that data when we need it. Redis can be configured to be more or less persistent and is usually used in the blazing-fast-but-may-lose-data style. Does it matter if a few views don't get counted? In most cases, it doesn't, which is one reason why this is such an elegant solution.
Store a Set of Data
Many of Redis' commands have a prefix to indicate which data type is in use, and for sorted sets, this is a z. When we want to add a count to an item, we use:
zincrby [set_name] [amount] [key]
So for a shopping site that lists its most-viewed products, I would record a view by using this command on the page that displays the product:
zincrby product_views 1 hat
If the product_views set doesn't exist, Redis will silently create it. Similarly, if the key doesn't already exist, it will be created with a score of zero and then incremented appropriately. Add this operation into your application at the point where the thing you want to count happens, bearing in mind that it may make more sense to use a key that's a primary key for the item! You'll want to look up the product details to display this list (to go faster: cache product details in Redis, too).
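From application code, the same increment might look like the sketch below. It assumes the redis-py client library, whose zincrby(name, amount, value) method mirrors the command above. The record_view helper and its parameter names are hypothetical, chosen only for illustration.

```python
def record_view(client, product_key, set_name="product_views"):
    """Add one view to product_key and return its new score.

    `client` is assumed to be a redis-py client, e.g. redis.Redis().
    """
    # ZINCRBY creates the set and the member if either is missing,
    # so no separate "create" step is needed.
    return client.zincrby(set_name, 1, product_key)
```

Calling record_view(client, "hat") from the product page handler would issue the same zincrby product_views 1 hat command shown earlier.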
Fetching the Most-Viewed Items
Now that we've incremented a value each time an item was viewed or commented on, we can swiftly retrieve the ones with the highest counts. The key here is that Redis stores the data already sorted, so there is no overhead of getting things in the right order when you request the data, and this makes for speedy performance.
The command we want is zrevrange. The z is because it's a sorted set, the rev is because we want the results backward, with the highest-scoring item first, and the range is to get some or all of the elements from the set.
To get the three highest-scoring elements:
zrevrange product_views 0 2
The two extra arguments are start and stop, so this command returns the items at indexes 0, 1, and 2: three in total. This command also takes an optional final argument, WITHSCORES, which will return two elements for each item: the key and also the score.
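As a rough sketch of the same lookup from application code, again assuming redis-py: its zrevrange(name, start, end, withscores=False) method takes the same inclusive start and stop indexes, and with withscores=True it returns (key, score) pairs instead of bare keys. The top_viewed helper is a hypothetical name.

```python
def top_viewed(client, count=3, set_name="product_views"):
    """Return up to `count` (key, score) pairs, highest score first.

    `client` is assumed to be a redis-py client, e.g. redis.Redis().
    """
    if count < 1:
        # A stop index of -1 would mean "to the end of the set",
        # so handle a zero or negative count explicitly.
        return []
    # start and stop are inclusive, so the top 3 items are indexes 0..2.
    return client.zrevrange(set_name, 0, count - 1, withscores=True)
```

With a default redis-py client the keys come back as bytes; creating the client with decode_responses=True returns strings instead. On Redis 6.2 and later, ZREVRANGE is marked as deprecated in favor of ZRANGE with the REV option, so newer code may prefer that form.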
When The Data Isn't There
When the system has just started up, or if you periodically empty your most-viewed lists, then sometimes there won't be enough items in this set (or there could be none at all!). Essentially this is a "cache miss": we went to grab something and it wasn't there. When implementing a solution like this one, it's good to have a fallback option for when the data isn't currently available. For example, you might just run a query to pick three random products from the database, or use three featured products instead. Bear in mind this possible outcome and remember always to test your systems with an empty Redis once in a while!
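One way to handle the short or empty set is sketched below, under a few assumptions: a redis-py client created with decode_responses=True (so keys are strings), and a hypothetical fallback callable that returns a list of product keys, such as featured products from the database. This version tops up a short list rather than replacing it outright, which is a design choice and not the only reasonable one.

```python
def most_viewed_or_fallback(client, fallback, count=3, set_name="product_views"):
    """Return up to `count` product keys, topping up from `fallback` on a miss.

    `client` is assumed to be a redis-py client with decode_responses=True.
    `fallback` is a hypothetical zero-argument callable returning a list of keys.
    """
    if count < 1:
        return []
    keys = list(client.zrevrange(set_name, 0, count - 1))
    if len(keys) < count:
        # "Cache miss": Redis had too few items (possibly none).
        # Fill the gap, skipping anything Redis already gave us.
        extras = [k for k in fallback() if k not in keys]
        keys.extend(extras[: count - len(keys)])
    return keys
```

Because the fallback is only called when the set comes up short, the extra database query is avoided in the common case where Redis already holds enough items.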
Are you solving this pattern a different way, or have a tip to share that I didn't mention? I'd love to hear from you in the comments!
Published at DZone with permission of Lorna Mitchell, DZone MVB.