Redis Multi Key Operations On A Sharded Cluster

Planning on relying on sharding Redis to scale horizontally and save money?
Then you had better make sure your “multi key” operations will work with it.
I have a handful of client projects that are digging deep into sharded Redis clusters right now, and one thing I constantly have to design around is “multi key” commands.
What Are “Multi Key” Commands?
This is any command that operates on two or more separate keys at once.
A simple example, familiar to anyone who knows how arrays work in JavaScript or most other popular programming languages, is a “union” (SUNION in Redis). It is similar to JavaScript's “concat”, except that the result is a set, so duplicate values are dropped. You pass it two or more keys and it returns the combined values of those keys.
The problem is: what if those keys are on different shards? Due to the nature of sharding, the request would have to cross shards, which would be extremely inefficient. Redis Cluster does not even attempt it and instead rejects the command with a CROSSSLOT error.
Sharding only works when, at the moment you make the request, you already know which shard to send it to.
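To make the "know which shard to send it to" part concrete, here is a rough sketch of how a client can work that out. Redis Cluster maps each key to one of 16384 hash slots using CRC16 of the key modulo 16384, and each shard owns a range of slots. The three shard names and their slot ranges below are made up for illustration; a real cluster reports its own layout. The sketch also ignores hash tags, so it only applies to keys without curly braces.

```python
import binascii

TOTAL_SLOTS = 16384  # fixed number of hash slots in Redis Cluster

# Hypothetical three-shard layout (assumption for this example only).
SHARD_SLOT_RANGES = {
    "shard-a": range(0, 5461),
    "shard-b": range(5461, 10923),
    "shard-c": range(10923, 16384),
}


def key_slot(key: str) -> int:
    """CRC16 (XMODEM variant) of the key, modulo 16384. Hash tags are ignored here."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % TOTAL_SLOTS


def shard_for(key: str) -> str:
    """Look up which (hypothetical) shard owns the key's slot."""
    slot = key_slot(key)
    for name, slots in SHARD_SLOT_RANGES.items():
        if slot in slots:
            return name
    raise ValueError(f"slot {slot} is not assigned to any shard")


for key in ("size:12", "color:red"):
    print(key, "-> slot", key_slot(key), "on", shard_for(key))
```

Each key is routed on its own, so nothing guarantees that two unrelated keys end up on the same shard.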
Use Cases:
Luckily for me, I don’t have a ton of use cases for union at the minute, but I use its sibling multi-key command, intersect (SINTER), all the time, mainly for the massive search engines I am tasked with building.
The basic use case is as follows: A user comes to my client’s website and wants to search through their millions of products. The user wants all the shoes in “Size 12” and the “Color Red”.
We have a key in Redis, size:12, that contains the Product IDs of every size 12 shoe that can be searched. We also have a key, color:red, that contains the Product IDs of every red shoe.
I want only the Product IDs that show up in both, and SINTER would be the perfect command for that… unless we are on a sharded cluster and those two keys are on different shards.
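As a quick illustration of what the intersection returns, here is the same idea with plain Python sets standing in for the two Redis keys. The Product IDs are invented for the example.

```python
# Hypothetical Product IDs; in Redis these would be the members of two sets.
size_12 = {101, 205, 318, 422}    # stands in for the key size:12
color_red = {205, 422, 777}       # stands in for the key color:red

# SINTER size:12 color:red returns only the members present in both sets.
matches = size_12 & color_red
print(sorted(matches))  # [205, 422]
```

On a single Redis instance the server does this work for us in one round trip, which is exactly what we lose when the two keys live on different shards.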
Exceptions To This:
One interesting exception from the Redis Cluster Specifications says the following:
Commands performing complex multi-key operations like set unions and intersections are implemented for cases where all of the keys involved in the operation hash to the same slot.
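By “slot” they mean a hash slot, which is not quite the same thing as a shard: Redis Cluster divides the key space into 16384 slots and each shard owns a subset of them, so keys that hash to the same slot always live on the same shard. I want to play with this in the near future.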
This means that if I were building an e-commerce store for clothes and we only ever let people search for one type of clothing at a time (e.g. just a shoe search, or just a pants search), we could manipulate the hash that determines which shard an item is stored on, forcing all shoes onto Shard A and all pants onto Shard B.
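The mechanism Redis Cluster provides for this is the hash tag: if a key contains a non-empty section wrapped in curly braces, only that section is hashed. Below is a minimal sketch of that rule as I understand it from the cluster specification. The key names such as {shoes}:size:12 are my own naming assumption, not something from a real project.

```python
import binascii


def hash_tag(key: str) -> str:
    """Return the part of the key that Redis Cluster actually hashes."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """CRC16 (XMODEM variant) of the hashed part, modulo 16384."""
    return binascii.crc_hqx(hash_tag(key).encode("utf-8"), 0) % 16384


# Hypothetical key names: the tag groups every shoe key into one slot.
shoe_keys = ["{shoes}:size:12", "{shoes}:color:red"]
pants_keys = ["{pants}:size:32", "{pants}:color:blue"]
for key in shoe_keys + pants_keys:
    print(key, "-> slot", key_slot(key))
```

Every key tagged {shoes} lands in the same slot, so SINTER across them is allowed. Note that the tag only pins keys to a slot; which shard owns that slot is still up to the cluster, so two different tags could end up on the same shard.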
This approach somewhat defeats the purpose of hashed sharding, which is meant to give the most balanced distribution of records across the shards. That is especially true if you had 9 million shoes and only 1 million pants: the shoes shard would be doing 90% of the work and the pants shard would barely be doing anything.
This analogy got weird once I started talking about “Pants Sharding”. I am curious if Google will knock me in the SEO rankings for this post.
Wrapping It Up:
Are you using Redis’s Clustering/Sharding capabilities? What limitations have you run into?