Consistent Hashing Explained
Table of Contents + −
Let me paint a picture you’ll recognize:
- You’ve got a cache spread across many servers, so your app is fast because most requests find their answer already sitting in memory.
- Then traffic grows, so you add one more server to share the load. Sounds like a good thing, right?
- But suddenly almost every request becomes a cache miss. The cache feels like it got wiped, and your database gets overloaded.
So what went wrong? You only added one server, you didn’t touch the data. The problem is in how keys get matched to servers. Let’s see why the simple way breaks, and how consistent hashing fixes it.
🎯 The Problem With Simple Hashing
First let’s define the simple approach, because it’s the one almost everyone reaches for. Say you have a key (like a user ID or a cache key) and you want to decide which server stores it.
- You run the key through a hash function, which is just a small piece of math that turns any input into a big number. Same input always gives the same number.
- Then you do
server = hash(key) % N, whereNis how many servers you have. The%is the modulo, which gives you the remainder, so the answer always lands between0andN - 1. - That remainder is the server number. So key
"alex"might land on server2, key"order42"on server0, and so on.
This works beautifully while N stays the same. The trouble shows up the moment N changes.
- Say you had
4servers and now you add one, soNgoes from4to5. - Every key gets recalculated with
% 5instead of% 4. Andhash(key) % 5is almost never the same ashash(key) % 4. - So almost every key now maps to a different server. The data is still there, but the app is looking in the wrong place, so it’s all misses.
This is called the rehashing problem. Changing N shuffles nearly all your keys at once, and that’s exactly what hurts when you add or remove a server.
Why this is so painful
When a cache server dies or you add one, modulo hashing can move almost all of your keys to new servers. That means a wave of cache misses, and every miss goes straight to your database. At scale, that sudden flood can knock the whole system over.
💍 What is Consistent Hashing
Consistent hashing is a smarter way to map keys to servers so that a change in the number of servers moves only a few keys, not all of them. Here’s the core idea.
- Imagine a circle, and we’ll call it the hash ring. Picture the numbers going all the way around it, like a clock face, and then wrapping back to the start.
- We place each server on this ring. You hash the server’s name, and that number decides its spot on the circle.
- We place each key on the same ring too, the same way. You hash the key, and that number is its spot.
- Now the rule for matching: a key is handled by the first server you hit going clockwise from the key’s spot. That’s it. Walk clockwise from the key until you bump into a server, and that server owns the key.
Here’s the ring with a few keys and servers placed on it. Each key follows the arrow clockwise to the next server.
Read that as a circle: keys and servers sit around the ring, and each key just rolls clockwise to the next server it meets. The key doesn’t care how many servers exist in total. It only cares about the next one along the circle.
⚙️ Why It Helps
So why does putting things on a ring make adding a server painless? Let’s walk through it.
- When you add a new server, it gets one spot on the ring. Going clockwise, it now sits in front of some keys that used to roll past it to the next server.
- Only those nearby keys move to the new server. Every other key still meets the same server it did before, so it stays exactly where it was.
- Removing a server is the same story in reverse. Only the keys that belonged to that one server roll forward to the next server clockwise. Everyone else is untouched.
So instead of reshuffling everything, a change only ripples through a small slice of the ring. Here’s the comparison side by side.
| Situation | Simple Modulo Hashing | Consistent Hashing |
|---|---|---|
| Add a server | Almost all keys move | Only keys near the new server move |
| Remove a server | Almost all keys move | Only that server’s keys move |
| Cache misses after a change | Huge spike, near total wipe | Small, only the moved keys miss |
| Safe to scale up or down | Risky, causes a flood | Yes, the impact stays small |
That last row is the whole point. Consistent hashing makes scaling up and down a calm, everyday operation instead of a scary event.
🧩 Virtual Nodes
The plain ring has a hidden catch. If you only place each server once, the servers can land in unlucky spots and split the ring unevenly.
- Imagine three servers, but two of them land right next to each other on the ring. The third one ends up owning a huge clockwise stretch, so it gets most of the keys.
- That overloaded server is called a hotspot, which just means one node doing way more work than the others. Not what we want.
The fix is virtual nodes. Here’s the idea.
- Instead of placing each real server once, you place it at many points around the ring. You do this by hashing names like
ServerA-1,ServerA-2,ServerA-3, and so on. - Each of those points is a virtual node, and they all point back to the same real server.
- With many small pieces scattered around the circle, each server ends up owning lots of little slices instead of one big chunk. So the keys spread out much more evenly.
Virtual nodes do double duty
Spreading each server across many points also makes failures gentler. When a server dies, its many small slices get shared among several other servers, instead of dumping one giant load onto a single neighbor.
🌍 Where It’s Used
This isn’t just theory. Consistent hashing quietly runs underneath a lot of systems you’ll meet.
- Distributed caches. Tools like Memcached and Redis clusters use it so cache servers can be added or removed without wiping everything. This connects straight back to Distributed Caching.
- Sharded databases. When data is split across many machines, consistent hashing decides which shard holds which key. That’s the heart of Database Sharding, where a shard is just one slice of the data on one machine.
- Cassandra and DynamoDB. These large distributed databases use a hash ring to spread data across nodes, which lets them grow to huge scale and survive nodes coming and going.
- Load balancers. Some load balancers use it to keep sending the same user to the same backend server, which is handy when that server is holding their session.
The common thread is the same everywhere: machines come and go, and you want that to disturb as little data as possible.
⚠️ Common Mistakes and Misconceptions
A few ideas trip people up here, so let’s clear them out.
- “Modulo hashing is fine, even at scale.” It’s fine while the number of servers never changes. But the second you add or remove one, it reshuffles almost everything, and at scale that flood of misses can take you down.
- “Consistent hashing balances perfectly on its own.” Not quite. A plain ring can put servers in clumps and create hotspots. You need virtual nodes to actually get an even spread.
- “When servers change, no keys move at all.” Some keys still move, that’s unavoidable. The win is that only a small share moves, not nearly all of them.
- “The hash ring is a real physical circle somewhere.” No, the ring is just a mental model. Underneath it’s numbers and a lookup, but picturing the circle is what makes the clockwise rule click.
🛠️ Design Challenge
Try this one on your own to test the idea.
You’re running a distributed cache with 4 servers using simple hash(key) % N. One server crashes at peak traffic.
- Roughly what fraction of your keys end up mapping to a different server now, and why?
- Now redesign it with a hash ring. If that same server crashes, which keys actually move, and where do they go?
- Finally, add virtual nodes to your design. Explain how that changes where the crashed server’s load lands.
Sketch the ring on paper and walk a few keys around it clockwise. Seeing the keys move (or not move) is the fastest way to make this stick.
🧩 What You’ve Learned
You can now explain why scaling a cache or database doesn’t have to wipe everything. Here’s what you’ve picked up.
- ✅ Simple
hash(key) % Nremaps almost every key whenNchanges, which is the rehashing problem. - ✅ Consistent hashing puts servers and keys on a circular hash ring, and each key goes to the next server clockwise.
- ✅ Adding or removing a server only moves the keys near it, not all of them.
- ✅ Virtual nodes place each server at many points so keys spread evenly and avoid hotspots.
- ✅ It powers distributed caches, sharded databases, and systems like Cassandra and DynamoDB.
Check Your Knowledge
Test what you learned. Pick an answer for each question, then click Check.
- 1
Why does simple hash(key) % N break when you add a server?
Why: Since the result depends on N, changing N reshuffles nearly all keys at once, which is the rehashing problem.
- 2
On a hash ring, which server owns a key?
Why: You walk clockwise from the key's position and the first server you meet owns that key.
- 3
Why are virtual nodes used in consistent hashing?
Why: Virtual nodes scatter each server across many small slices, which balances the load and softens failures.
- 4
When a server is removed from the ring, what happens to the keys?
Why: Only the keys owned by the removed server move to the next server clockwise; every other key stays put.
🚀 What’s Next?
Consistent hashing is the glue that makes scaling out safe. Next, see the systems that lean on it.
- Database Sharding shows how data gets split across many machines, and where the hash ring decides the split.
- Distributed Caching shows how caches stay fast across many servers, and why moving only a few keys matters so much.
Get those two down, and you’ll have a solid grip on how real systems grow without falling over.