Consistency in Distributed Systems
Here’s something you’ve probably seen happen:
- You change your profile photo on some app, and it updates right away on your screen.
- But your friend Alex opens the same app, and for a few seconds Alex still sees your old photo.
- A moment later, Alex refreshes, and now the new photo shows up too.
So what’s going on there? For those few seconds, two people were looking at the same thing and seeing two different answers. That gap is what this whole lesson is about. Whether every copy of your data agrees is called consistency, and it’s one of the trickiest parts of building any system that runs on more than one machine.
🎯 Why Consistency Is Tricky
The trouble starts the moment your data lives in more than one place. Let’s see why systems do that:
- A big app can’t keep all its data on a single machine. One machine can’t store or serve everything, and if it dies, your data goes with it.
- So the data gets copied onto several machines. Each copy is called a replica. A replica is just one full copy of your data sitting on its own machine.
- The act of making and updating those copies is called replication. Replication keeps the copies around so the system stays fast and survives a machine going down.
That sounds great, but here’s the catch:
- When you write something new, only one replica gets the update first.
- That new value then has to travel to all the other replicas, and that travel takes a little time.
- During that tiny window, some replicas have the new value and some still have the old one. Keeping every copy in agreement, all the time, is the hard part.
That little window is exactly why Alex saw your old photo. One replica had your new photo, but the replica Alex read from hadn’t gotten the update yet.
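To make that window concrete, here is a minimal sketch in Python. The `Replica` class, the helper functions, and the file names are all hypothetical, and real replication happens over a network rather than in a loop. The point is only that a write lands on one copy first, and a read from another copy in the meantime returns the old value.

```python
class Replica:
    """One full copy of the data on its own (simulated) machine."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


def write_to_first(replicas, new_value):
    """Apply a write to the first replica only; the others are not updated yet."""
    replicas[0].value = new_value


def propagate(replicas):
    """Copy the first replica's value to every other replica."""
    for replica in replicas[1:]:
        replica.value = replicas[0].value


replicas = [Replica(name, "old.jpg") for name in ("r1", "r2", "r3")]

write_to_first(replicas, "new.jpg")
your_view = replicas[0].value   # you read from the replica that took the write
alex_view = replicas[2].value   # Alex reads from a replica that has not caught up

propagate(replicas)             # the update finally reaches the other copies
alex_view_after_refresh = replicas[2].value
```

Between `write_to_first` and `propagate`, the two reads disagree. After `propagate`, every copy holds the same value again.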
✅ What is Consistency
So let’s pin down the word. In plain terms:
- Consistency means all the copies of your data show the same, up-to-date value.
- Put another way, when you read the data, you get back the latest thing that was written, not some old leftover value.
- If every replica agrees and a read always reflects the most recent write, your system is consistent.
When the copies disagree, even for a moment, that’s an inconsistency. Reading an old value during that gap has a name too: a stale read. Stale just means out of date.
One thing to keep straight
Consistency here is about copies of data agreeing with each other across machines. It’s not about your data following rules like “age can’t be negative”. That’s a different idea. In this lesson, consistency always means: do all the replicas show the same latest value?
Now, systems don’t all handle this the same way. Some insist that everyone always sees the latest value. Others are okay letting copies disagree for a short bit. Those two styles have names, so let’s look at each.
💪 Strong Consistency
The first style is the strict one. Here’s the deal:
- With strong consistency, every read always sees the latest write, no matter which replica you read from.
- Once a write finishes, every later read behaves as if all replicas already have it. Nobody ever sees an old value.
- So if you change your photo, the system makes sure Alex sees the new one immediately, never the old one.
That sounds perfect, so what’s the cost? It isn’t free:
- Before the system says “your write is done”, it has to make sure enough replicas have the new value. That coordination takes extra time, so writes and reads can feel slower.
- And if some replicas can’t be reached right now, the system might rather make you wait, or even refuse, than risk handing back an old value. So strong consistency can hurt availability, which just means the system being up and ready to answer you.
In short, strong consistency is safer because you never read a stale value, but it’s slower and a little more fragile when machines go missing.
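One common way to decide what “enough replicas” means is a quorum: with N replicas, require W acknowledgements per write and consult R replicas per read, with R + W > N so that every read overlaps the latest write. The sketch below assumes that rule. `QuorumStore`, `Unavailable`, and the version counter are hypothetical names, and the code is single-threaded, so it ignores concurrent writers and everything else a real database has to handle.

```python
class Unavailable(Exception):
    """Raised when too few replicas can be reached to answer safely."""


class Replica:
    def __init__(self):
        self.version = 0
        self.value = None
        self.reachable = True


class QuorumStore:
    """Hypothetical store: a write needs W replicas, a read consults R."""

    def __init__(self, n=3, w=2, r=2):
        if r + w <= n:
            raise ValueError("need r + w > n so reads overlap the latest write")
        self.replicas = [Replica() for _ in range(n)]
        self.w, self.r = w, r

    def _live(self):
        return [rep for rep in self.replicas if rep.reachable]

    def write(self, value):
        live = self._live()
        if len(live) < self.w:
            raise Unavailable("not enough replicas to confirm the write")
        version = max(rep.version for rep in live) + 1
        for rep in live:
            rep.version, rep.value = version, value

    def read(self):
        live = self._live()
        if len(live) < self.r:
            raise Unavailable("not enough replicas to answer safely")
        # Among the R replicas consulted, trust the one with the newest version.
        newest = max(live[: self.r], key=lambda rep: rep.version)
        return newest.value


store = QuorumStore()
store.write("old.jpg")

store.replicas[2].reachable = False   # one replica misses the next write
store.write("new.jpg")

store.replicas[2].reachable = True    # it comes back, still holding old.jpg
store.replicas[0].reachable = False
value = store.read()                  # the overlap still finds the new photo

store.replicas[1].reachable = False   # now only the lagging replica is left
try:
    store.read()
    refused = False
except Unavailable:
    refused = True
```

The last read is refused rather than answered with `old.jpg`. That refusal is the availability cost described above.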
🕒 Eventual Consistency
The second style is the relaxed one. Here’s how it works:
- With eventual consistency, the copies are allowed to disagree for a while, but once new writes stop arriving they’re guaranteed to catch up and end up with the same value. In practice that usually happens quickly, though the model itself promises no fixed deadline.
- The word “eventual” is the key. It doesn’t mean “maybe”. It means the replicas will converge, and converge just means they all settle on the same value after a delay that is usually brief.
- So when you change your photo, the new value is saved right away on one replica, and it quietly spreads to the others over the next moment.
What do you get, and what do you give up?
- You get speed and availability. The system answers you fast and stays up, because it doesn’t wait for every replica to agree before responding.
- You give up that instant agreement. For a short window, a read from a not-yet-updated replica gives you a stale value. That’s exactly the few seconds Alex saw your old photo.
So eventual consistency trades a brief stale read for a faster, more available system.
[Figure: a write spreading out from one replica to the rest]
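The spreading described above can be sketched as a queue of pending updates. This is a toy model under stated assumptions: `EventualStore`, its `pending` queue, and the single version counter are hypothetical, and conflicts are resolved by keeping the higher version, which is only one of several possible strategies.

```python
from collections import deque


class EventualStore:
    """Hypothetical store: acknowledge after one replica, spread the write later."""

    def __init__(self, n=3):
        self.replicas = [{"version": 0, "value": None} for _ in range(n)]
        self.pending = deque()   # updates still waiting to reach other replicas
        self.clock = 0

    def write(self, value):
        self.clock += 1
        update = {"version": self.clock, "value": value}
        self.replicas[0] = dict(update)            # saved on one replica right away
        for index in range(1, len(self.replicas)):
            self.pending.append((index, update))   # the rest are updated later
        return "ok"                                # answered without waiting

    def read(self, index):
        return self.replicas[index]["value"]       # may be stale

    def deliver_one(self):
        """Deliver one pending update, keeping whichever version is newer."""
        index, update = self.pending.popleft()
        if update["version"] > self.replicas[index]["version"]:
            self.replicas[index] = dict(update)

    def converged(self):
        return len({rep["version"] for rep in self.replicas}) == 1


store = EventualStore()
store.write("old.jpg")
while store.pending:
    store.deliver_one()

store.write("new.jpg")
stale = store.read(2)            # replica 2 has not received the update yet
while store.pending:
    store.deliver_one()
fresh = store.read(2)            # after delivery, every replica agrees
```

`write` returns before the other replicas have the value, which is where the speed comes from. The stale read on replica 2 is the price.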
⚖️ Strong vs Eventual
Let’s put the two side by side so the difference is easy to hold in your head.
| Aspect | Strong Consistency | Eventual Consistency |
|---|---|---|
| What it guarantees | Every read sees the latest write, always | Reads may be briefly stale, but copies catch up soon |
| Speed and availability | Slower, can refuse to answer when replicas are unreachable | Faster, stays up and answers even during trouble |
| Example use | Bank balance, payments, inventory count | Likes, view counts, social feeds |
🧩 Which to Choose
There’s no “better” one here. It depends on what your data is for. Ask yourself: would a stale read cause real harm?
- If a wrong-for-a-moment value could cost money or break things, go strong. Think banking, where seeing an old balance could let you spend money twice. Or inventory, where two people must not buy the last item at the same time.
- If a stale value for a second is totally fine, go eventual. Think likes on a post, a view count on a video, or a friend’s feed. Nobody gets hurt if a like count is off by one for a moment.
This choice ties straight back to the CAP theorem, which says that when machines can’t talk to each other (a network partition), you have to pick between staying consistent and staying available:
- Strong consistency leans toward the consistent side. It would rather wait or refuse than give a wrong answer.
- Eventual consistency leans toward the available side. It would rather answer fast, even if the answer is briefly stale.
So picking your consistency style is really you deciding which side of that trade-off your app needs.
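As a small illustration of that decision, the function below shows what a single replica might do with a read while it is cut off from its peers. The function name, the `prefer` parameter, and the `Partitioned` exception are hypothetical. Real systems expose this choice through configuration rather than a flag on each read.

```python
class Partitioned(Exception):
    """Raised by the consistency-first policy when replicas cannot talk."""


def read_during_partition(local_value, peers_reachable, prefer):
    """Return a value or refuse, depending on the preferred side of the trade-off.

    prefer="consistency": refuse rather than risk a stale answer.
    prefer="availability": answer from the local copy, which may be stale.
    """
    if peers_reachable:
        return local_value            # no partition, so no trade-off to make
    if prefer == "consistency":
        raise Partitioned("cannot confirm this is the latest value")
    if prefer == "availability":
        return local_value
    raise ValueError(f"unknown preference: {prefer!r}")


# The replica is cut off and only knows about the old photo.
answer = read_during_partition("old.jpg", peers_reachable=False, prefer="availability")

try:
    read_during_partition("old.jpg", peers_reachable=False, prefer="consistency")
    refused = False
except Partitioned:
    refused = True
```

With the same inputs, one policy answers with a possibly stale value and the other refuses. Neither is wrong. They serve different kinds of data.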
🌍 Real Examples
This isn’t just theory. Real systems make this choice every day:
- A bank’s core ledger uses strong consistency. When you transfer money, both accounts must agree on the new balance instantly, so you can never overdraw by reading a stale number.
- An online store’s stock count for a hot item leans strong, so two buyers don’t both grab the last unit.
- Social media like counts and feeds use eventual consistency. A new like spreads to all replicas over a moment, and a slightly stale count on someone else’s screen bothers nobody.
- A shopping cart often uses eventual consistency too. Amazon’s Dynamo design famously chose to stay available and let carts converge, because an always-up cart matters more than one that is perfectly in sync every millisecond.
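To show what “letting carts converge” can look like, here is a hedged sketch of merging two cart copies that drifted apart. This is not Dynamo’s actual code. `merge_carts` and the rule of keeping the larger quantity are assumptions chosen for illustration. The one property it aims for is that an added item is never lost.

```python
def merge_carts(cart_a, cart_b):
    """Merge two divergent cart copies so that no added item is lost.

    Each cart maps item name -> quantity. For an item present in both,
    this sketch keeps the larger quantity (an assumption, not a Dynamo rule).
    """
    merged = dict(cart_a)
    for item, quantity in cart_b.items():
        merged[item] = max(merged.get(item, 0), quantity)
    return merged


# The same user's cart, updated on two replicas that could not sync in time.
phone_cart = {"book": 1, "lamp": 2}
laptop_cart = {"book": 1, "mug": 1}

cart = merge_carts(phone_cart, laptop_cart)
```

A merge like this has a known downside: an item removed on one replica can reappear if the other replica still lists it. That is often an acceptable price for a cart that always accepts additions.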
⚠️ Common Mistakes and Misconceptions
A few ideas trip people up here. Let’s clear them out:
- “Eventual consistency means the data is wrong or gets lost.” No. The data isn’t lost and it isn’t wrong forever. The copies just take a short moment to catch up, then they all agree. Eventual means converging, not broken.
- “Always use strong consistency to be safe.” It’s tempting, but it’s costly. Strong consistency makes things slower and can take your system offline when machines can’t reach each other. For likes and view counts, that price buys you nothing.
- “I can ignore stale reads with eventual consistency.” Be careful here. If your app shows a value the user just changed, a stale read can confuse them. Sometimes you want a middle ground, like making sure a user at least sees their own latest write, even if others lag a bit.
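The middle ground in the last point, where a user always sees their own latest write, is often called read-your-writes. One way to get it, sketched below, is to remember the newest version each user has written and skip any replica that has not reached it yet. The `Session` class, the primary and follower roles, and the version numbers are hypothetical.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        """Accept an update only if it is newer than what this replica holds."""
        if version > self.version:
            self.version, self.value = version, value


class Session:
    """Remembers the newest version this user has written."""

    def __init__(self, primary, follower):
        self.primary = primary
        self.follower = follower
        self.last_written = 0

    def write(self, value):
        version = self.primary.version + 1
        self.primary.apply(version, value)
        self.last_written = version

    def read(self):
        # Use the follower only if it already has this user's latest write.
        if self.follower.version >= self.last_written:
            return self.follower.value
        return self.primary.value


primary, follower = Replica(), Replica()
primary.apply(1, "old.jpg")
follower.apply(1, "old.jpg")

you = Session(primary, follower)
alex = Session(primary, follower)

you.write("new.jpg")        # reaches the primary; the follower still lags
your_view = you.read()      # falls back to the primary, so you see your change
alex_view = alex.read()     # Alex has written nothing, so the follower is fine
```

You always see your own change, while Alex may briefly see the old photo. That matches the behaviour from the opening example, without the confusion of losing your own update.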
🛠️ Design Challenge
Try this one on your own to test yourself.
Imagine you’re designing a system that shows two things: a user’s bank balance and the number of comments on a blog post. For each one, decide whether you’d use strong or eventual consistency, and write down why. For example:
- For the bank balance, would a stale read ever cause real damage? What happens if two reads disagree?
- For the comment count, is a wrong-by-one number for two seconds actually a problem?
Then push it further. If a replica goes down mid-request, what should each feature do, wait or answer anyway? That’s the exact reasoning interviewers want to see.
🧩 What You’ve Learned
You can now explain why two people sometimes see different values, and how systems handle it. Here’s what you’ve picked up.
- ✅ Data gets copied onto many machines as replicas, and replication keeps those copies in sync.
- ✅ Consistency means all replicas agree, so a read reflects the latest write.
- ✅ Strong consistency always shows the latest value, but it’s slower and can hurt availability.
- ✅ Eventual consistency lets copies be briefly stale, then converge, giving you speed and availability.
- ✅ Banking and inventory want strong; likes, feeds, and view counts are fine with eventual.
- ✅ The choice ties straight back to the CAP theorem’s consistency versus availability trade-off.
Check Your Knowledge
Test what you learned. Answer each question, then compare your reasoning with the explanation under it.
- 1. What does consistency mean in a distributed system?
  Why: Here consistency means all replicas show the same, up-to-date value.
- 2. What is a stale read?
  Why: A stale read returns an out-of-date value during the short catch-up window.
- 3. Which is the main trade-off of strong consistency?
  Why: Strong consistency coordinates replicas before answering, which costs speed and availability.
- 4. Which workload is a good fit for eventual consistency?
  Why: A like count being briefly off by one harms nobody, so eventual consistency fits well.
🚀 What’s Next?
This lesson gave you the language to reason about copies of data agreeing. Next, go deeper into the trade-off behind it.
- CAP Theorem Explained shows why you can’t always have consistency and availability at the same time.
- SQL vs NoSQL breaks down which databases lean strong and which lean eventual, and when to reach for each.
Get those two down, and you’ll have the core trade-offs every system design interview keeps coming back to.