Consensus Algorithms Overview

Picture five servers running together as one system. Now here’s the tricky part:

  • They all need to agree on something simple, like who the leader is right now, or which write should win when two come in at once.
  • And they have to agree on the same answer, even if one or two of them crash mid-decision.
  • On top of that, the network between them is unreliable. Messages get delayed, dropped, or arrive out of order.

So how do a bunch of separate machines, with no shared brain, all land on the exact same answer? That’s the whole job of a consensus algorithm. Let’s break it down.

🎯 The Problem

Here’s the pain you run into the moment you have more than one server:

  • In a distributed system (a bunch of machines working together as one), the nodes have to agree on a single value or decision.
  • A “node” is just one machine in the group. The nodes need to agree on things like which server is the leader, or what the next entry in a shared log should be.
  • But machines crash. The network drops messages. Two nodes might think they’re both in charge at the same time.

So the real challenge is this: get every working node to agree on one answer, despite failures and a flaky network. If you get this wrong, two nodes accept different “winning” writes, and now your data is corrupted. That’s the nightmare consensus is built to prevent.

🤝 What is Consensus

Let’s define the word plainly first.

  • Consensus means getting multiple nodes to agree on one value or decision and stick to it.
  • “Stick to it” is the key part. Once the group has agreed on an answer, nobody is allowed to quietly change their mind later.
  • So if the group decides “Server 3 is the leader,” then every healthy node believes Server 3 is the leader, and they keep believing it until a new round of agreement happens.

Think of it like a group of friends picking one restaurant. Everyone chats, votes, and the moment the group settles on “pizza place,” that’s locked in. You don’t get half the group showing up at pizza and the other half at the burger spot. Consensus is what stops that split.

🗳️ The Core Idea: Majority / Quorum

Here’s the clever trick that makes all of this work even when servers are down.

  • A quorum is just a majority of the nodes. With five servers, a quorum is any three of them.
  • The rule is simple: a decision counts as final once a majority agrees on it. You do not need every single node to say yes.
  • This is why the system keeps working when nodes crash. With five servers, two can be down and the other three (a majority) can still decide.

Now why a majority and not, say, any two nodes? Because any two majorities of the same group must share at least one node. So two conflicting decisions can never both get a majority. That one overlapping node would have to vote for both, and it won’t. That overlap is the secret that keeps everyone consistent.
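The overlap argument can be checked by brute force. Here is a minimal sketch, assuming a five-node cluster with made-up node names (A to E) and a hypothetical helper `quorum_size`. It lists every possible majority and confirms that each pair shares a node, then shows that groups of only two nodes can miss each other entirely.

```python
from itertools import combinations


def quorum_size(n: int) -> int:
    """Smallest group that is a strict majority of n nodes."""
    return n // 2 + 1


# Hypothetical node names, for illustration only.
nodes = ["A", "B", "C", "D", "E"]
q = quorum_size(len(nodes))  # 3 for a 5-node cluster

# Every possible majority of the cluster.
majorities = [set(group) for group in combinations(nodes, q)]

# Any two majorities share at least one node.
all_overlap = all(a & b for a, b in combinations(majorities, 2))

# Groups of just two nodes can be completely disjoint, e.g. {A, B} and {C, D}.
pairs = [set(group) for group in combinations(nodes, 2)]
some_disjoint = any(not (a & b) for a, b in combinations(pairs, 2))

print(f"quorum size: {q}")
print(f"all majorities overlap: {all_overlap}")
print(f"some two-node groups are disjoint: {some_disjoint}")
```

If "any two nodes" were enough to decide, {A, B} could pick one value while {C, D} picked another, and no node would ever see the conflict.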

Here’s the flow when a value gets agreed on:

  1. A node proposes a value.
  2. It sends the value to the other nodes.
  3. Each node votes yes or no.
  4. Majority agrees?
     • Yes: the value is committed, and all nodes apply the same value.
     • No: the proposal fails, try again.
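The flow above can be sketched as a toy, single-round vote. This is a simplified illustration, not Paxos or Raft: the `Node` class, the `propose` function, and the "leader=A" values are all hypothetical, and real protocols add round numbers, retries, and recovery for crashed nodes. The one rule it does capture is that a node never votes yes for two different values in the same round.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    up: bool = True
    accepted: Optional[str] = None   # value this node voted yes for
    committed: Optional[str] = None  # value this node has applied

    def vote(self, value: str) -> bool:
        """Vote yes unless crashed or already promised to a different value."""
        if not self.up:
            return False  # a crashed node never replies, so it counts as no vote
        if self.accepted is None:
            self.accepted = value
        return self.accepted == value


def propose(nodes: List[Node], value: str) -> bool:
    """Run one voting round; commit only if a majority votes yes."""
    yes_votes = sum(node.vote(value) for node in nodes)
    if yes_votes >= len(nodes) // 2 + 1:
        for node in nodes:
            if node.up:
                node.committed = value
        return True
    return False  # proposal fails; the caller may try again


cluster = [Node(name) for name in "ABCDE"]
cluster[3].up = False  # two of the five nodes are down
cluster[4].up = False

first = propose(cluster, "leader=A")   # three yes votes out of five
second = propose(cluster, "leader=B")  # live nodes already voted for leader=A
print(first, second)
```

The second proposal cannot pass because the three live nodes have already voted for the first value, which is the overlap rule at work.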

Why odd numbers are common

You’ll often see clusters of 3, 5, or 7 nodes. An even count buys you nothing extra: a 4-node cluster needs 3 nodes for a majority, so it tolerates only 1 failure, the same as a 3-node cluster, and an even split (2 versus 2) leaves neither side with a majority. A 5-node cluster tolerates 2 failures and still has a working majority.
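The arithmetic is easy to tabulate. A small sketch, using two hypothetical helpers (`quorum_size` and `tolerated_failures`), prints the majority size and the number of crashes each cluster size can survive:

```python
def quorum_size(n: int) -> int:
    """Smallest group that is a strict majority of n nodes."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes can be down while a majority is still reachable."""
    return n - quorum_size(n)


for n in range(1, 8):
    print(f"{n} nodes: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Going from 3 to 4 nodes, or from 5 to 6, raises the quorum size without raising the number of failures the cluster survives, which is why the extra even node is rarely worth it.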

📜 Paxos

Paxos is the classic consensus algorithm, and much of the later work in this area builds on it.

  • It’s proven correct, so if you implement it exactly, two nodes will never settle on different values.
  • But it’s also famously hard to understand. Even experienced engineers struggle to read the original description and turn it into working code.

So Paxos is respected, widely cited, and rock-solid in theory, but in practice people kept wishing for something easier to actually build.

🛟 Raft

Raft was designed to fix exactly that pain. The goal was a consensus algorithm a normal human can understand.

  • It works by electing one leader node, and that leader is the only one that accepts new writes.
  • The leader keeps a log (an ordered list of decisions) and copies it to the other nodes. Copying that log around is called log replication, and it’s how everyone stays in sync.

Because it leans on a clear leader and a simple replicated log, Raft is much easier to reason about. That’s why you see it far more often in real systems today.
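To make the leader-plus-log idea concrete, here is a heavily simplified sketch in the spirit of Raft, not Raft itself. The `Leader` and `Follower` classes and their methods are hypothetical, and the sketch leaves out terms, elections, heartbeats, and the repair of lagging follower logs. It only shows the core loop: the leader appends an entry, copies it to followers, and treats it as committed once a majority of the cluster holds it.

```python
from typing import List


class Follower:
    def __init__(self, name: str, up: bool = True) -> None:
        self.name = name
        self.up = up
        self.log: List[str] = []

    def append_entry(self, index: int, entry: str) -> bool:
        """Accept the entry only if it lands in the next free slot of our log."""
        if not self.up or index != len(self.log):
            return False
        self.log.append(entry)
        return True


class Leader:
    def __init__(self, followers: List[Follower]) -> None:
        self.followers = followers
        self.log: List[str] = []
        self.commit_index = -1  # index of the last committed entry

    def write(self, entry: str) -> bool:
        """Append locally, replicate, and commit once a majority has the entry."""
        index = len(self.log)
        self.log.append(entry)
        acks = 1  # the leader's own copy counts
        acks += sum(f.append_entry(index, entry) for f in self.followers)
        cluster_size = len(self.followers) + 1
        if acks >= cluster_size // 2 + 1:
            self.commit_index = index
            return True
        return False  # entry stays in the log but is not committed


followers = [Follower("B"), Follower("C"), Follower("D"), Follower("E", up=False)]
leader = Leader(followers)

ok1 = leader.write("x=1")
ok2 = leader.write("x=2")
print(ok1, ok2, leader.commit_index)
```

Because every write goes through one leader in one order, every live follower ends up with the same log, which is what "staying in sync" means here.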

⚖️ Paxos vs Raft

Both reach the same goal, safe agreement, but they feel very different to work with.

Aspect                   | Paxos                                | Raft
Main goal                | Be provably correct                  | Be understandable
Ease of learning         | Famously hard                        | Built to be clear
Leader                   | Optional, less explicit              | One clear elected leader
How it spreads decisions | Per-value agreement rounds           | Leader replicates a log
Where you’ll see it      | Older, academic, some Google systems | etcd, Consul, many modern tools

🌍 Where It’s Used

You might be thinking, okay but where does this actually show up? It’s everywhere underneath the systems you already use.

  • Leader election is the classic case. The nodes use consensus to agree on which one is the boss, so writes don’t conflict.
  • Replicated databases use it to make sure every copy of the data applies writes in the same order, so all copies stay identical.
  • Coordination services like ZooKeeper and etcd exist almost entirely to run consensus for other systems. Tools like Kubernetes lean on etcd to keep their shared state agreed-upon and consistent.

So you rarely write consensus yourself. Instead you reach for one of these battle-tested services and let it do the hard agreement work for you. We go deeper on the boss-picking part in the Leader Election lesson.

⚠️ Common Mistakes and Misconceptions

A few ideas trip people up early. Let’s clear them out.

  • “All nodes must agree.” No, only a majority. That’s the whole point of a quorum. Demanding every node agree would mean a single crashed node could freeze the system.
  • “Consensus is instant and free.” Not at all. It takes several rounds of messages across the network, so it adds real latency. You use it for important decisions, not for every tiny operation.
  • “I should write my own consensus algorithm.” Please don’t. These algorithms have nasty edge cases that are very easy to get subtly wrong. Use a proven library or service like etcd or ZooKeeper instead.

Split brain is the danger

If a system lets two different nodes both think they’re the leader at once, that’s called split brain, and they can accept conflicting writes that corrupt your data. The majority rule is exactly what prevents this. Only the side with a quorum is allowed to act.
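The "only the side with a quorum may act" rule is a one-line check. A minimal sketch, assuming a hypothetical 7-node cluster that a network fault splits into groups of 4 and 3 (the function name `has_quorum` is made up for illustration):

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """True if this side can reach a strict majority of the configured cluster."""
    return reachable >= cluster_size // 2 + 1


cluster_size = 7  # configured membership, not the number currently reachable

side_a = has_quorum(4, cluster_size)  # larger side of the split
side_b = has_quorum(3, cluster_size)  # smaller side of the split
print(f"side of 4 may act: {side_a}")
print(f"side of 3 may act: {side_b}")

# An even cluster split down the middle leaves nobody in charge.
even_split = has_quorum(2, 4)
print(f"either half of a 2/2 split may act: {even_split}")
```

The important detail is that each side compares itself against the full configured cluster size. If a node counted only the peers it can currently see, both sides of a split would believe they had a majority.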

🛠️ Design Challenge

Try this one on your own to test yourself.

Imagine you run a 5-node cluster, and a network problem splits it into two groups: one group of 3 nodes and one group of 2 nodes. Now answer these:

  • Which group is allowed to keep accepting writes, and why?
  • What should the smaller group of 2 do while it’s cut off?
  • What happens once the network heals and all 5 can talk again?

Walk through it using the majority rule. This is exactly the kind of reasoning a system design interviewer is looking for.

🧩 What You’ve Learned

You can now explain how a group of machines agrees on one answer despite failures. Here’s what you’ve picked up.

  • ✅ Consensus means getting multiple nodes to agree on one value and stick to it, even when some crash.
  • ✅ A quorum is a majority of nodes, and a majority can decide even while some nodes are down.
  • ✅ Majorities always overlap, which is what stops two conflicting decisions from both winning.
  • ✅ Paxos is the classic, correct, but hard-to-understand algorithm.
  • ✅ Raft was built to be understandable, using a clear leader and log replication, and it’s more common today.
  • ✅ Consensus powers leader election, replicated databases, and coordination services like ZooKeeper and etcd.

Check Your Knowledge

Test what you learned.

  1. What does consensus mean in a distributed system?

     Why: Consensus is reaching one shared decision that all healthy nodes accept and keep, despite failures and an unreliable network.

  2. Why is a majority (quorum) used to decide a value?

     Why: Any two majorities of the same group share at least one node, and that overlap prevents two conflicting decisions from both passing.

  3. How many node failures can a 5-node cluster tolerate?

     Why: With two nodes down, three remain, which is still a majority, so the cluster can keep making decisions.

  4. Why is Raft often preferred over Paxos in practice?

     Why: Raft leans on one elected leader and a replicated log, which makes it far easier to build and debug than Paxos.

🚀 What’s Next?

Now that you’ve got the big picture of agreement, let’s zoom into the pieces.

  • Leader Election digs into how nodes actually pick one boss and what happens when that boss fails.
  • Distributed System Challenges covers the wider set of headaches, like network partitions and consistency, that consensus helps you tackle.
