Consensus & Leader Election

Concept. Consensus is how a cluster of nodes agrees on a single value (or a sequence of values) even when some nodes crash or messages are delayed. Leader election is the special case of agreeing on which node leads. Elections are comparatively slow (~200 ms) and rare; once a leader is in place, it can serve operations cheaply (~50 ms each).

Intuition. When Spotify's "current Top 50" needs to be updated, every replica must apply the same writes in the same order, but only one node, the elected leader, decides that order. If the leader crashes, Raft runs a quick election among the survivors (~200 ms); the new leader continues from the committed log, and the next million updates each take ~50 ms because no new election is needed per write. The leader only has to replicate each entry to a majority of nodes.

Imagine you're holding the last Taylor Swift ticket. Alice and Bob both want it, and two servers receive their requests at the same time. Without coordination, you end up selling the same ticket twice, and Ticketmaster isn't thrilled about that.


Quorum Sizes

Nodes   Majority   Tolerates    Use Case
3       2          1 failure    Dev/Test
5       3          2 failures   Production
7       4          3 failures   Critical

Formula: Majority = ⌊N/2⌋ + 1
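To make the formula concrete, here is a minimal Python sketch that reproduces the table rows. The function names majority and tolerated_failures are illustrative choices, not part of any library.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of an n-node cluster."""
    if n < 1:
        raise ValueError("cluster must have at least one node")
    return n // 2 + 1  # floor(N/2) + 1


def tolerated_failures(n: int) -> int:
    """How many nodes can fail while a majority is still reachable."""
    return n - majority(n)


for n in (3, 4, 5, 7):
    print(f"{n} nodes: majority={majority(n)}, tolerates={tolerated_failures(n)}")
```

Note that a 4-node cluster needs 3 votes and still tolerates only 1 failure, the same as a 3-node cluster, which is one reason the table lists only odd sizes.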

Election Timeline

Leader election timeline: when the leader fails, followers detect, vote, and elect a new leader


Key Takeaways

1. Core Problem

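Split-brain kills distributed systems. When multiple nodes each think they're the leader, they accept conflicting writes, and the data diverges or gets corrupted. The solution? Majority voting: a leader needs votes from more than 50% of the nodes, and because any two majorities share at least one node, two candidates can't both win the same election round.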
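The reason majority voting rules out two simultaneous winners is that any two majorities of the same cluster must share a node. The sketch below checks this by brute force for a small cluster; the helper names and node labels are made up for illustration.

```python
from itertools import combinations


def majorities(nodes):
    """Yield every subset of `nodes` that is large enough to be a majority."""
    need = len(nodes) // 2 + 1
    for size in range(need, len(nodes) + 1):
        for subset in combinations(nodes, size):
            yield set(subset)


def all_majorities_overlap(nodes) -> bool:
    """True if every pair of majority subsets shares at least one node."""
    quorums = list(majorities(nodes))
    return all(a & b for a, b in combinations(quorums, 2))


nodes = ["n1", "n2", "n3", "n4", "n5"]  # hypothetical 5-node cluster
print(all_majorities_overlap(nodes))
```

The shared node can grant its vote to only one of the two candidates, so both cannot reach a majority in the same round.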

2. The Raft Way

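Three states: Follower → Candidate → Leader. A follower that hears nothing from the leader before its election timeout expires becomes a candidate and requests votes from the other nodes; it becomes leader once a majority of the cluster has voted for it. Recovery takes about 200 ms after a failure.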
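The state transitions can be sketched in a few lines of Python. This is a simplified, single-process illustration and not a real Raft implementation: it assumes direct method calls instead of RPCs, omits timers, heartbeats, and the log up-to-date check, and the names Node, handle_vote_request, and start_election are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


@dataclass
class Node:
    node_id: str
    state: State = State.FOLLOWER
    current_term: int = 0
    voted_for: Optional[str] = None

    def handle_vote_request(self, term: int, candidate_id: str) -> bool:
        """Grant at most one vote per term (log comparison omitted)."""
        if term < self.current_term:
            return False  # stale candidate
        if term > self.current_term:
            # A newer term resets our vote and demotes us to follower.
            self.current_term = term
            self.voted_for = None
            self.state = State.FOLLOWER
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False

    def start_election(self, reachable_peers: list[Node], cluster_size: int) -> bool:
        """Run when the election timeout fires without word from a leader."""
        self.state = State.CANDIDATE
        self.current_term += 1
        self.voted_for = self.node_id
        votes = 1  # a candidate votes for itself
        for peer in reachable_peers:
            if peer.handle_vote_request(self.current_term, self.node_id):
                votes += 1
        if votes >= cluster_size // 2 + 1:
            self.state = State.LEADER
            return True
        return False  # stays a candidate; a real node would time out and retry


# Hypothetical 5-node cluster in which n1 (the old leader) has crashed.
nodes = [Node(f"n{i}") for i in range(1, 6)]
candidate, peers = nodes[1], nodes[2:]
won = candidate.start_election(peers, cluster_size=5)
print(won, candidate.state, candidate.current_term)
```

Passing cluster_size separately from reachable_peers is deliberate: the majority is counted against the full cluster, so a candidate stuck in a minority partition cannot elect itself.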