CAP Theorem
Pick Two (But It's Complicated)
In the realm of distributed systems, the CAP theorem is your reality check. It states that a distributed data store can provide at most two of the following three guarantees at the same time: Consistency, Availability, and Partition Tolerance. Here's the kicker: Partition Tolerance isn't optional. When the network decides to throw a tantrum (and it will), you'd better be prepared.
The Harsh Reality
In college, you assume the network works. In distributed systems, you assume the network is actively trying to destroy your career. That's why Partition Tolerance is mandatory.
What Do These Actually Mean?
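Consistency means every read receives the most recent write or an error. Availability means every request to a non-failing node receives a non-error response, with no guarantee that it reflects the most recent write. Partition Tolerance means the system continues to operate even when messages between nodes are dropped or delayed. You can't have all three once a partition hits, so choose wisely.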
The Trade-offs in Practice
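When you're designing a system, you're essentially choosing between Consistency and Availability once Partition Tolerance is a given. During a partition, a node that cannot reach its peers must either refuse the request (giving up Availability) or answer with data that may be stale (giving up Consistency). This isn't just theoretical hand-waving; it's about real-world operations and the cold, hard cash they affect.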
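To make the choice concrete, here is a minimal sketch of a two-replica store that can run in either mode. Everything in it (`Replica`, `TwoNodeStore`, the `partitioned` flag, the `"CP"`/`"AP"` mode strings) is hypothetical and invented for illustration; real systems use consensus protocols, quorums, and conflict resolution rather than a boolean switch.

```python
class Replica:
    """One node holding a single value plus a version counter."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.version = 0


class TwoNodeStore:
    """Toy two-replica store (hypothetical); mode is 'CP' or 'AP'."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"a": Replica("a"), "b": Replica("b")}
        self.partitioned = False  # flip to True to simulate a network split

    def write(self, node, value):
        local = self.replicas[node]
        peer = self.replicas["b" if node == "a" else "a"]
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the write rather than let the replicas diverge.
                raise RuntimeError("unavailable: cannot reach peer")
            # AP: accept the write locally; the peer is now stale.
            local.value, local.version = value, local.version + 1
            return
        # Healthy network: apply the write to both replicas.
        new_version = max(local.version, peer.version) + 1
        for replica in (local, peer):
            replica.value, replica.version = value, new_version

    def read(self, node):
        if self.partitioned and self.mode == "CP":
            # Cannot confirm this is the latest value, so return an error.
            raise RuntimeError("unavailable: cannot confirm latest value")
        return self.replicas[node].value


if __name__ == "__main__":
    ap = TwoNodeStore("AP")
    ap.write("a", "v1")
    ap.partitioned = True
    ap.write("a", "v2")                # accepted on node a only
    print(ap.read("a"), ap.read("b"))  # prints: v2 v1
```

In `"CP"` mode the same sequence raises `RuntimeError` on the second write: the store stays consistent by refusing to answer. In `"AP"` mode every request succeeds, but the two replicas disagree until the partition heals and something reconciles them.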
Real Systems: It's a Spectrum
| System | Typically | Tunable? | Note |
|---|---|---|---|
| MongoDB | CP | Yes | writeConcern, readConcern |
| Cassandra | AP | Yes | Quorum reads/writes (R + W > N) lean CP |
| PostgreSQL | CA* | No | *Single node only |
| DynamoDB | AP | Yes | Eventually or strongly consistent reads |
| etcd/Consul | CP | Limited | Raft consensus; optional stale (Consul) or serializable (etcd) reads |
| CouchDB | AP | No | Multi-master, eventual |
| Redis | CP/CA | Yes | Depends on configuration |
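The "Tunable?" column usually comes down to quorum arithmetic. With N replicas, a read that contacts R of them is guaranteed to overlap a write acknowledged by W of them only when R + W > N. The sketch below checks that condition; the function names are hypothetical and not taken from any database client, and overlap alone is a necessary condition for reading the latest write, not a full linearizability guarantee.

```python
def quorum_overlaps(n, r, w):
    """Return True if every read quorum intersects every write quorum.

    n: number of replicas, r: replicas contacted per read,
    w: replicas that must acknowledge each write.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def majority(n):
    """Smallest replica count that forms a strict majority of n."""
    return n // 2 + 1


if __name__ == "__main__":
    n = 3
    print(quorum_overlaps(n, r=1, w=1))  # False: fast, but reads may be stale
    print(quorum_overlaps(n, r=2, w=2))  # True: majority reads and writes
    print(quorum_overlaps(n, r=1, w=3))  # True: cheap reads, expensive writes
    print(majority(n))                   # 2
```

Lower R and W favor availability and latency, because fewer replicas must be reachable. Higher values favor consistency, at the cost of failing requests when too few replicas respond.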
Key Takeaways
1. It's a Useful Model (Not a Law)
CAP isn't a rulebook that sorts systems into three neat boxes. It's a framework for understanding the trade-offs a system makes during a partition. Most systems offer tunable consistency levels, so use CAP to predict how a given configuration behaves when things go south.
2. P is Not Optional
Network failures are a given. The real decision is between Consistency and Availability. Start by designing for partition tolerance.
3. Choose Based on Requirements
Align your system design with business needs. Financial systems demand Consistency. Social media thrives on Availability. Analytics can tolerate stale data, so Availability takes precedence.