The Concurrency Full Stack
Every time you type BEGIN TRANSACTION, your query embarks on a journey through four crucial layers:
| Layer | Mechanisms |
| --- | --- |
| Your Code | BEGIN TRANSACTION; UPDATE accounts...; COMMIT; |
| Database | Lock Manager • Conflict Detection • Deadlock Prevention |
| Operating System | Mutex • Semaphore • Spinlock • Thread Scheduling |
| Hardware | CAS Instructions • Memory Barriers • Cache Coherence |
Each layer stacks its own abstractions on the one below, all to tackle a singular challenge: How do we let thousands of transactions run simultaneously without stepping on each other?
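To make the layering concrete, here is a minimal sketch of the bottom two layers: a spinlock (OS layer) built on a compare-and-swap primitive (hardware layer). Real CAS is a single atomic CPU instruction; the `compare_and_swap` function below merely simulates its atomicity with a hidden Python lock, purely for illustration.

```python
import threading

# Simulated hardware CAS: real CAS is one atomic CPU instruction.
_cas_guard = threading.Lock()

def compare_and_swap(cell, expected, new):
    """Atomically set cell[0] = new iff cell[0] == expected."""
    with _cas_guard:
        if cell[0] == expected:
            cell[0] = new
            return True
        return False

class SpinLock:
    """An OS-layer spinlock built on the CAS primitive below it."""
    def __init__(self):
        self._state = [0]  # 0 = free, 1 = held

    def acquire(self):
        while not compare_and_swap(self._state, 0, 1):
            pass  # spin until the swap succeeds

    def release(self):
        self._state[0] = 0

# Two threads increment a shared counter under the spinlock.
counter = [0]
lock = SpinLock()

def worker():
    for _ in range(1000):
        lock.acquire()
        counter[0] += 1
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(counter[0])  # 2000 -- no increments lost
```

A database lock manager sits one layer up, using primitives like these to protect its own bookkeeping structures.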
Key Definitions
Concurrent Transactions
A set of transactions whose executions overlap within a (small) time window
Concurrency Control Algorithms
Algorithms that manage concurrent access to shared data, interleaving transactions so that (slow) IO devices don't leave the system idle
Locks
Data structures these algorithms use to coordinate access
Kitchen Chaos: Three Ways to Share a Kitchen
The Problem: Multiple chefs, one kitchen, and everyone needs different ingredients. How do they coordinate?
Pessimistic Chef
Strategy: Reserve every ingredient you'll need before you start, and only those
Others can still make sushi, grill fish, or bake desserts with the rest
Trade-off: Good parallelism, until the sushi chef also needs your tomatoes
Optimistic Chef
Strategy: Grab freely, check at serving
"Wait, we both used the last tomato?"
Trade-off: Max speed until collision
Multi-Version Chef
Strategy: Keep a versioned copy of the pantry at every prep stage
"I'll cook from my own snapshot of the ingredients"
Trade-off: No waiting, but storage overhead
The Insight: Choose the right strategy for your workload. High conflict (everyone wants tomatoes)? Go pessimistic. Rare conflicts? Go optimistic. Read-heavy? Multi-version.
Three Philosophies of Concurrency
Now that we understand the kitchen analogy, let's see how databases actually implement these approaches:
Pessimistic: "Lock Everything First"
Strategy: Acquire ALL needed locks before touching ANY data
❌ Limited parallelism
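A minimal sketch of the pessimistic approach, using a hypothetical in-memory "accounts" table with one lock per row. All locks are taken up front, in a fixed (sorted) order so two concurrent transfers can never deadlock:

```python
import threading

# Hypothetical accounts table with one lock per row (names illustrative).
accounts = {"alice": 100, "bob": 50}
row_locks = {name: threading.Lock() for name in accounts}

def transfer_pessimistic(src, dst, amount):
    # Acquire ALL needed locks before touching ANY data.
    # Sorting the lock order prevents deadlock between concurrent transfers.
    first, second = sorted([src, dst])
    with row_locks[first], row_locks[second]:
        accounts[src] -= amount
        accounts[dst] += amount

transfer_pessimistic("alice", "bob", 30)
print(accounts)  # {'alice': 70, 'bob': 80}
```

The fixed acquisition order is doing the heavy lifting here: without it, two transfers in opposite directions could each hold one lock while waiting for the other.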
Optimistic: "Lock Nothing, Validate Later"
Strategy: Do all work freely, then check for conflicts at commit
❌ Must restart on conflicts
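The optimistic approach can be sketched as a read-validate-write loop. The version counters and single commit lock below are illustrative simplifications of how a real validator works:

```python
import threading

data = {"tomatoes": 3}
versions = {"tomatoes": 0}   # bumped on every committed write
commit_lock = threading.Lock()

def decrement_optimistic(key):
    while True:
        # Read phase: no locks, just remember the version we saw.
        seen_version = versions[key]
        new_value = data[key] - 1
        # Validation phase: brief critical section at commit time.
        with commit_lock:
            if versions[key] == seen_version:  # nobody changed it under us
                data[key] = new_value
                versions[key] += 1
                return
        # Conflict: someone committed first -- restart the transaction.

threads = [threading.Thread(target=decrement_optimistic, args=("tomatoes",))
           for _ in range(3)]
for t in threads: t.start()
for t in threads: t.join()
print(data["tomatoes"])  # 0 -- each of the 3 decrements committed exactly once
```

Note that the work itself runs lock-free; only the validate-and-write step is serialized, which is why this wins when conflicts are rare and loses (via restarts) when they are common.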
Multi-Version: "Everyone Gets Their Own View"
Strategy: Keep multiple versions of every piece of data
❌ Storage overhead & garbage collection
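A toy multi-version store makes the "everyone gets their own view" idea tangible: writers append new versions, and each reader picks the newest version at or below its snapshot timestamp, so readers never block writers. The timestamp scheme here is a deliberate simplification:

```python
# key -> list of (commit_ts, value); old versions are kept, not overwritten.
versions = {"balance": [(0, 100)]}
next_ts = 1

def write(key, value):
    global next_ts
    versions[key].append((next_ts, value))  # append a new version
    next_ts += 1

def read(key, snapshot_ts):
    # Newest version visible to this snapshot (never blocks on writers).
    return max(v for v in versions[key] if v[0] <= snapshot_ts)[1]

snapshot = next_ts - 1        # reader takes its snapshot at ts 0
write("balance", 75)          # writer commits at ts 1
print(read("balance", snapshot))      # 100 -- old snapshot still sees 100
print(read("balance", next_ts - 1))   # 75  -- a fresh snapshot sees the write
```

The storage-overhead trade-off is visible in the version list itself: nothing here ever deletes an old `(ts, value)` pair, which is exactly the garbage-collection problem real MVCC systems must solve.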
Modern Hybrid: "Best of All Worlds"
Strategy: Combine techniques based on workload patterns
❌ Complex implementation
Why OS vs Database Locks?
You might wonder: "Why can't databases just use regular OS locks?" Here's the key difference:
OS Locks (Mutex, Semaphore)
Purpose: Simple thread coordination
• No data awareness
• Manual deadlock detection
• Thread-level coordination
Database Lock Manager
Purpose: Sophisticated data coordination
• Table→page→row awareness
• Automatic deadlock resolution
• Million-transaction scale
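The contrast can be sketched with a toy lock manager. Unlike a plain OS mutex, it knows which resource is locked (the table is keyed by resource name) and in which mode, so it can grant compatible shared locks to many readers at once. This is a simplified sketch; real lock managers also queue waiters, escalate granularity, and detect deadlocks:

```python
class LockManager:
    """Toy data-aware lock manager: shared (S) vs. exclusive (X) modes."""
    def __init__(self):
        # resource -> {"mode": "S" or "X", "holders": set of txn ids}
        self.table = {}

    def try_lock(self, txn, resource, mode):
        entry = self.table.get(resource)
        if entry is None:
            self.table[resource] = {"mode": mode, "holders": {txn}}
            return True
        if mode == "S" and entry["mode"] == "S":
            entry["holders"].add(txn)   # shared locks are compatible
            return True
        return False                    # any X conflict: caller must wait

    def unlock(self, txn, resource):
        entry = self.table[resource]
        entry["holders"].discard(txn)
        if not entry["holders"]:
            del self.table[resource]    # last holder gone: free the resource

lm = LockManager()
print(lm.try_lock("T1", "row:42", "S"))  # True  -- first reader
print(lm.try_lock("T2", "row:42", "S"))  # True  -- readers share
print(lm.try_lock("T3", "row:42", "X"))  # False -- writer must wait
```

A mutex can only answer "is this one thing held?"; the lock manager answers "who holds what, on which data, in which mode?", which is what makes automatic deadlock resolution and granularity-aware locking possible.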
Today's Challenge: Coordinate millions of AI agents and microservices, all hitting the same database simultaneously.
What You'll Learn Next
This section will take you through the complete concurrency control journey:
1. Lock Foundations - How database locks actually work
2. Two-Phase Locking - The gold standard protocol
3. Microschedules - Reasoning about concurrent execution
4. Correctness - Proving your system won't corrupt data
5. Connecting Together - Putting it all into practice
By the end, you'll understand how databases manage complex coordination: letting thousands of transactions run concurrently while maintaining perfect data consistency.
The Core Insight
Concurrency control isn't just about preventing conflicts—it's about enabling massive parallelism while maintaining the illusion that each transaction runs alone.
Every technique we'll explore is a different answer to: "How do we coordinate access to shared data at web scale?"