Transaction Scheduling: The Secret to Speed
π
Two simple rules unlock massive performance gains while maintaining correctness. Often called Reordering Rules
In below, we'll use Transaction Manger (TM) or Database (DB) to refer to the "head chef" or the "kitchen."
The Correctness Challenge: Isolation for ACID π‘οΈ
π TM's Guarantee
Transactions are Isolated from each other
They never see partial changes from other transactionsβensuring correctness. Each transaction should feel as if it's alone in the database, acting independently.
Key Definitions π
π Serial Schedule
Transactions execute completely one after another with no overlap
π Interleaved Schedule
The TM interleaves individual operations from different transactions
The Performance vs Safety Trade-off βοΈ
π― The Central Challenge
π‘οΈ Maximum Safety
Serial execution: One transaction at a time
- Perfect correctness
- Easy to understand
- Terrible performance
The Sweet Spot
β‘ Maximum Performance
No coordination: Complete chaos
- Maximum parallelism
- Data corruption
- Inconsistent results
π‘ The Solution
Smart concurrency control following Rules 1 and 2 below lets us get close to maximum performance while maintaining safety guarantees.
Rule 1: DB Does Not Care About Transaction Order π
The Rule: Database systems can execute transactions in any order they choose.
Why Rule 1 Exists: Performance! β‘
π― Key Insight
Databases use this reordering flexibility to optimize around IO costs and achieve massive parallelism. If transactions T1 and T2 touch different data, they can run completely in parallel!
π Parallel Execution
Run independent transactions simultaneously across multiple CPU cores
πΎ IO Optimization
Batch disk operations and minimize expensive random access
β±οΈ Better Throughput
Process thousands of transactions per second instead of hundreds
Developer Control ποΈ
π‘ When You Need Specific Ordering
If your application requires T1 to happen before T2, create a single transaction T42 that includes both T1 and T2 logic in the desired order:
BEGIN TRANSACTION T42
-- T1 logic here
-- T2 logic here
COMMIT
Rule 2: DB Cares About Order of R/W Actions π
The Rule: Within and across transactions, the order of individual read/write operations determines correctness.
Why Rule 2 Matters: Data Integrity π‘οΈ
π― Critical Insight
Databases must ensure that transactions see consistent snapshots of data. No transaction should see the intermediate states of another transaction.
π Read Consistency
All reads within a transaction see the same consistent view of the database
βοΈ Write Isolation
Writes from one transaction are not visible to others until commit
π Serializable Result
Final result matches some serial execution order
Serial vs Interleaved: Banking Examples π°
Let's see serial and interleaved schedules in action:
Scenario: Two things happen at month-end:
-
Tβ (TransferTxn): Bob (B) transfers $100 to Alice (A)
-
Tβ (InterestTxn): Bank calculates 6% monthly interest
Understanding the Three Schedules
The image demonstrates the difference between serial and interleaved execution:
-
S1 (Serial Schedule): Tβ completes all operations, then Tβ runs β Final: A=$159, B=$106
-
S2 (Serial Schedule): Tβ completes all operations, then Tβ runs β Final: A=$153, B=$112
-
S6 (Bad Interleaved): Operations interleave incorrectly, violating isolation β Final: A=$159, B=$112
Key Insight: S1 and S2 are both correct (though different). S6 is incorrect because it doesn't match ANY serial scheduleβit violates isolation!
The Bridge to Concurrency Control π
Why We Need Locking Mechanisms
π Connecting the Dots
Rule 1 creates opportunity
Transaction-level reordering enables massive parallelism
Rule 2 creates challenge
Must carefully control R/W operation ordering
Solution: Concurrency Control
Locks, timestamps, and isolation protocols