IO Algorithms: Cost Analysis & Intuition
CS145 Equations Sheet
For detailed explanations and examples, see IO Cost Reference.
Real-World Examples: Choosing the Right Join Algorithm
Assume C_r = C_w = 1. For the JOINs below, we always add OUT (for the size of the output result). Always use the Equation Sheet for the full formulae.
Test Setup: B = 100 pages (≈ 6.4 GB RAM), Page Size = 64 MB
Scenario Overview
| Example | Description | Table R | Table S | Special Conditions | Key Insight |
|---|---|---|---|---|---|
| Ex1 | Small Tables | Songs 10 GB |
Listens 10 GB |
— | Tables fit mostly in RAM |
| Ex2 | Large Tables | Songs 100 GB |
Listens 2 TB |
— | Tables >> buffer size |
| Ex3 | Pre-sorted Tables | Songs ↑ 100 GB sorted |
Listens ↑ 2 TB sorted |
Pre-sorted | SMJ skips sorting phase |
| Ex4 | Self-Join | Listens 2 TB |
Listens 2 TB (same) |
Self-join | Can't optimize table order |
| Ex5 | Song Similarity | Listens 100B rows |
Listens 100B rows |
Massive | Real-world similarity query |
Algorithm Performance Comparison (Log Scale)
- SMJ-BadBackup: When many duplicates exist, SMJ may need to backup and re-scan
- HPJ-BadSkew: When data is heavily skewed, some hash partitions become too large
- These edge cases can increase costs by 2-10× in worst cases
Key Observations
Example 1: Small Tables Win with BNLJ
-
Songs (10GB) and Listens (10GB) are small relative to buffer size (6.4GB)
-
BNLJ performs the join mostly in RAM without preprocessing overhead
-
No sorting or partitioning needed → Lower cost than SMJ/HPJ
Examples 2-5: Large Tables Need Smart Algorithms
-
When tables >> buffer size, BNLJ becomes prohibitively expensive
-
SMJ and HPJ scale much better with data size
-
Pre-sorted data (Ex3) gives SMJ a huge advantage
Self-Joins Are Common and Expensive
-
Examples 4-5 show real patterns (finding similar songs, user correlations)
-
Self-joins can't optimize by choosing smaller outer relation
-
SMJ/HPJ are essential for these massive operations
The Query Optimizer's Decision Process
Exercise 2: Impact of Machine Configuration on IO Costs
Scenario: Songs3 (100GB) × Listens3 (1TB)
-
Songs3: 100GB = 1,600 pages
-
Listens3: 1TB = 16,384 pages
-
Row size: 1024 bytes
Cost Analysis Across Different Configurations
| Buffer Size (B) |
C_r | C_w | BNLJ | BNLJ-rev | SMJ-best | HPJ-best | Winner |
|---|---|---|---|---|---|---|---|
| 10 (tiny) |
1 | 1 | 2.6M | 2.6M | 159K | 54K | HPJ |
| 1 | 10 | 2.6M | 2.6M | 792K | 216K | HPJ | |
| 10 | 1 | 26M | 26M | 954K | 378K | HPJ | |
| 10 | 10 | 26M | 26M | 1.6M | 540K | HPJ | |
| 100 (small) |
1 | 1 | 264K | 279K | 90K | 54K | HPJ |
| 1 | 10 | 264K | 279K | 414K | 216K | HPJ | |
| 10 | 1 | 2.6M | 2.8M | 575K | 378K | HPJ | |
| 10 | 10 | 2.6M | 2.8M | 899K | 540K | HPJ | |
| 1,000 (medium) |
1 | 1 | 34K | 44K | 87K | 54K | BNLJ |
| 1 | 10 | 34K | 44K | 396K | 216K | BNLJ | |
| 10 | 1 | 344K | 436K | 558K | 378K | BNLJ | |
| 10 | 10 | 344K | 436K | 867K | 540K | BNLJ | |
| 10,000 (large) |
1 | 1 | 18K | 20K | 54K | 54K | BNLJ |
| 1 | 10 | 18K | 20K | 216K | 216K | BNLJ | |
| 10 | 1 | 180K | 196K | 378K | 378K | BNLJ | |
| 10 | 10 | 180K | 196K | 540K | 540K | BNLJ |
Key Observations
Buffer Size Impact
- Small B (10-100): HPJ dominates
- Medium B (1,000): BNLJ becomes competitive
- Large B (10,000): BNLJ wins.
Read/Write Cost Impact
- High C_w: Penalizes algorithms with heavy writes (SMJ sorting phase)
- High C_r: All costs scale up proportionally
- Asymmetric costs: Can change winner even with same B
Practical Implications
-
Cloud Environments: With variable RAM allocation per query, HPJ often wins due to its consistent performance
-
SSDs vs HDDs: SSDs have C_r ≈ C_w, while HDDs often have C_w > C_r
-
Memory-Rich Systems: BNLJ becomes optimal when B is large enough to hold significant chunks
-
Multi-Query Workloads: Conservative B estimates favor HPJ's predictable performance
Key Takeaways
-
Just 5 Algorithms power most database operations
-
Memory (B) determines which algorithm is best
-
No memorization needed: Formulas follow intuition