IO Cost Summary: Choosing Your Index
Visual: Index Types Head-to-Head
IO Cost Comparison

Understanding n vs N: The Dictionary Analogy
Webster's Dictionary (Sorted)
n = 171,476 words
From "aardvark" to "zyzzyva"
N = 1,000 pages
Physical book pages
Index size: Just N entries
First word per page + page number
Example: Page 42: "bicycleβbind"
To find "binomial": Check index, go to page 43, scan
RandomWebster's Dictionary (Unsorted)
n = 171,476 words
Randomly scattered across pages
N = 1,000 pages
Same physical book
Index size: All n entries
Every word + its page location
Example: "aardvark" p.847, "apple" p.203
Must track every single word individually
The Key Insight
Sorted data compresses beautifully (N << n). Unsorted data requires tracking everything.
Sorted data compresses beautifully (N << n). Unsorted data requires tracking everything.
For detailed formulas and specifications, see the IO Reference Guide.
Decision Flowchart
Need range queries?
β
No βββββββββββββββ Yes
β β
Writes > 10Γ Reads? B+Tree β
β (PostgreSQL)
No βββββββββββ Yes
β β
Hash β LSM β
(Redis) (Cassandra)
Database Systems: Who Uses What?
PostgreSQL
B+Tree (balanced workloads)
Redis
Hash (cache, sessions)
Cassandra
LSM (write-heavy)
MongoDB
B+Tree primary + LSM for oplog
Hybrids
Real systems use multiple indexes for different jobs:
B+Tree for primary data β’ LSM for write-ahead logs β’ Hash for caches
Pick the right tool for each workload, not one index for everything!