B+Tree Indexes
The Workhorse of Database Indexing
B+Trees are the default index type in most databases. They handle both point queries and range scans efficiently.
Prerequisites: Read Index Fundamentals to understand how indexes store pointers, not data.
O(log N)
Lookup Time
Sorted Order
Enables ranges
3-4 Levels
For billions of rows
Sequential IO
For range scans
Visual: Physical Layout with B+Tree Index
Query Example: Finding Songs in a Date Range
Query: Find all songs released between days 12000-14000
Range Query Performance
Why B+Trees Excel at Ranges
-- Find songs released between March-July 2000 (around days 12000-14000 since 1970)
SELECT * FROM songs
WHERE release_date_since_1970 BETWEEN 12000 AND 14000
-- Find songs released in the 1990s (days 7300-11000 since 1970)
SELECT * FROM songs
WHERE release_date_since_1970 BETWEEN 7300 AND 11000
Step 1
Find start: 3 IOs
Step 2
Scan leaves sequentially
Step 3
Stop at end key
Result
Mostly sequential IO!
B+Tree Operations (Pseudocode)
// Build B+Tree from Songs table on release_date_since_1970
BTreeIndex(songs_table, release_date_since_1970):
// First: sort songs by release date (days since 1970)
sorted_songs = sort(songs_table, by=release_date_since_1970)
// Build leaf level from sorted data
leaves = []
for chunk of FANOUT listens in sorted_listens:
leaf = create_leaf_node(chunk) // Stores (days, page#)
link leaf to next leaf // For range scans!
leaves.append(leaf)
// Build internal nodes bottom-up
current_level = leaves
while len(current_level) > 1:
next_level = []
for chunk of FANOUT nodes in current_level:
parent = create_internal_node(chunk)
next_level.append(parent)
current_level = next_level
return current_level[0] // Root
// Point Lookup - Find song released on specific day
Lookup(root, day_12500):
node = root
// Walk down the tree (12500 > 3187 and < 14821, go middle)
while not node.is_leaf():
node = node.find_child(12500) // Binary search, 1 IO
// Search in leaf - returns page numbers
return node.find(12500) // Returns: [Page 5]
// Range Query - Find songs released between days 12000-14000
Range(root, start_day=12000, end_day=14000):
// Find starting leaf
leaf = find_leaf(root, 12000) // 3 IOs to reach leaf
pages_to_read = set()
// Scan leaves sequentially
while leaf and leaf.min_day <= 14000:
for (day, page_num) in leaf:
if 12000 <= day <= 14000:
pages_to_read.add(page_num)
leaf = leaf.next // Follow pointers - sequential!
// Now read the actual data pages
return read_pages(pages_to_read) // Pages 4, 5, 6
The Magic of Fanout
How Fanout is Calculated
📐 B+Tree Design Formula
Fanout = ⌊Page Size / (Key Size + Pointer Size)⌋ Height = ⌈log_fanout(Number of Keys)⌉ Simple Examples: • 4KB page, 4-byte key, 4-byte pointer Fanout = ⌊4096 / (4 + 4)⌋ = 512 • 8KB page, 4-byte key, 4-byte pointer Fanout = ⌊8192 / (4 + 4)⌋ = 1024 ≈ 1000 • 4KB page, 20-byte key, 8-byte pointer Fanout = ⌊4096 / (20 + 8)⌋ = 146
Real-World Examples: Impact of Page Size and Key Size
| Page Size = 64 KB (0.0625 MB) | |||||
|---|---|---|---|---|---|
| Keys | Key Size | Index Size | Fanout | Height | IOs per Lookup |
| 1M | 8 bytes | 15 MB | 4,096 | 2 | 2 reads |
| 1 Trillion | 8 bytes | 14.9 TB | 4,096 | 4 | 4 reads |
| 1M | 1 KB | 969 MB | 63 | 4 | 4 reads |
| 1 Trillion | 1 KB | 969 TB | 63 | 7 | 7 reads |
| Page Size = 64 MB (1000× larger!) | |||||
|---|---|---|---|---|---|
| Keys | Key Size | Index Size | Fanout | Height | IOs per Lookup |
| 1M | 8 bytes | 15 MB | 4,194,304 | 1 | 1 read! |
| 1 Trillion | 8 bytes | 14.9 TB | 4,194,304 | 2 | 2 reads! |
| 1M | 1 KB | 961 MB | 65,027 | 2 | 2 reads! |
| 1 Trillion | 1 KB | 961 TB | 65,027 | 3 ⚡ | 3 reads! |
Key Insight: Page Size Changes Everything!
- 64 KB pages: 1 trillion keys with 1KB search keys = 7 IOs (slow!)
- 64 MB pages: Same scenario = only 3 IOs (2× faster!)
- Trade-off: Larger pages = more memory per node, but shallower trees
- Real DBs: PostgreSQL uses 8KB, MySQL InnoDB 16KB, Cloud analytics (BigQuery/Snowflake) use MB-sized blocks for columnar data
Key Takeaways
-
Fanout is the secret - F=1000 means log₁₀₀₀(N) levels, not log₂(N)
-
Shallow trees - Billion rows = only 3-4 IOs
-
Range queries shine - Sorted leaves + pointers = sequential scan
-
Column indexes save space - Index just what you query, not entire rows
-
N vs n tradeoff - Store n entries but only N pages on disk