B+Tree Indexes

Concept. A B+Tree index keeps keys sorted in a balanced tree, supporting O(log N) lookups and range scans (WHERE rating BETWEEN 4.0 AND 5.0) by walking the leaf pages in order.

Intuition. When you query "find Listens with rating between 4.0 and 5.0," a B+Tree on rating walks down the tree to the first 4.0, then reads the leaf pages in order until it passes 5.0 — no need to revisit the tree for each value.

The Workhorse of Database Indexing

B+Trees are the go-to index type for most databases, adept at handling both point queries and range scans with ease.

Prerequisites: Familiarize yourself with Index Fundamentals to grasp how indexes store pointers, not data.

O(log N)

Lookup Time

Sorted Order

Enables ranges

3-4 Levels

For billions of rows

Sequential IO

For range scans

Physical Layout with B+Tree Index

Query Example: Finding Songs in a Date Range

Query: Find all songs released between days 12000-14000

Range Query Performance

Why B+Trees Excel at Ranges

-- Find songs released between March-July 2000 (around days 12000-14000 since 1970)
SELECT * FROM songs 
WHERE release_date_since_1970 BETWEEN 12000 AND 14000

-- Find songs released in the 1990s (days 7300-11000 since 1970)
SELECT * FROM songs
WHERE release_date_since_1970 BETWEEN 7300 AND 11000

Step 1

Find start: 3 IOs

Step 2

Scan leaves sequentially

Step 3

Stop at end key

Result

Mostly sequential IO!

B+Tree Operations (Optional Pseudocode)

// Build B+Tree from Songs table on release_date_since_1970
BTreeIndex(songs_table, release_date_since_1970):
    // First: sort songs by release date (days since 1970)
    sorted_songs = sort(songs_table, by=release_date_since_1970)

    // Build leaf level from sorted data
    leaves = []
    for chunk of FANOUT listens in sorted_listens:
        leaf = create_leaf_node(chunk)  // Stores (days, page#)
        link leaf to next leaf           // For range scans!
        leaves.append(leaf)

    // Build internal nodes bottom-up
    current_level = leaves
    while len(current_level) > 1:
        next_level = []
        for chunk of FANOUT nodes in current_level:
            parent = create_internal_node(chunk)
            next_level.append(parent)
        current_level = next_level

    return current_level[0]    // Root

// Point Lookup - Find song released on specific day
Lookup(root, day_12500):
    node = root

    // Walk down the tree (12500 > 3187 and < 14821, go middle)
    while not node.is_leaf():
        node = node.find_child(12500)    // Binary search, 1 IO

    // Search in leaf - returns page numbers
    return node.find(12500)    // Returns: [Page 5]

// Range Query - Find songs released between days 12000-14000
Range(root, start_day=12000, end_day=14000):
    // Find starting leaf  
    leaf = find_leaf(root, 12000)    // 3 IOs to reach leaf
    pages_to_read = set()

    // Scan leaves sequentially
    while leaf and leaf.min_day <= 14000:
        for (day, page_num) in leaf:
            if 12000 <= day <= 14000:
                pages_to_read.add(page_num)
        leaf = leaf.next    // Follow pointers - sequential!

    // Now read the actual data pages
    return read_pages(pages_to_read)  // Pages 4, 5, 6

The Magic of Fanout

How Fanout is Calculated

B+Tree Design Formula

Fanout = ⌊Page Size / (Key Size + Pointer Size)⌋
Height = ⌈log_fanout(Number of Keys)⌉

Simple Examples:
• 4KB page, 4-byte key, 4-byte pointer
  Fanout = ⌊4096 / (4 + 4)⌋ = 512

• 8KB page, 4-byte key, 4-byte pointer  
  Fanout = ⌊8192 / (4 + 4)⌋ = 1024 ≈ 1000

• 4KB page, 20-byte key, 8-byte pointer
  Fanout = ⌊4096 / (20 + 8)⌋ = 146

Real-World Examples: Impact of Page Size and Key Size

Page Size = 64 KB (0.0625 MB)
Keys	Key Size	Index Size	Fanout	Height	IOs per Lookup
1M	8 bytes	15 MB	4,096	2	2 reads
1 Trillion	8 bytes	14.9 TB	4,096	4	4 reads
1M	1 KB	969 MB	63	4	4 reads
1 Trillion	1 KB	969 TB	63	7	7 reads

Page Size = 64 MB (1000× larger!)
Keys	Key Size	Index Size	Fanout	Height	IOs per Lookup
1M	8 bytes	15 MB	4,194,304	1	1 read!
1 Trillion	8 bytes	14.9 TB	4,194,304	2	2 reads!
1M	1 KB	961 MB	65,027	2	2 reads!
1 Trillion	1 KB	961 TB	65,027	3 ⚡	3 reads!

Key Insight: Page Size Changes Everything!

64 KB pages: 1 trillion keys with 1KB search keys = 7 IOs (slow!)
64 MB pages: Same scenario = only 3 IOs (2× faster!)
Trade-off: Larger pages = more memory per node, but shallower trees
Real DBs: PostgreSQL uses 8KB, MySQL InnoDB 16KB, Cloud analytics (BigQuery/Snowflake) use MB-sized blocks for columnar data

Key Takeaways

Fanout is the secret - F=1000 means log₁₀₀₀(N) levels, not log₂(N)
Shallow trees - Billion rows = only 3-4 IOs
Range queries shine - Sorted leaves + pointers = sequential scan