B+Tree Indexes

The Workhorse of Database Indexing

B+Trees are the default index type in most databases. They handle both point queries and range scans efficiently.

Prerequisites: Read Index Fundamentals to understand how indexes store pointers, not data.

O(log N)
Lookup Time
Sorted Order
Enables ranges
3-4 Levels
For billions of rows
Sequential IO
For range scans

Visual: Physical Layout with B+Tree Index

B+Tree Index on Songs Table: release_date_since_1970 (integer column: days since Jan 1, 1970) Data Pages (64MB each) Page 0: 1: Evermore (4821) 2: Willow (4953) 3: Shape of You (3187) Page 1: 4: Photograph (3245) 5: Shivers (3367) 6: Yesterday (1523) Page 2: 7: Yellow Submarine (1587) 8: Hey Jude (1612) 9: Bad Blood (1899) Page 3: 10: Flowers (11234) 11: Unholy (11567) 12: As It Was (12890) 13: Heat Waves (13456) 14: Levitating (13789) Page 4: 15: Blinding Lights (13980) 16: Good 4 U (14456) 17: Drivers License (14789) B+Tree Index on Songs(release_date_since_1970) Root (1 IO) 5000 14500 25000 Internal (1 IO) 1900 3000 3500 Internal (1 IO) 11234 13789 14456 Internal (1 IO) 15000 17000 20000 Leaf Nodes (contain pointers to data pages) Leaf (1 IO) 1523: →Pg 1 1587: →Pg 2 1612: →Pg 2 ... Leaf 1900: →Pg 2 1923: →Pg 2 1967: →Pg 3 ... Leaf ... Leaf 11234: →Pg 3 11567: →Pg 3 12890: →Pg 3 13456: →Pg 3 Leaf 13789: →Pg 3 13980: →Pg 4 14456: →Pg 4 ... ... more leaves ... ← Leaf nodes linked for sequential range scans → < 5000 5000-14499 14500-24999 ≥ 25000 < 1900 1900-2999 11234-13788 13789-14455 20K-24K Leaf nodes point to data pages This Example: Fanout = 4 (4 child pointers per node) • Each node has 3 keys + 4 pointers • Height = log₄(# of keys) Real DBs: Fanout = 1000+ • 64KB pages → huge fanout!

Query Example: Finding Songs in a Date Range

Query: Find all songs released between days 12000-14000

Query Path: SELECT * WHERE release_date BETWEEN 12000 AND 14000 Root 5000 14500 25000 Step 1: Check root < 5000 (not visited) Internal 11234 13789 14456 Step 2 14500-24999 (not visited) < 11234 (not visited) Leaf 3 11234→Pg 3 12890→Pg 3 13456→Pg 3 Leaf 4 13789→Pg 3 13980→Pg 4 14456→Pg 4 12000 > 5000 and < 14500 12000 ≥ 11234 continue to 14000 Step 3: Sequential scan Page 3 (Retrieved) 10-11: Dropped 12-14: Matching Page 4 (Retrieved) 15: Blinding Lights (13980) 16-17: Not in range Query Performance ✓ 3 Index IOs (Root → Internal → Leaves) ✓ 2 Data Page IOs (Page 3 and Page 4) ✓ Total: ~5 IOs vs 160,000 for full scan Speedup: 32,000x faster!

Range Query Performance

Why B+Trees Excel at Ranges

-- Find songs released between March-July 2000 (around days 12000-14000 since 1970)
SELECT * FROM songs 
WHERE release_date_since_1970 BETWEEN 12000 AND 14000

-- Find songs released in the 1990s (days 7300-11000 since 1970)
SELECT * FROM songs
WHERE release_date_since_1970 BETWEEN 7300 AND 11000
Step 1
Find start: 3 IOs
Step 2
Scan leaves sequentially
Step 3
Stop at end key
Result
Mostly sequential IO!

B+Tree Operations (Pseudocode)

// Build B+Tree from Songs table on release_date_since_1970
BTreeIndex(songs_table, release_date_since_1970):
    // First: sort songs by release date (days since 1970)
    sorted_songs = sort(songs_table, by=release_date_since_1970)

    // Build leaf level from sorted data
    leaves = []
    for chunk of FANOUT listens in sorted_listens:
        leaf = create_leaf_node(chunk)  // Stores (days, page#)
        link leaf to next leaf           // For range scans!
        leaves.append(leaf)

    // Build internal nodes bottom-up
    current_level = leaves
    while len(current_level) > 1:
        next_level = []
        for chunk of FANOUT nodes in current_level:
            parent = create_internal_node(chunk)
            next_level.append(parent)
        current_level = next_level

    return current_level[0]    // Root

// Point Lookup - Find song released on specific day
Lookup(root, day_12500):
    node = root

    // Walk down the tree (12500 > 3187 and < 14821, go middle)
    while not node.is_leaf():
        node = node.find_child(12500)    // Binary search, 1 IO

    // Search in leaf - returns page numbers
    return node.find(12500)    // Returns: [Page 5]

// Range Query - Find songs released between days 12000-14000
Range(root, start_day=12000, end_day=14000):
    // Find starting leaf  
    leaf = find_leaf(root, 12000)    // 3 IOs to reach leaf
    pages_to_read = set()

    // Scan leaves sequentially
    while leaf and leaf.min_day <= 14000:
        for (day, page_num) in leaf:
            if 12000 <= day <= 14000:
                pages_to_read.add(page_num)
        leaf = leaf.next    // Follow pointers - sequential!

    // Now read the actual data pages
    return read_pages(pages_to_read)  // Pages 4, 5, 6

The Magic of Fanout

How Fanout is Calculated

📐 B+Tree Design Formula

Fanout = ⌊Page Size / (Key Size + Pointer Size)⌋
Height = ⌈log_fanout(Number of Keys)⌉

Simple Examples:
• 4KB page, 4-byte key, 4-byte pointer
  Fanout = ⌊4096 / (4 + 4)⌋ = 512

• 8KB page, 4-byte key, 4-byte pointer  
  Fanout = ⌊8192 / (4 + 4)⌋ = 1024 ≈ 10004KB page, 20-byte key, 8-byte pointer
  Fanout = ⌊4096 / (20 + 8)⌋ = 146

Real-World Examples: Impact of Page Size and Key Size

Page Size = 64 KB (0.0625 MB)
Keys Key Size Index Size Fanout Height IOs per Lookup
1M 8 bytes 15 MB 4,096 2 2 reads
1 Trillion 8 bytes 14.9 TB 4,096 4 4 reads
1M 1 KB 969 MB 63 4 4 reads
1 Trillion 1 KB 969 TB 63 7 7 reads
Page Size = 64 MB (1000× larger!)
Keys Key Size Index Size Fanout Height IOs per Lookup
1M 8 bytes 15 MB 4,194,304 1 1 read!
1 Trillion 8 bytes 14.9 TB 4,194,304 2 2 reads!
1M 1 KB 961 MB 65,027 2 2 reads!
1 Trillion 1 KB 961 TB 65,027 3 ⚡ 3 reads!

Key Insight: Page Size Changes Everything!

  • 64 KB pages: 1 trillion keys with 1KB search keys = 7 IOs (slow!)
  • 64 MB pages: Same scenario = only 3 IOs (2× faster!)
  • Trade-off: Larger pages = more memory per node, but shallower trees
  • Real DBs: PostgreSQL uses 8KB, MySQL InnoDB 16KB, Cloud analytics (BigQuery/Snowflake) use MB-sized blocks for columnar data

Key Takeaways

  1. Fanout is the secret - F=1000 means log₁₀₀₀(N) levels, not log₂(N)

  2. Shallow trees - Billion rows = only 3-4 IOs

  3. Range queries shine - Sorted leaves + pointers = sequential scan

  4. Column indexes save space - Index just what you query, not entire rows

  5. N vs n tradeoff - Store n entries but only N pages on disk