SSTables: Sorted String Tables

Immutable Files That Power Modern Databases

Core idea: Never update files in-place. Instead, write new sorted files and merge them later.

Immutable
Never modified after write
Sorted by Key
Enables efficient merging
Sequential Write
Created by memory flush
Self-Contained
Index + data in one file

Visual: Anatomy of an SSTable

SSTable: Structure and Operations 1. Single SSTable File Structure Bloom Filter Index Block Data Blocks (Sorted) Bloom Filter: • Quick "not exists" • 1% false positive • Saves disk reads Index: • Key → Offset • Binary search • Sparse entries Data: • Key-value pairs • Sorted by key • Compressed blocks 2. Multiple SSTables (Overlapping Keys) sstable_3.db (newest) [user_100:v3, user_150:v1, user_200:v2, .., user_400:v1] 10MB, 1hr ago sstable_2.db [user_050:v1, user_100:v2, user_200:v1,.., user_350:v1] 10MB, 2hr ago sstable_1.db (oldest) [user_001:v1, user_100:v1, user_150:v1, .. user_600:v1] 10MB, 3hr ago

Pseudocode

// Create SSTable
Flush(memtable):
    sorted = sort(memtable)        // Sort by key
    write(bloom_filter)             // For quick "not exists"
    write(sparse_index)             // Every 100th key → offset
    write(sorted_data)              // The actual key-values
    write(footer)                   // min_key, max_key, size

// Read from SSTable
Read(sstable, key):
    if not bloom_filter.contains(key):
        return None                 // Quick reject

    offset = binary_search(index, key)
    block = read_block(offset)
    return linear_search(block, key)

// Merge SSTables (Compaction)
Compact(sstables):
    // K-way merge, keep newest version
    merged = k_way_merge(sstables)
    return write_new_sstable(merged)

Key Takeaway