SSTables: Sorted String Tables
Immutable Files That Power Modern Databases
Core idea: Never update files in-place. Instead, write new sorted files and merge them later.
Visual: Anatomy of an SSTable
Pseudocode
// Create SSTable
Flush(memtable):
    sorted = sort(memtable)     // Sort by key
    write(bloom_filter)         // For quick "not exists"
    write(sparse_index)         // Every 100th key → offset
    write(sorted_data)          // The actual key-values
    write(footer)               // min_key, max_key, size
// Read from SSTable
Read(sstable, key):
    if not bloom_filter.contains(key):
        return None                       // Quick reject
    offset = binary_search(index, key)    // Last indexed key ≤ key
    block = read_block(offset)
    return linear_search(block, key)      // None if absent (Bloom false positive)
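The Flush and Read steps above can be sketched in Python. This is a minimal in-memory illustration, not a real on-disk format: the names `BloomFilter`, `SSTable`, `flush`, `read`, and `index_interval` are hypothetical, the "offsets" are list positions rather than byte offsets, and the Bloom filter sizing is arbitrary.

```python
import bisect
import hashlib
from dataclasses import dataclass


class BloomFilter:
    """Toy Bloom filter: a few hash positions per key in a fixed-size bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


@dataclass(frozen=True)
class SSTable:
    entries: tuple        # (key, value) pairs sorted by key
    index_keys: tuple     # every index_interval-th key
    index_offsets: tuple  # position of each indexed key in entries
    index_interval: int
    bloom: BloomFilter
    footer: dict          # min_key, max_key, size


def flush(memtable, index_interval=100):
    """Turn a dict memtable into a sorted, immutable SSTable (in memory)."""
    entries = tuple(sorted(memtable.items()))
    bloom = BloomFilter()
    for key, _ in entries:
        bloom.add(key)
    offsets = tuple(range(0, len(entries), index_interval))
    index_keys = tuple(entries[i][0] for i in offsets)
    footer = {
        "min_key": entries[0][0] if entries else None,
        "max_key": entries[-1][0] if entries else None,
        "size": len(entries),
    }
    return SSTable(entries, index_keys, offsets, index_interval, bloom, footer)


def read(sstable, key):
    """Look up one key: Bloom filter, then sparse index, then scan one block."""
    if not sstable.bloom.might_contain(key):
        return None  # quick reject
    # Find the last indexed key that is <= the search key.
    slot = bisect.bisect_right(sstable.index_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first key in the table
    start = sstable.index_offsets[slot]
    block = sstable.entries[start:start + sstable.index_interval]
    for k, v in block:  # linear search within the block
        if k == key:
            return v
    return None  # Bloom filter false positive


memtable = {f"key{i:04d}": i for i in range(250)}
table = flush(memtable, index_interval=100)
```

With 250 keys and an interval of 100, the sparse index holds only three entries, so a lookup touches at most one block of 100 entries after the binary search.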
// Merge SSTables (Compaction)
Compact(sstables):
    // K-way merge, keep newest version
    merged = k_way_merge(sstables)
    return write_new_sstable(merged)
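One way to sketch the k-way merge in Python uses the standard library's `heapq.merge`. The function name `compact` and the input convention (tables passed newest first, each a list of `(key, value)` pairs sorted by key with no duplicate keys) are assumptions made for this example. Deletion markers (tombstones) are not handled.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def compact(sstables):
    """Merge sorted tables into one sorted list, keeping the newest value per key.

    sstables must be ordered newest first.
    """
    # Tag each entry with its table's age (0 = newest) so that, for equal
    # keys, the newest version comes first in the merged stream.
    tagged = (
        [(key, age, value) for key, value in table]
        for age, table in enumerate(sstables)
    )
    merged = heapq.merge(*tagged, key=itemgetter(0, 1))

    result = []
    for key, versions in groupby(merged, key=itemgetter(0)):
        _, _, value = next(versions)  # first version in the group is the newest
        result.append((key, value))
    return result


newer = [("a", 10), ("c", 30)]
older = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
merged_table = compact([newer, older])
```

Because every input is already sorted, the merge reads each table sequentially and never needs to hold more than one pending entry per table in the heap.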
Real-World Systems Using SSTables
Distributed SQL Databases
Google Spanner
- Globally distributed SQL database
- Storage layer originally built on Bigtable's SSTable format
- Combines ACID transactions with SSTables
- Paxos for consensus, SSTables for storage
CockroachDB
- Distributed SQL built on an LSM storage engine (originally RocksDB, now Pebble) that stores data as SSTables
- ACID transactions over LSM storage
- Range partitioning with SSTables
TiDB
- MySQL-compatible distributed SQL
- TiKV storage engine uses RocksDB/SSTables
- Hybrid transactional/analytical processing
NoSQL Systems
Apache Cassandra
- Wide-column store built on SSTables
- Each column family = collection of SSTables
- Compaction strategies: Size-tiered, Leveled, Time-window
HBase (Hadoop Database)
- Uses SSTable-style files (called HFiles)
- Stores HFiles in HDFS for distribution
- Minor and major compaction
RocksDB (Facebook)
- Embedded key-value store
- LSM tree with SSTables
- Powers many systems (TiKV/TiDB, Kafka Streams state stores, earlier versions of CockroachDB)
Google Bigtable
- Original SSTable implementation
- Tablet = range of rows stored in SSTables
- GFS/Colossus for storage
Key Takeaway
SSTables turn random writes into sequential writes by batching updates in memory and flushing sorted, immutable files to disk. This simple idea powers Spanner, CockroachDB, Cassandra, and many other modern databases.
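The write path described in the takeaway can be sketched as a toy Python class. `TinyLSM`, `memtable_limit`, and `runs` are hypothetical names, and a tuple stands in for each immutable file on disk. The point is only that `put` never modifies an existing run.

```python
import bisect


class TinyLSM:
    """Toy write path: buffer writes in memory, flush sorted immutable runs."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.runs = []  # oldest run first; each run is a sorted tuple of (key, value)

    def put(self, key, value):
        self.memtable[key] = value  # random writes land in memory only
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # One sequential write of a sorted run; earlier runs are left untouched.
        self.runs.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run first
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)  # triggers the first flush
db.put("a", 3)
db.put("c", 4)  # triggers the second flush
```

After these four writes there are two runs. The old value of `"a"` still sits in the first run, and a read finds the newer value because it checks the newest run first. Compaction would later drop the stale version.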