Case Study 3.3: Tracking High-Volume Activity with LSM Trees

Case Study 3.3 Reading Time: 8 mins

How does Spotify track billions of "Likes" and "Plays"?

Goal: Design data structures for high-volume incremental updates.

When a Spotify user interacts with a song, that action needs to be logged instantly. With a user base of 700 million, the challenge isn't just technical—it's a relentless barrage of write operations that would overwhelm traditional systems.

The Bottleneck: Random Writes in B+Trees

In a B+Tree, each update requires a disk seek to find the right leaf node. Imagine having to update the play count for any of 100 million songs. Each update means searching through the tree, loading the right page, and changing the value. Multiply this by millions of updates per second, and you hit a wall with standard hardware.


The Solution: The "No-Read" Write

Enter the LSM Tree, the workhorse behind systems like Google's BigTable and Apache Cassandra. These systems dodge the "Read-before-Write" penalty by eliminating the need for a disk search during a write.

How it Works: RAM, Flush, and Merge

  1. Buffered Writes (MemTable): Each "Like" is captured in a high-speed RAM buffer ($O(1)$). No need to check existing totals; just log the increment.

  2. Sequential Flushes (SSTables): Once RAM fills up, data is sorted and written to disk as a Sorted String Table (SSTable). Sorting in RAM allows for sequential disk writes, which are much faster than random ones.

  3. Merge on Read: To get total likes, the system sums fragments from the MemTable and all SSTables on disk.

Visual Reference: Check out the "Production Path" and "Consumption Path" diagrams in the LSM Tree Indexes section for a detailed view.


Beyond Counters: Customizing Merge Behavior

LSM trees aren't limited to counting. They can be tailored for various update types.

Takeaway: By treating the database as a Log (append-only) rather than a Ledger (overwriting), LSM trees keep write performance steady, regardless of data size. Whether you manage 1,000 songs or 100 billion, the write speed stays constant.