Case Study 1A.5: Key-Value Stores
When milliseconds aren't fast enough. Using Key-Value stores (like Redis or DynamoDB) to cache hot data and protect your primary databases from UI-driven traffic spikes.
Why does Spotify store songs and lyrics in a key-value database?
Goal: Learn when to use key-value databases
Spotify's architecture isn't just a random collection of tech buzzwords. It's a calculated strategy to handle different types of data with precision.
The Data Spectrum: Structured vs. Blobs
1. The Structured Core (Tabular) Imagine a world of neatly typed columns: integers, timestamps, booleans.
- The Reality: SQL engines are built to slice and dice billions of these tiny rows. This is your core product logic—tracking who listened to what song, and when.
2. The Massive "Blobs" (Unstructured & Semi-structured) Now think of a 10 GB video file or a sprawling JSON blob with a podcast's details.
- The Reality: SQL engines excel at scanning rows in memory; huge blobs disrupt that efficiency. It's like cramming a video into a spreadsheet cell. Better to store the video elsewhere and just track its location.
This is why Spotify divides its architecture.
The Spotify Strategy: Key-Value Stores
For the unstructured and semi-structured data, Spotify turns to Key-Value databases like Google's Bigtable or AWS's DynamoDB.
How it works: Picture a vast locker room. Each locker has a unique number (the Key) and inside is the item (the Value).
- Example: Key = song_123, Value = audio_file.mp3.
- Retrieval: Hit play, and Spotify uses the song's ID to swiftly fetch the audio from the key-value store and stream it to you.
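To make the locker analogy concrete, here is a minimal in-memory sketch. The `KeyValueStore` class and its `put`/`get` methods are hypothetical names chosen for illustration; they are not the API of Bigtable, DynamoDB, or anything Spotify runs.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store: one opaque value per unique key."""

    def __init__(self) -> None:
        # Each "locker" is a dict entry: key -> raw bytes.
        self._lockers: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Writing to an existing key overwrites the previous value.
        self._lockers[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # The only query supported: given a key, return its value (or None).
        return self._lockers.get(key)


store = KeyValueStore()
store.put("song_123", b"<bytes of audio_file.mp3>")  # placeholder payload

audio = store.get("song_123")    # found: the stored bytes
missing = store.get("song_999")  # unknown key: None
```

Note that the store never inspects the value. That is what lets it hold audio, JSON, or anything else without a schema.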
Alternatively, the actual file might sit in a distributed object store (like Amazon S3). The key-value database then acts as an index (Key = song_id, Value = S3-location). This setup scales to billions of files while keeping performance sharp.
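The index pattern can be sketched with two plain dictionaries standing in for the real systems. The bucket name, the path layout, and the `fetch_audio` helper are all assumptions made up for this example.

```python
from typing import Dict

# Stand-in for blob storage such as Amazon S3: location -> raw bytes.
# (Hypothetical bucket and path; a real system would call a storage API.)
object_store: Dict[str, bytes] = {
    "s3://example-audio-bucket/tracks/song_123.mp3": b"<audio bytes>",
}

# Stand-in for the key-value index: song_id -> where the file lives.
location_index: Dict[str, str] = {
    "song_123": "s3://example-audio-bucket/tracks/song_123.mp3",
}


def fetch_audio(song_id: str) -> bytes:
    """Two-step read: small index lookup, then the large blob read."""
    location = location_index[song_id]  # tiny value, fast lookup
    return object_store[location]       # big payload comes from blob storage


audio = fetch_audio("song_123")
```

The design choice here is that the key-value store only holds short location strings, so it stays small and fast even as the media files themselves grow.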
SQL vs. Key-Value: The Trade-offs
| Feature | SQL Database (Conventional) | Key-Value Store |
|---|---|---|
| Querying | Rich capability: Joins, Aggregations, Complex filters. | Simple: Given a key, it returns the value. |
| Schema | Rigid: Needs a predefined structure. | Flexible: No predefined schema required. |
| Integrity | High: Enforced constraints & transactions. | Lower: The application typically has to enforce integrity itself. |
| Best For | Structured data with complex relationships. | Massive unstructured/semi-structured data (like lyrics or media). |
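The querying row of the table can be illustrated with a small contrast. This sketch uses Python's built-in `sqlite3` for the SQL side and a plain dict for the key-value side; the `plays` table and its columns are hypothetical, not Spotify's schema.

```python
import sqlite3

# SQL side: typed rows plus a declarative aggregation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plays ("
    "user_id TEXT NOT NULL, song_id TEXT NOT NULL, played_at TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO plays VALUES (?, ?, ?)",
    [
        ("u1", "song_123", "2026-01-01T10:00:00"),
        ("u2", "song_123", "2026-01-01T10:05:00"),
        ("u1", "song_456", "2026-01-01T10:07:00"),
    ],
)
# "How many times was each song played?" is a single query.
play_counts = conn.execute(
    "SELECT song_id, COUNT(*) FROM plays GROUP BY song_id ORDER BY song_id"
).fetchall()

# Key-value side: the only operation is lookup by key.
kv = {"song_123": b"<audio bytes>", "song_456": b"<audio bytes>"}
value = kv["song_123"]
# There is no built-in way to ask "which songs were played most";
# the application would have to scan and count on its own.
```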
Takeaway: Storing each line of a song's lyrics in separate SQL columns isn't practical. Instead, store the lyrics as one semi-structured blob in a Key-Value store: {key: song_id, value: lyrics_blob}. (Or in a JSON column, such as PostgreSQL's JSONB, in a modern SQL database.)
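A minimal sketch of the {key: song_id, value: lyrics_blob} idea, assuming the blob is JSON. The field names (`language`, `lines`, `t`, `text`) and the placeholder lyrics are invented for illustration.

```python
import json

# Serialize the whole lyrics document into one opaque string value.
lyrics_blob = json.dumps({
    "language": "en",
    "lines": [
        {"t": 0.0, "text": "first line (placeholder)"},
        {"t": 4.2, "text": "second line (placeholder)"},
    ],
})

# The store sees only key -> string; it knows nothing about the structure.
kv_store = {"song_123": lyrics_blob}

# The application parses the blob after fetching it by key.
lyrics = json.loads(kv_store["song_123"])
first_line = lyrics["lines"][0]["text"]
```

Because the structure lives inside the value, a song with 10 lines and a song with 200 lines are stored the same way, with no schema change.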
The 2026 Perspective: Converging Worlds
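Modern Hybridization: The distinctions above are the traditional ones. Spotify's scale justifies dedicated Key-Value stores, but for most companies these lines blur in 2026. For example, a modern SQL database can hold semi-structured JSON alongside its typed columns.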
Ready to see how to protect all this data? Jump directly into Data Security.