Problem Solving: Saving $550,000 on Queries
Concept. A single unoptimized query running 100 times a day against a multi-TB table can cost a company $550,000/year in cloud-warehouse charges — the fully optimized version of the same query (index + columnar storage + compression) costs about $400/year.
Intuition. When the optimizer executes the trending-genres join as a Block Nested Loop Join against the 10-billion-row Listens table, every run performs 320,400 page IOs. A Hash Partition Join drops that to 49,200 IOs — 6.5× cheaper per run — and an index on the date filter cuts it to roughly 554, which is where the real savings begin.
The Bill
Goal: Understand how Indexing, Columnar Storage, and Compression tackle a costly "Query Disaster."
An unoptimized "Trending Genres" aggregation query — computing trends over the most recent week of listens — executes 100 times daily at ~9 hours per run, incurring $1,500/day ($1,500 × 365 ≈ $550,000/year) in AWS compute costs.
System Archeology: The Schema
To optimize this workload, we must analyze the physical schema layout.
| Component | Stats | Disk Footprint (Standard Pages) |
|---|---|---|
| Songs Table | ~50M songs | 400 pages (25 GB) |
| Listens Table | ~10B total listens | 16,000 pages (1 TB) |
| Weekly Scope | Oct 21-27 Filter | 150 pages (~1% of data) |
| Memory (B) | Buffer Pool | 20 pages (1.28 GB) |
We use a standard page size of 64MB for all calculations in this study.
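A quick sanity check (a Python sketch; the page counts come from the table above) confirms these footprints:

```python
PAGE_MB = 64  # standard page size used for every calculation in this study

# pages -> size in decimal GB (1000 MB per GB, matching the table's figures)
def footprint_gb(pages):
    return pages * PAGE_MB / 1000

print(footprint_gb(400))     # Songs:  25.6 GB  (~25 GB)
print(footprint_gb(16_000))  # Listens: 1024 GB (~1 TB)
print(footprint_gb(20))      # Buffer pool: 1.28 GB
```

The Songs figure rounds to the table's 25 GB; the other two match exactly.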
Problem 0: The Baseline Investigation
Problem: The weekly trending report is taking 9 hours. Before optimizing, we need to calculate the current "Baseline" cost. What is the IO cost of a Hash Partition Join (HPJ) on these two tables? Should the optimizer choose a Block Nested Loop Join (BNLJ) instead?
Solution
Disaster Option 1: Block Nested Loop Join (BNLJ)
If the query optimizer picked BNLJ, it would be a catastrophe.
- Formula: M + (M / B) * N
- Math: 400 + (400 / 20) * 16,000 = 400 + 20 * 16,000 = 320,400 IOs
- The Why: BNLJ rescans the entire 1 TB Listens table once per buffer-sized block of the 25 GB Songs table. It's the "Disaster" scenario for big data.
Disaster Option 2: Hash Partition Join (HPJ)
HPJ is better, but still flawed for this use case.
- Formula: 3 * (M + N)
- Math: 3 * (400 + 16,000) = 49,200 IOs
- The Why: HPJ is efficient for joining two large tables, but here we only need ~1% of the data (150 of 16,000 Listens pages). Scanning the entire 1 TB to find that 1% is massive waste.
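Both formulas are simple arithmetic over the schema stats, so the two disaster scenarios can be verified directly (a minimal sketch using the M, N, B values from the table above):

```python
M, N, B = 400, 16_000, 20  # Songs pages, Listens pages, buffer-pool pages

bnlj = M + (M // B) * N  # Block Nested Loop Join: M + (M / B) * N
hpj = 3 * (M + N)        # Hash Partition Join: 3 * (M + N)

print(f"BNLJ: {bnlj:,} IOs")                # BNLJ: 320,400 IOs
print(f"HPJ:  {hpj:,} IOs")                 # HPJ:  49,200 IOs
print(f"HPJ is {bnlj / hpj:.1f}x cheaper")  # HPJ is 6.5x cheaper
```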
Problem 1: The Index Win
Problem: We know that only ~150 pages out of the 16,000 contain the data for Oct 21-27. How do we skip the other pages, and what is the new IO cost?
Solution
- Logic: Create a B+ Tree index on Listens(listen_time).
- Design: The index supports a targeted range scan over Oct 21-27 rather than a full join or table scan.
- IO Math:
  - Scan Songs: 400 IOs
  - Index lookup (root-to-leaf traversal): ~4 IOs
  - Fetch the filtered Listens pages: 150 IOs
  - Total: ~554 IOs
- The Why: Indexes prune the search space. By jumping directly to the start date and reading only the 1% target data, we achieve a ~90x improvement over plain HPJ (49,200 / 554 ≈ 89).
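The index-path arithmetic can be checked the same way (a sketch; the ~4-IO tree traversal is the height assumed in the solution above):

```python
songs_scan = 400          # full scan of Songs
index_lookup = 4          # B+ tree root-to-leaf traversal (assumed height)
listens_fetch = 150       # pages holding the Oct 21-27 listens
hpj = 3 * (400 + 16_000)  # the 49,200-IO baseline from Problem 0

total = songs_scan + index_lookup + listens_fetch
print(total)               # 554
print(round(hpj / total))  # 89 -> the "~90x" win
```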
Problem 2: The Column Store Win
Problem: Even with an index, row-based storage loads 150 bytes per listen just to read the 24 bytes we need (song_id, user_id, time, duration). How do we eliminate this "column waste," and what is the estimated IO win?
Solution
- Logic: Switch to a Columnar Store (e.g., BigQuery, Parquet).
- Design: Physically store each field in its own set of pages, so the query can skip the IP addresses and device IDs entirely.
- IO Math:
  - We need only 24 of each row's 150 bytes — roughly 1/6th of the row width.
  - Listens: 150 / 6 = 25 IOs; Songs: ~35 IOs; Total: ~60 IOs
- The Why: Columnar storage is optimized for aggregation. By reading only the columns named in the SELECT and WHERE clauses, we cut IO by another ~9x (554 / 60 ≈ 9).
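The columnar estimate follows directly from the row-width ratio (a sketch; the 35-IO Songs figure is taken as given from the solution above):

```python
row_bytes, needed_bytes = 150, 24  # full row vs. the 4 columns we actually read
print(needed_bytes / row_bytes)    # 0.16 -> roughly 1/6th of the row width

listens_io = 150 // 6  # the index-filtered Listens pages shrink ~6x -> 25 IOs
songs_io = 35          # column-pruned Songs scan (given)
print(listens_io + songs_io)  # 60 IOs total, ~9x below the indexed plan's 554
```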
Problem 3: The Final Compression
Problem: How can we reduce the footprint of the genre and rating columns without losing data?
Solution
- Technique 1 (Dictionary Encoding): Map the 200 unique genres to 1-byte IDs (a byte holds 256 distinct values, enough for 200): 16 pages -> 1 page.
- Technique 2 (Bit Packing): Ratings are 1-5, so store each in 4 bits instead of a 32-bit integer (8x smaller): 7 pages -> 1 page.
- Technique 3 (Run-Length Encoding): Since the data is sorted, collapse runs of repeating user IDs: 13 pages -> 5 pages.
- Final Victory: ~35 IOs | ~20 seconds | ~$400/year
- The Why: Domain-specific compression squeezes out the final ~2x (60 / 35 ≈ 1.7) without losing a single bit of information.
Summary: The Compound ROI
We optimized the $550,000 disaster down to roughly $400 by stacking three discrete LEGO-block optimizations whose win factors multiply:
| Stage | IO Cost | Annual Cost | Win Factor |
|---|---|---|---|
| Baseline (HPJ) | 49,200 IOs | $550,000 | 1x |
| + Index | ~554 IOs | $6,500 | ~90x |
| + Columnar | ~60 IOs | $730 | ~9x |
| + Compression | ~35 IOs | ~$400 | ~2x |
Total Impact: ~1,400x improvement (49,200 / 35 ≈ 1,406).
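The compound claim checks out numerically (a sketch multiplying the exact stage-by-stage IO costs derived in Problems 0-3):

```python
ios = [49_200, 554, 60, 35]  # Baseline, +Index, +Columnar, +Compression

for prev, cur in zip(ios, ios[1:]):
    print(f"{prev / cur:.1f}x")          # 88.8x, 9.2x, 1.7x per stage
print(f"total {ios[0] / ios[-1]:.0f}x")  # total 1406x -> the ~1,400x headline
```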