Storage & Paging: How Does One Machine Process 128 GB with Only 16 GB of RAM?

Concept. Disks and SSDs read in fixed-size pages (typically 64 MB in big-data systems). One IO costs the same whether you read one byte or a full page, so algorithms minimise page count, not byte count.

Intuition. How do you process 128 GB on a machine with 16 GB of RAM? You read it 64 MB at a time. The disk hands you a full page on every IO whether you want one byte or all of it, so algorithms count pages, not bytes. The page that holds Mickey's listens costs the same whether you're after one row or fifty.
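The page-at-a-time idea can be sketched directly. This is a minimal illustration, not any particular system's API: the 64 MB page size matches the text, and the aggregation (counting bytes) is a stand-in for whatever per-page work a query would do.

```python
PAGE_SIZE = 64 * 1024 * 1024  # 64 MB page, as in the text

def count_bytes(path):
    """Aggregate over a file of any size while holding only one page in RAM."""
    total = 0
    with open(path, "rb") as f:
        while True:
            page = f.read(PAGE_SIZE)  # one IO request per page
            if not page:
                break                 # end of file
            total += len(page)        # process the page, then let it go
    return total
```

The loop touches 128 GB in 2,048 IOs while never holding more than one page, which is exactly why the cost model below counts pages rather than bytes.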

Storage Hierarchy: Speed, Cost, and Capacity

The storage hierarchy is a ruthless game of trade-offs. Speed, cost, and capacity are the players, each vying for dominance. Here's the cold, hard truth: the faster the storage, the more it costs, and the less of it you get.

| Storage Level | Access Latency | Throughput | Cost per TB | Typical Capacity | Use Case |
|---|---|---|---|---|---|
| CPU Registers | 1 cycle | - | - | < 1 KB | Immediate values |
| L1/L2 Cache | 1-10 ns | - | - | 64 KB - 8 MB | Hot instructions |
| RAM (Buffer Pool) | 100 ns | 100 GB/s | $3,500 | 16 GB | Working set pages |
| SSD | 10 μs | 5 GB/s | $75 | 512 GB | Active tables |
| HDD | 10 ms | 100 MB/s | $25 | 4 TB | Cold storage |
| Network Storage | 1 μs | 10 GB/s | Variable | - | Distributed cache |

Key Observations: RAM is ~100× faster than SSD and ~100,000× faster than HDD • Cost per TB and speed are inversely related: the faster the tier, the more each byte costs.
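Those ratios fall straight out of the table's latency column, as a quick check:

```python
# Latencies from the table above, all in nanoseconds.
ram_ns = 100           # 100 ns
ssd_ns = 10_000        # 10 us
hdd_ns = 10_000_000    # 10 ms

print(ssd_ns // ram_ns)  # RAM vs SSD  -> 100
print(hdd_ns // ram_ns)  # RAM vs HDD  -> 100000
```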


IO Cost Definitions

Understanding IO costs means separating the fixed overhead of starting an operation from the sustained rate once data is flowing.

Access Latency: Time to initiate an IO operation before data transfer begins

  • The fixed overhead for starting any IO operation

  • Examples: SSD access (10μs), HDD seeks (10ms), RAM access (100ns)

Throughput: Data transfer rate once operation begins

  • The sustained rate at which data moves after access starts

  • Examples: SSD (5 GB/s), RAM (100 GB/s), HDD (100 MB/s)

Key Insight: For large pages (64MB), transfer time dominates access time. For small pages, access time dominates.
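The two-term cost model behind this insight is just latency plus transfer time. Below is a back-of-the-envelope sketch using the table's SSD numbers; it is a standard estimation formula, not any system's exact accounting.

```python
def io_time_s(size_bytes, latency_s, throughput_bps):
    """Estimated IO time: fixed access latency + size / sustained throughput."""
    return latency_s + size_bytes / throughput_bps

MB = 1024 * 1024
SSD_LATENCY = 10e-6   # 10 us, from the table
SSD_THROUGHPUT = 5e9  # 5 GB/s, from the table

small = io_time_s(4 * 1024, SSD_LATENCY, SSD_THROUGHPUT)  # 4 KB page
large = io_time_s(64 * MB, SSD_LATENCY, SSD_THROUGHPUT)   # 64 MB page
```

For the 4 KB read, the 10 μs latency dwarfs the ~0.8 μs of transfer; for the 64 MB read, the ~13 ms of transfer makes the latency invisible. That is the whole argument for large pages on high-latency devices.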

Refresher: review your OS materials on how the OS's IO controllers work. Databases rely on the OS for those details.


Modern Reality: CPUs/GPUs Can't Escape the Disk Bottleneck

The 10,000,000× Gap

The gap between storage and compute is staggering. A single HDD seek can cost you 10 million GPU operations.

  • 10 ms: HDD seek time

  • 10 μs: SSD latency

  • 100 ns: RAM access

  • 1 ns: CPU/GPU cycle

The Math: 1 HDD seek = 10,000,000 GPU operations!
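The math is a single division, using the numbers from the ladder above:

```python
hdd_seek_s = 10e-3   # 10 ms per HDD seek
gpu_cycle_s = 1e-9   # 1 ns per CPU/GPU cycle

cycles_wasted = round(hdd_seek_s / gpu_cycle_s)
print(cycles_wasted)  # cycles the GPU idles through during one seek
```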

GPUs Are Data Hungry

The bottleneck chain is relentless:

  1. Data lives on disk - your 1 TB dataset won't fit in 80 GB of GPU memory (an A100, for example, ships with 80 GB)

  2. PCI links between hardware components are narrow - 32 GB/s sounds fast until you have 1 TB to move

  3. Compute is effectively free - at 312 TFLOPS, the arithmetic takes roughly zero time compared to the data movement
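The bottleneck chain can be quantified with the section's own numbers. The workload assumption here (1 TFLOP of compute per GB of data) is illustrative, chosen only to make the transfer/compute comparison concrete:

```python
TB = 1e12
PCIE_BPS = 32e9      # 32 GB/s PCI link, from the text
GPU_FLOPS = 312e12   # 312 TFLOPS, from the text

move_s = TB / PCIE_BPS               # ~31 s just moving 1 TB to the GPU
work_flops = 1000 * 1e12             # assume 1 TFLOP per GB of data
compute_s = work_flops / GPU_FLOPS   # ~3.2 s of actual arithmetic
```

Even under this generous workload, the GPU spends roughly ten times longer waiting on the wire than computing, which is the sense in which compute is "free".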