Storage & Paging: The IO Cost Calculator
How Does One Machine Process 128GB with Only 16GB RAM?
Storage Hierarchy: Speed, Cost, and Capacity
Key Observations:
- RAM is 100× faster than SSD and 100,000× faster than HDD.
- Cost/TB and speed are inversely related: faster media cost more per TB.
IO Cost Definitions
Access Latency: Time to initiate an IO operation before data transfer begins
- The fixed overhead for starting any IO operation
- Examples: SSD access (10μs), HDD seeks (10ms), RAM access (100ns)
Throughput: Data transfer rate once operation begins
- The sustained rate at which data moves after access starts
- Examples: SSD (5 GB/s), RAM (100 GB/s), HDD (100 MB/s)
Key Insight: For large pages (64MB), transfer time dominates access time. For small pages, access time dominates.
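The insight above can be checked with a small cost model: total IO time = access latency + size ÷ throughput. This is a sketch; the function name and constants are illustrative, using the SSD figures from this section.

```python
def io_cost(size_bytes, latency_s, throughput_bps):
    """Total time for one IO: fixed access latency plus transfer time."""
    return latency_s + size_bytes / throughput_bps

KB, MB, GB = 1024, 1024**2, 1024**3

# SSD: 10us access latency, 5 GB/s sustained throughput (figures from above).
small = io_cost(4 * KB, 10e-6, 5 * GB)    # small page: ~10.8us, latency dominates
large = io_cost(64 * MB, 10e-6, 5 * GB)   # 64MB page: ~12.5ms, transfer dominates

print(f"4 KB page:  {small * 1e6:.1f} us")
print(f"64 MB page: {large * 1e3:.2f} ms")
```

For the 4KB page the 10μs latency dwarfs the ~0.8μs transfer; for the 64MB page the ~12.5ms transfer dwarfs the 10μs latency, which is exactly why databases prefer large sequential IOs on high-latency media.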
Refresher: review your OS materials on how the OS and its IO controllers work; databases rely on the OS for those details.
Modern Reality: CPUs/GPUs Can't Escape the Disk Bottleneck
The 10,000,000× Gap
- HDD seek time: 10ms
- SSD latency: 10μs
- RAM access: 100ns
- CPU/GPU cycle: 1ns
The Math: 1 HDD seek = 10ms ÷ 1ns = 10,000,000 CPU/GPU cycles!
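A one-line sanity check of that ratio, using the latencies listed above:

```python
# How many 1ns CPU/GPU cycles fit in one 10ms HDD seek?
hdd_seek_s = 10e-3
cpu_cycle_s = 1e-9
cycles_per_seek = hdd_seek_s / cpu_cycle_s
print(f"{cycles_per_seek:,.0f} cycles per seek")
```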
GPUs Are Data Hungry
# Training on a 100GB dataset
load_data_from_ssd = 20 seconds    # 100GB ÷ 5GB/s
transfer_to_gpu = 3 seconds        # 100GB ÷ 32GB/s (PCIe 4.0)
gpu_training_epoch = 0.5 seconds   # Blazing fast compute!
# Where does the time go? (~23.5s total)
# ~85% loading data from SSD
# ~13% transferring to GPU over PCIe
# ~2% actual GPU compute
The Bottleneck Chain:
- Data lives on disk: your 1TB dataset won't fit in 80GB of GPU memory (e.g. an A100 has 80GB as of 2025)
- PCIe links between hardware components are narrow: 32GB/s sounds fast until you have 1TB to move
- Compute is effectively free: at 312 TFLOPS, GPU compute time is negligible next to the IO
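The chain above can be made concrete by recomputing the breakdown for the hypothetical 100GB example. All figures (5 GB/s SSD, 32 GB/s PCIe, 0.5s epoch) are the section's assumed numbers, not benchmarks.

```python
# Time per stage for moving and training on 100 GB (assumed figures from the text).
data_gb = 100
ssd_load = data_gb / 5     # 5 GB/s SSD  -> 20 s
pcie_xfer = data_gb / 32   # 32 GB/s PCIe 4.0 -> ~3.1 s
gpu_epoch = 0.5            # assumed compute time per epoch

total = ssd_load + pcie_xfer + gpu_epoch
for name, t in [("SSD load", ssd_load),
                ("PCIe transfer", pcie_xfer),
                ("GPU compute", gpu_epoch)]:
    print(f"{name:14s} {t:5.1f} s  ({100 * t / total:4.1f}%)")
```

Note what this implies: even an infinitely fast GPU removes only the ~2% compute slice; the total time barely moves, because the disk is the bottleneck.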