Hash Partition Join: Joining Tables Bigger Than RAM
When Both Tables Don't Fit in Memory
The Query We Want
Hash Partition Join: The Solution
Core Insight
Here's the crux: partition both tables using the same hash function on the join key. This way, rows from Table1 and Table2 with the same key land in the same partition. It's like sorting your socks by color—suddenly, matching pairs are easy to find.
HashPartitionJoin(T1, T2, key): // Assume |T1| < |T2|
B = √(|T2|/RAM) // partition count
// Phase 1: Partition both tables (uses B-1 buffers + 1 input)
for row in T1: write to T1_p[h(row.key) % B]
for row in T2: write to T2_p[h(row.key) % B] // same hash!
// Phase 2: Join matching partitions
for i in 0..B-1:
HT = hash_table(T1_p[i]) // build on smaller
for row in T2_p[i]: // probe with larger
if HT[row.key]: output matches
Approximate Cost: 3(N+M) + OUT IOs
Two-Table Join Example
Joining Users and Listens
Key Point: user_id=42 will be in partition 2 in BOTH tables!
Multi-Table Joins
Joining Three Tables: Users, Listens, Songs
ThreeWayJoin(U, L, S): // Assume |U| < |L| < |S|
temp = HPJ(U, L, "user_id") // U ⋈ L
result = HPJ(temp, S, "song_id") // temp ⋈ S
return result
Approximate Cost: 3(|U|+|L|) + |temp| + 3(|temp|+|S|) + OUT
Handling Skew in HPJ
Problem: Hot Keys
What if user_id=1 (system account) has 10M listens? That is, Partition 0 and 2 fit in RAM, but Partition 1 does not.
Solution: Handle Skew
When a partition is too large (hot key):
-
Detect: Monitor partition sizes during Phase 1
-
Split: Re-partition the large partition with a different hash
-
Recurse: Apply HPJ recursively to the sub-partitions
Key Takeaways
-
HPJ enables joining tables >> RAM
-
Same hash → Same partition ensures correctness
-
Two phases: Partition then join
-
Handle skew with recursive partitioning
-
Foundation of distributed joins in Spark/Hadoop