Hash Partition Join: Joining Tables Bigger Than RAM
When Both Tables Don't Fit in Memory
The Query We Want
-- Get user details with their listening history
SELECT u.name, u.country, l.song_id, l.plays
FROM Users u
JOIN Listens l ON u.user_id = l.user_id;
-- Problem: Can't load both tables in RAM!
-- Users: 6.4GB + Listens: 64GB = 70.4GB > 6.4GB RAM
Hash Partition Join: The Solution
Core Insight
If we partition both tables using the same hash function on the join key, then:
-
All rows from Table1 with key K go to partition h(K)
-
All rows from Table2 with key K go to partition h(K)
-
Therefore: Matching rows are always in the same partition number!
PSEUDOCODE: Hash Partition Join
HashPartitionJoin(T1, T2, key): // Assume |T1| < |T2|
B = √(|T2|/RAM) // partition count
// Phase 1: Partition both tables (uses B-1 buffers + 1 input)
for row in T1: write to T1_p[h(row.key) % B]
for row in T2: write to T2_p[h(row.key) % B] // same hash!
// Phase 2: Join matching partitions
for i in 0..B-1:
HT = hash_table(T1_p[i]) // build on smaller
for row in T2_p[i]: // probe with larger
if HT[row.key]: output matches
Approximate Cost: 3(N+M) + OUT IOs
Two-Table Join Example
Joining Users and Listens
SELECT u.*, l.*
FROM Users u
JOIN Listens l ON u.user_id = l.user_id
WHERE u.user_id = 42;
Step-by-step execution:
Given:
- Users: 100 pages (6.4GB)
- Listens: 1000 pages (64GB)
- RAM: 100 pages (6.4GB)
- B = 10 partitions
Phase 1 - Partition:
1. Read each Users page
2. Hash each user_id: h(42)%10=2, h(89)%10=9, etc.
3. Write to appropriate partition file
4. Repeat for Listens table with SAME hash function
Result after Phase 1:
- 10 Users partition files (~640MB each)
- 10 Listens partition files (~6.4GB each)
- User 42 is in Users_partition_2
- ALL of User 42's listens are in Listens_partition_2
Phase 2 - Join:
For partition 0: Join(Users_p0, Listens_p0)
For partition 1: Join(Users_p1, Listens_p1)
...
For partition 9: Join(Users_p9, Listens_p9)
Each join fits in memory!
Key Point: user_id=42 will be in partition 2 in BOTH tables!
Multi-Table Joins
Joining Three Tables: Users, Listens, Songs
SELECT u.name, s.title, l.plays
FROM Users u
JOIN Listens l ON u.user_id = l.user_id
JOIN Songs s ON l.song_id = s.song_id;
PSEUDOCODE: Multi-Way Join
ThreeWayJoin(U, L, S): // Assume |U| < |L| < |S|
temp = HPJ(U, L, "user_id") // U ⋈ L
result = HPJ(temp, S, "song_id") // temp ⋈ S
return result
Approximate Cost: 3(|U|+|L|) + |temp| + 3(|temp|+|S|) + OUT
Handling Skew in HPJ
Problem: Hot Keys
What if user_id=1 (system account) has 10M listens? That is, Partition 0 and 2 fit in RAM, but Partition 1 does not.
Solution: Handle Skew
When a partition is too large (hot key):
-
Detect: Monitor partition sizes during Phase 1
-
Split: Re-partition the large partition with a different hash
-
Recurse: Apply HPJ recursively to the sub-partitions
Key Takeaways
-
HPJ enables joining tables >> RAM
-
Same hash → Same partition ensures correctness
-
Two phases: Partition then join
-
Handle skew with recursive partitioning
-
Foundation of distributed joins in Spark/Hadoop