Hash Partition Join: Joining Tables Bigger Than RAM
When Both Tables Don't Fit in Memory
The Join Problem: Tables Too Big for RAM
The Query We Want
-- Get user details with their listening history
SELECT u.name, u.country, l.song_id, l.plays
FROM Users u
JOIN Listens l ON u.user_id = l.user_id;
-- Problem: Can't load both tables in RAM!
-- Users: 6.4GB + Listens: 64GB = 70.4GB > 6.4GB RAM
Hash Partition Join: The Solution
Core Insight
If we partition both tables using the same hash function on the join key, then:
- All rows from Table1 with key K go to partition h(K)
- All rows from Table2 with key K go to partition h(K)
- Therefore: matching rows are always in the same partition number!
📋 PSEUDOCODE: Hash Partition Join
HashPartitionJoin(T1, T2, key):        // assume |T1| < |T2|
    choose B ≥ |T1| / M                // so each T1 partition fits in RAM (M = memory pages)
    // Phase 1: partition both tables (B output buffers + 1 input buffer)
    for row in T1: write to T1_p[h(row.key) % B]
    for row in T2: write to T2_p[h(row.key) % B]   // same hash!
    // Phase 2: join matching partitions
    for i in 0..B-1:
        HT = hash_table(T1_p[i])       // build on the smaller side
        for row in T2_p[i]:            // probe with the larger side
            if row.key in HT: output matches
Approximate Cost: 3(N+M) + OUT I/Os, where N and M are the page counts of T1 and T2
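The two phases above can be sketched as runnable Python. This is a minimal in-memory sketch: lists stand in for the on-disk partition files, and `hash_partition_join` plus the sample rows below are illustrative names, not part of any real engine.

```python
def hash_partition_join(t1, t2, key, num_partitions):
    """Join two lists of dicts on `key` via hash partitioning.
    Lists stand in for partition files; a real system would
    write each partition to disk and read it back in Phase 2."""
    # Phase 1: partition both inputs with the SAME hash function
    t1_parts = [[] for _ in range(num_partitions)]
    t2_parts = [[] for _ in range(num_partitions)]
    for row in t1:
        t1_parts[hash(row[key]) % num_partitions].append(row)
    for row in t2:
        t2_parts[hash(row[key]) % num_partitions].append(row)

    # Phase 2: join matching partitions pairwise
    out = []
    for p in range(num_partitions):
        # build a hash table on the smaller side's partition
        ht = {}
        for row in t1_parts[p]:
            ht.setdefault(row[key], []).append(row)
        # probe with the larger side's partition
        for row in t2_parts[p]:
            for match in ht.get(row[key], []):
                out.append({**match, **row})
    return out
```

Joining two Users rows against three Listens rows on `user_id` produces exactly the rows a nested loop join would, but each partition pair is touched only once.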
Two-Table Join Example
Joining Users and Listens
SELECT u.*, l.*
FROM Users u
JOIN Listens l ON u.user_id = l.user_id
WHERE u.user_id = 42;
Step-by-step execution:
Given:
- Users: 100 pages (6.4GB)
- Listens: 1000 pages (64GB)
- RAM: 100 pages (6.4GB)
- B = 10 partitions
Phase 1 - Partition:
1. Read each Users page
2. Hash each user_id: h(42)%10=2, h(89)%10=9, etc.
3. Write to appropriate partition file
4. Repeat for Listens table with SAME hash function
Result after Phase 1:
- 10 Users partition files (~640MB each)
- 10 Listens partition files (~6.4GB each)
- User 42 is in Users_partition_2
- ALL of User 42's listens are in Listens_partition_2
Phase 2 - Join:
For partition 0: Join(Users_p0, Listens_p0)
For partition 1: Join(Users_p1, Listens_p1)
...
For partition 9: Join(Users_p9, Listens_p9)
Each Users partition (~640MB) fits in RAM as the build-side hash table; the matching Listens partition is simply streamed through it.
Key Point: user_id=42 will be in partition 2 in BOTH tables!
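Both facts in this example are easy to check in CPython, where `hash(n) == n` for small non-negative ints, so the trace numbers and the 3(N+M) cost formula can be verified directly (a sanity check, not production code):

```python
B = 10
# Same hash on both sides => same partition number
assert hash(42) % B == 2      # user 42 -> partition 2 in BOTH tables
assert hash(89) % B == 9      # user 89 -> partition 9

# I/O cost with the example's sizes, in pages
N, M = 100, 1000              # Users, Listens
phase1 = 2 * (N + M)          # read + write both tables once
phase2 = N + M                # read every partition back once
assert phase1 + phase2 == 3 * (N + M) == 3300   # + OUT for the result
```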
Multi-Table Joins
Joining Three Tables: Users, Listens, Songs
SELECT u.name, s.title, l.plays
FROM Users u
JOIN Listens l ON u.user_id = l.user_id
JOIN Songs s ON l.song_id = s.song_id;
📋 PSEUDOCODE: Multi-Way Join
ThreeWayJoin(U, L, S):                 // assume |U| < |L| < |S|
    temp   = HPJ(U, L, "user_id")      // U ⋈ L, materialized to disk
    result = HPJ(temp, S, "song_id")   // temp ⋈ S
    return result
Approximate Cost: 3(|U|+|L|) + |temp| + 3(|temp|+|S|) + OUT (the extra |temp| term is the cost of writing the intermediate result)
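Plugging hypothetical page counts into this formula makes the terms concrete; the sizes below, including |temp|, are assumed for illustration only and are not taken from the earlier example:

```python
# Assumed page counts -- illustrative only
U, L, S = 100, 1000, 50
temp = 800    # assumed size of the U ⋈ L intermediate, in pages

cost = 3 * (U + L) + temp + 3 * (temp + S)   # + OUT for the final result
print(cost)   # 3*1100 + 800 + 3*850 = 6650
```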
Handling Skew in HPJ
Problem: Hot Keys
What if user_id=1 (system account) has 10M listens?
# Partition sizes become uneven:
# Partition 0: 100MB (normal)
# Partition 1: 10GB (hot key!) <-- Won't fit in RAM
# Partition 2: 120MB (normal)
Solution: Handle Skew
When a partition is too large to fit in memory:
- Detect: monitor partition sizes during Phase 1
- Split: re-partition the oversized partition with a different hash function
- Recurse: apply HPJ recursively to the sub-partitions
- Caveat: a single hot key cannot be split this way (all of its rows hash identically under any function), so its partition must instead be joined by streaming, e.g. with a block nested loop join
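The detect/split/recurse loop can be sketched as follows; `partition`, `max_rows`, and the seed-as-hash-function trick are all illustrative choices, and the depth cap reflects the hot-key caveat:

```python
def partition(rows, key, num_parts, seed=0):
    """Partition rows by a salted hash of the join key;
    changing `seed` effectively changes the hash function."""
    parts = [[] for _ in range(num_parts)]
    for row in rows:
        parts[hash((seed, row[key])) % num_parts].append(row)
    return parts

def partition_with_skew_handling(rows, key, num_parts, max_rows, seed=0):
    """Detect oversized partitions and re-split them with a new seed.
    If a partition is big because of a single hot key, re-hashing
    cannot split it (every copy of the key hashes the same way),
    so after a few levels we give up and return the oversized
    partition as-is, to be streamed rather than rebuilt in RAM."""
    result = []
    for part in partition(rows, key, num_parts, seed):
        if len(part) > max_rows and seed < 3:            # detect
            result.extend(partition_with_skew_handling(  # split + recurse
                part, key, num_parts, max_rows, seed + 1))
        else:
            result.append(part)
    return result
```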
Why Hash Partition Join Works
Correctness Guarantee
- Same hash function on both tables
- Same join key for partitioning
- Result: every matching pair ends up in the same partition
Example Trace
User Alice (id=42) in Users table:
hash(42) % 10 = 2 → Goes to Users_partition_2
Alice's listens in Listens table:
Row 1: (user_id=42, song_id=101)
hash(42) % 10 = 2 → Goes to Listens_partition_2
Row 2: (user_id=42, song_id=303)
hash(42) % 10 = 2 → Goes to Listens_partition_2
When we join partition 2:
- Alice is in Users_partition_2
- ALL her listens are in Listens_partition_2
- They meet during the join!
Real-World Usage
PostgreSQL
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM huge1 JOIN huge2 USING (id);
-- Look for a "Hash Join" node in the plan. With ANALYZE, the
-- Hash node reports "Buckets: ... Batches: N ...": Batches > 1
-- means the join spilled to disk and was processed as N
-- hash partitions, exactly the scheme described above.
Apache Spark
# Spark shuffles both sides by the join key -- the same partitioning
# idea; the per-partition join is then a shuffle hash join or (the
# default for large tables) a sort-merge join
df1 = spark.read.parquet("users")
df2 = spark.read.parquet("listens")
result = df1.join(df2, "user_id")

# You can control the partition count explicitly
result = df1.repartition(200, "user_id") \
            .join(df2.repartition(200, "user_id"), "user_id")
Optimization Tips
- Partition count: choose B so each partition of the smaller table fits in RAM (B ≥ smaller_table_size/RAM, with B+1 ≤ available buffer pages)
- Build on smaller: always build the hash table on the smaller relation
- Broadcast small tables: if one table fits in RAM, skip partitioning and broadcast it
- Monitor skew: track partition sizes and handle hot keys
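The "broadcast small tables" tip amounts to skipping Phase 1 entirely. A minimal sketch (`broadcast_join` is an illustrative name):

```python
def broadcast_join(small, large, key):
    """When the smaller table fits in RAM, build one hash table on it
    and stream the larger table past it -- no partitioning needed.
    In a distributed engine the small table is copied ("broadcast")
    to every worker instead."""
    ht = {}
    for row in small:
        ht.setdefault(row[key], []).append(row)
    return [{**m, **r} for r in large for m in ht.get(r[key], [])]
```

This costs only one read of each table (plus OUT), versus 3(N+M) for full hash partition join.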
Key Takeaways
- HPJ enables joining tables >> RAM
- Same hash → same partition ensures correctness
- Two phases: partition, then join
- Extends to multiple tables via sequential joins
- Handle skew with recursive partitioning
- Foundation of distributed joins in Spark/Hadoop
The Punchline: Hash once, join anywhere! Partitioning turns big joins into parallel local joins.