Hash Partition Join: Joining Tables Bigger Than RAM

When Both Tables Don't Fit in Memory


The Query We Want


Hash Partition Join: The Solution

Core Insight

Here's the crux: partition both tables using the same hash function on the join key. This way, rows from Table1 and Table2 with the same key land in the same partition. It's like sorting your socks by color—suddenly, matching pairs are easy to find.

HashPartitionJoin(T1, T2, key):    // Assume |T1| < |T2|
    B = √(|T2|/RAM)    // partition count

    // Phase 1: Partition both tables (uses B-1 buffers + 1 input)
    for row in T1: write to T1_p[h(row.key) % B]
    for row in T2: write to T2_p[h(row.key) % B]    // same hash!

    // Phase 2: Join matching partitions
    for i in 0..B-1:
        HT = hash_table(T1_p[i])    // build on smaller
        for row in T2_p[i]:          // probe with larger
            if HT[row.key]: output matches

    Approximate Cost: 3(N+M) + OUT IOs

Two-Table Join Example

Joining Users and Listens

Key Point: user_id=42 will be in partition 2 in BOTH tables!


Multi-Table Joins

Joining Three Tables: Users, Listens, Songs

ThreeWayJoin(U, L, S):    // Assume |U| < |L| < |S|
    temp = HPJ(U, L, "user_id")    // U ⋈ L
    result = HPJ(temp, S, "song_id")    // temp ⋈ S
    return result

Approximate Cost: 3(|U|+|L|) + |temp| + 3(|temp|+|S|) + OUT

Handling Skew in HPJ

Problem: Hot Keys

What if user_id=1 (system account) has 10M listens? That is, Partition 0 and 2 fit in RAM, but Partition 1 does not.

Solution: Handle Skew

When a partition is too large (hot key):

  1. Detect: Monitor partition sizes during Phase 1

  2. Split: Re-partition the large partition with a different hash

  3. Recurse: Apply HPJ recursively to the sub-partitions


Key Takeaways

  1. HPJ enables joining tables >> RAM

  2. Same hash → Same partition ensures correctness

  3. Two phases: Partition then join

  4. Handle skew with recursive partitioning

  5. Foundation of distributed joins in Spark/Hadoop