JOINs: Combining Tables

Core Concept

JOINs are the SQL equivalent of a blind date, connecting rows from different tables based on a shared column.

⚠️ Note on Ordering: The visual examples show one possible ordering. Without ORDER BY, your results may appear in any order - even different each time you run the query!

INNER JOIN

Users Table (4 rows)

user_idnameemail
1Mickeymickey@example.com
2Minnieminnie@example.com
3Daffydaffy@example.com
4Plutopluto@example.com

Listens Table (9 rows)

listen_iduser_idsong_idratinglisten_time
1114.52024-08-30 14:35:00
2124.2NULL
3163.92024-08-29 10:15:00
4224.7NULL
5274.62024-08-28 09:20:00
6283.92024-08-27 16:45:00
7312.9NULL
8324.92024-08-26 12:30:00
936NULLNULL

Output (9 rows)

user_idnamelisten_idsong_idratingAnnotations
1Mickey114.5Mickey β†’ listen 1
1Mickey224.2Mickey β†’ listen 2
1Mickey363.9Mickey β†’ listen 3
2Minnie424.7Minnie β†’ listen 4
2Minnie574.6Minnie β†’ listen 5
2Minnie683.9Minnie β†’ listen 6
3Daffy712.9Daffy β†’ listen 7
3Daffy824.9Daffy β†’ listen 8
3Daffy96NULLDaffy β†’ listen 9

LEFT JOIN

user_idnamelisten_idsong_idratingAnnotations
1Mickey114.5Mickey β†’ listen 1
1Mickey224.2Mickey β†’ listen 2
1Mickey363.9Mickey β†’ listen 3
2Minnie424.7Minnie β†’ listen 4
2Minnie574.6Minnie β†’ listen 5
2Minnie683.9Minnie β†’ listen 6
3Daffy712.9Daffy β†’ listen 7
3Daffy824.9Daffy β†’ listen 8
3Daffy96NULLDaffy β†’ listen 9
4PlutoNULLNULLNULLPluto preserved

RIGHT JOIN

user_idnamelisten_idsong_idratingAnnotations
1Mickey114.5Listen 1 β†’ Mickey
1Mickey224.2Listen 2 β†’ Mickey
1Mickey363.9Listen 3 β†’ Mickey
2Minnie424.7Listen 4 β†’ Minnie
2Minnie574.6Listen 5 β†’ Minnie
2Minnie683.9Listen 6 β†’ Minnie
3Daffy712.9Listen 7 β†’ Daffy
3Daffy824.9Listen 8 β†’ Daffy
3Daffy96NULLListen 9 β†’ Daffy

FULL OUTER JOIN

user_idnamelisten_idsong_idratingAnnotations
1Mickey114.5Mickey β†’ listen 1
1Mickey224.2Mickey β†’ listen 2
1Mickey363.9Mickey β†’ listen 3
2Minnie424.7Minnie β†’ listen 4
2Minnie574.6Minnie β†’ listen 5
2Minnie683.9Minnie β†’ listen 6
3Daffy712.9Daffy β†’ listen 7
3Daffy824.9Daffy β†’ listen 8
3Daffy96NULLDaffy β†’ listen 9
4PlutoNULLNULLNULLPluto preserved

Multiple JOINs

nametitleartistratingAnnotations
DaffyWillowTaylor Swift4.9U3 β†’ L8 β†’ S2
MickeyEvermoreTaylor Swift4.5U1 β†’ L1 β†’ S1
MickeyWillowTaylor Swift4.2U1 β†’ L2 β†’ S2
MinnieWillowTaylor Swift4.7U2 β†’ L4 β†’ S2
MinnieYellow SubThe Beatles4.6U2 β†’ L5 β†’ S7

JOIN ON: Beyond Simple Equality

The ON clause is more flexible than you think. It accepts any logical condition, not just =.

Key Point: Use AND/OR to combine conditions. The ON clause is evaluated for each potential row pair.


Key Rules

  1. Join Condition: The ON clause defines the relationship (e.g., Users.user_id = Listens.user_id).

  2. Join Order: Multiple JOINs are processed left-to-right, building intermediate results.

  3. NULL Handling: Unmatched rows in outer joins are filled with NULLs.

  4. Cartesian Product: A missing ON clause results in every possible combinationβ€”usually a mistake.

  5. Column Ambiguity: Table aliases clarify which table a column belongs to.

  6. Result Ordering: Without ORDER BY, rows return in an undefined order. Use ORDER BY for consistent results. (As we'll see later in Section 3, ordering incurs a performance cost due to sorting.)

Common Patterns

Common Mistakes

⚠️ Cartesian Product Warning: Missing ON clause multiplies rows!
4 users Γ— 9 listens = 36 rows (see example below)

Forgetting JOIN conditions

Problem: Missing ON clause creates a cartesian product.

Wrong JOIN type for requirement

Problem: INNER JOIN excludes non-matching rows.

Ambiguous column names

Problem: Both tables have a user_id column.