JOINs: Combining Tables
Core Concept
- Example: Match users with their listening history by linking the Users and Listens tables on
user_id.
JOINs are the SQL equivalent of a blind date, connecting rows from different tables based on a shared column.
INNER JOIN
Users Table (4 rows)
| user_id | name | ||
|---|---|---|---|
| 1 | Mickey | mickey@example.com | |
| 2 | Minnie | minnie@example.com | |
| 3 | Daffy | daffy@example.com | |
| 4 | Pluto | pluto@example.com |
Listens Table (9 rows)
| listen_id | user_id | song_id | rating | listen_time | |
|---|---|---|---|---|---|
| 1 | 1 | 1 | 4.5 | 2024-08-30 14:35:00 | |
| 2 | 1 | 2 | 4.2 | NULL | |
| 3 | 1 | 6 | 3.9 | 2024-08-29 10:15:00 | |
| 4 | 2 | 2 | 4.7 | NULL | |
| 5 | 2 | 7 | 4.6 | 2024-08-28 09:20:00 | |
| 6 | 2 | 8 | 3.9 | 2024-08-27 16:45:00 | |
| 7 | 3 | 1 | 2.9 | NULL | |
| 8 | 3 | 2 | 4.9 | 2024-08-26 12:30:00 | |
| 9 | 3 | 6 | NULL | NULL |
Output (9 rows)
| user_id | name | listen_id | song_id | rating | Annotations | |
|---|---|---|---|---|---|---|
| 1 | Mickey | 1 | 1 | 4.5 | Mickey β listen 1 | |
| 1 | Mickey | 2 | 2 | 4.2 | Mickey β listen 2 | |
| 1 | Mickey | 3 | 6 | 3.9 | Mickey β listen 3 | |
| 2 | Minnie | 4 | 2 | 4.7 | Minnie β listen 4 | |
| 2 | Minnie | 5 | 7 | 4.6 | Minnie β listen 5 | |
| 2 | Minnie | 6 | 8 | 3.9 | Minnie β listen 6 | |
| 3 | Daffy | 7 | 1 | 2.9 | Daffy β listen 7 | |
| 3 | Daffy | 8 | 2 | 4.9 | Daffy β listen 8 | |
| 3 | Daffy | 9 | 6 | NULL | Daffy β listen 9 |
-
Only matching rows make the cut.
-
Pluto (user_id=4) gets left outβno listens to his name.
-
Each match results in a new row.
-
Think of INNER JOIN as a double for loop in Python.
LEFT JOIN
| user_id | name | listen_id | song_id | rating | Annotations | |
|---|---|---|---|---|---|---|
| 1 | Mickey | 1 | 1 | 4.5 | Mickey β listen 1 | |
| 1 | Mickey | 2 | 2 | 4.2 | Mickey β listen 2 | |
| 1 | Mickey | 3 | 6 | 3.9 | Mickey β listen 3 | |
| 2 | Minnie | 4 | 2 | 4.7 | Minnie β listen 4 | |
| 2 | Minnie | 5 | 7 | 4.6 | Minnie β listen 5 | |
| 2 | Minnie | 6 | 8 | 3.9 | Minnie β listen 6 | |
| 3 | Daffy | 7 | 1 | 2.9 | Daffy β listen 7 | |
| 3 | Daffy | 8 | 2 | 4.9 | Daffy β listen 8 | |
| 3 | Daffy | 9 | 6 | NULL | Daffy β listen 9 | |
| 4 | Pluto | NULL | NULL | NULL | Pluto preserved |
-
All users, even Pluto, make the list.
-
Non-matching rows are padded with NULLs.
-
The left table is fully preserved.
-
More inclusive than INNER JOIN.
RIGHT JOIN
| user_id | name | listen_id | song_id | rating | Annotations | |
|---|---|---|---|---|---|---|
| 1 | Mickey | 1 | 1 | 4.5 | Listen 1 β Mickey | |
| 1 | Mickey | 2 | 2 | 4.2 | Listen 2 β Mickey | |
| 1 | Mickey | 3 | 6 | 3.9 | Listen 3 β Mickey | |
| 2 | Minnie | 4 | 2 | 4.7 | Listen 4 β Minnie | |
| 2 | Minnie | 5 | 7 | 4.6 | Listen 5 β Minnie | |
| 2 | Minnie | 6 | 8 | 3.9 | Listen 6 β Minnie | |
| 3 | Daffy | 7 | 1 | 2.9 | Listen 7 β Daffy | |
| 3 | Daffy | 8 | 2 | 4.9 | Listen 8 β Daffy | |
| 3 | Daffy | 9 | 6 | NULL | Listen 9 β Daffy |
-
Every row from the right table stays.
-
No match? Expect NULLs on the left.
-
The right table calls the shots.
-
It's the LEFT JOIN in reverse.
FULL OUTER JOIN
| user_id | name | listen_id | song_id | rating | Annotations | |
|---|---|---|---|---|---|---|
| 1 | Mickey | 1 | 1 | 4.5 | Mickey β listen 1 | |
| 1 | Mickey | 2 | 2 | 4.2 | Mickey β listen 2 | |
| 1 | Mickey | 3 | 6 | 3.9 | Mickey β listen 3 | |
| 2 | Minnie | 4 | 2 | 4.7 | Minnie β listen 4 | |
| 2 | Minnie | 5 | 7 | 4.6 | Minnie β listen 5 | |
| 2 | Minnie | 6 | 8 | 3.9 | Minnie β listen 6 | |
| 3 | Daffy | 7 | 1 | 2.9 | Daffy β listen 7 | |
| 3 | Daffy | 8 | 2 | 4.9 | Daffy β listen 8 | |
| 3 | Daffy | 9 | 6 | NULL | Daffy β listen 9 | |
| 4 | Pluto | NULL | NULL | NULL | Pluto preserved |
-
Combines everything from both tables.
-
NULLs fill the blanks.
-
LEFT and RIGHT JOINs combined.
-
SQLite doesn't support FULL OUTER JOIN. Use UNION to mimic it.
Multiple JOINs
| name | title | artist | rating | Annotations | |
|---|---|---|---|---|---|
| Daffy | Willow | Taylor Swift | 4.9 | U3 β L8 β S2 | |
| Mickey | Evermore | Taylor Swift | 4.5 | U1 β L1 β S1 | |
| Mickey | Willow | Taylor Swift | 4.2 | U1 β L2 β S2 | |
| Minnie | Willow | Taylor Swift | 4.7 | U2 β L4 β S2 | |
| Minnie | Yellow Sub | The Beatles | 4.6 | U2 β L5 β S7 |
-
Connect Users, Listens, and Songs.
-
Apply WHERE after all JOINs.
-
Each JOIN is a stepping stone.
-
Ideal for complex data relationships.
JOIN ON: Beyond Simple Equality
The ON clause is more flexible than you think. It accepts any logical condition, not just =.
Key Point: Use AND/OR to combine conditions. The ON clause is evaluated for each potential row pair.
Key Rules
-
Join Condition: The ON clause defines the relationship (e.g.,
Users.user_id = Listens.user_id). -
Join Order: Multiple JOINs are processed left-to-right, building intermediate results.
-
NULL Handling: Unmatched rows in outer joins are filled with NULLs.
-
Cartesian Product: A missing ON clause results in every possible combinationβusually a mistake.
-
Column Ambiguity: Table aliases clarify which table a column belongs to.
-
Result Ordering: Without ORDER BY, rows return in an undefined order. Use ORDER BY for consistent results. (As we'll see later in Section 3, ordering incurs a performance cost due to sorting.)
Common Patterns
-
One-to-Many: One user can have many listens (Users β Listens).
-
Many-to-Many: Users listen to songs via a junction table (Users β Listens β Songs).
-
Optional Relationships: LEFT JOIN keeps all customers, even those without orders.
-
Chaining: Connect 3+ tables by joining results with the next table.
Common Mistakes
4 users Γ 9 listens = 36 rows (see example below)
Forgetting JOIN conditions
Problem: Missing ON clause creates a cartesian product.
Wrong JOIN type for requirement
Problem: INNER JOIN excludes non-matching rows.
Ambiguous column names
Problem: Both tables have a user_id column.