JOINs: Combining Tables
Concept. A JOIN pairs rows from two tables on a predicate. The join type (INNER, LEFT, RIGHT, FULL) decides what happens to rows that find no match.
Intuition. INNER JOIN of Users and Listens on user_id returns one row per matching pair, so Mickey's 3 listens become 3 rows. LEFT JOIN keeps every user even if they have no listens; INNER drops them. The join type decides the fate of unmatched rows.
Example. Match users with their listening history by linking the Users and Listens tables on user_id.
⚠️ Note on Ordering: The visual examples show one possible ordering. Without ORDER BY, your results may appear in any order - even different each time you run the query!
INNER JOIN
Users Table (4 rows)
| user_id | name | ||
|---|---|---|---|
| 1 | Mickey | mickey@example.com | |
| 2 | Minnie | minnie@example.com | |
| 3 | Daffy | daffy@example.com | |
| 4 | Pluto | pluto@example.com |
Listens Table (9 rows)
| listen_id | user_id | song_id | rating | listen_time | |
|---|---|---|---|---|---|
| 1 | 1 | 1 | 4.5 | 2024-08-30 14:35:00 | |
| 2 | 1 | 2 | 4.2 | NULL | |
| 3 | 1 | 6 | 3.9 | 2024-08-29 10:15:00 | |
| 4 | 2 | 2 | 4.7 | NULL | |
| 5 | 2 | 7 | 4.6 | 2024-08-28 09:20:00 | |
| 6 | 2 | 8 | 3.9 | 2024-08-27 16:45:00 | |
| 7 | 3 | 1 | 2.9 | NULL | |
| 8 | 3 | 2 | 4.9 | 2024-08-26 12:30:00 | |
| 9 | 3 | 6 | NULL | NULL |
-- Get all users with their listening history (excludes users without listens)
SELECT u.user_id, u.name, l.listen_id, l.song_id, l.rating
FROM Users u
INNER JOIN Listens l
ON u.user_id = l.user_id
ORDER BY u.user_id, l.listen_id;
Output (9 rows)
| user_id | name | listen_id | song_id | rating | Annotations | |
|---|---|---|---|---|---|---|
| 1 | Mickey | 1 | 1 | 4.5 | Mickey → listen 1 | |
| 1 | Mickey | 2 | 2 | 4.2 | Mickey → listen 2 | |
| 1 | Mickey | 3 | 6 | 3.9 | Mickey → listen 3 | |
| 2 | Minnie | 4 | 2 | 4.7 | Minnie → listen 4 | |
| 2 | Minnie | 5 | 7 | 4.6 | Minnie → listen 5 | |
| 2 | Minnie | 6 | 8 | 3.9 | Minnie → listen 6 | |
| 3 | Daffy | 7 | 1 | 2.9 | Daffy → listen 7 | |
| 3 | Daffy | 8 | 2 | 4.9 | Daffy → listen 8 | |
| 3 | Daffy | 9 | 6 | NULL | Daffy → listen 9 |
-
Only matching rows make the cut.
-
Pluto (user_id=4) gets left out, with no listens to his name.
-
Each match results in a new row.
-
Think of INNER JOIN as a double for loop in Python.
Outer Joins: LEFT and RIGHT
LEFT JOIN
-- Get all users, including those without any listens
SELECT u.user_id, u.name, l.listen_id, l.rating
FROM Users u
LEFT JOIN Listens l
ON u.user_id = l.user_id
ORDER BY u.user_id, l.listen_id;
| user_id | name | listen_id | rating | Annotations | |
|---|---|---|---|---|---|
| 1 | Mickey | 1 | 4.5 | Mickey → listen 1 | |
| 1 | Mickey | 2 | 4.2 | Mickey → listen 2 | |
| 1 | Mickey | 3 | 3.9 | Mickey → listen 3 | |
| 2 | Minnie | 4 | 4.7 | Minnie → listen 4 | |
| 2 | Minnie | 5 | 4.6 | Minnie → listen 5 | |
| 2 | Minnie | 6 | 3.9 | Minnie → listen 6 | |
| 3 | Daffy | 7 | 2.9 | Daffy → listen 7 | |
| 3 | Daffy | 8 | 4.9 | Daffy → listen 8 | |
| 3 | Daffy | 9 | NULL | Daffy → listen 9 | |
| 4 | Pluto | NULL | NULL | Pluto preserved |
- All users, even Pluto, make the list.
- Non-matching rows are padded with NULLs.
- The left table is fully preserved.
RIGHT JOIN
-- Get all listens, including any without a matching user (none here)
SELECT u.user_id, u.name, l.listen_id, l.rating
FROM Users u
RIGHT JOIN Listens l
ON u.user_id = l.user_id
ORDER BY l.listen_id;
| user_id | name | listen_id | rating | Annotations | |
|---|---|---|---|---|---|
| 1 | Mickey | 1 | 4.5 | Listen 1 → Mickey | |
| 1 | Mickey | 2 | 4.2 | Listen 2 → Mickey | |
| 1 | Mickey | 3 | 3.9 | Listen 3 → Mickey | |
| 2 | Minnie | 4 | 4.7 | Listen 4 → Minnie | |
| 2 | Minnie | 5 | 4.6 | Listen 5 → Minnie | |
| 2 | Minnie | 6 | 3.9 | Listen 6 → Minnie | |
| 3 | Daffy | 7 | 2.9 | Listen 7 → Daffy | |
| 3 | Daffy | 8 | 4.9 | Listen 8 → Daffy | |
| 3 | Daffy | 9 | NULL | Listen 9 → Daffy |
- Every row from the right table stays.
- No match? Expect NULLs on the left.
- The LEFT JOIN in reverse.
Why this looks identical to INNER JOIN here: In our schema,
Listens.user_idis a foreign key intoUsers, so every listen has a valid matching user. There are no orphan listens to preserve. RIGHT JOIN only differs from INNER JOIN when the right table contains rows that don't match the left.When you'd see the difference: Imagine an upstream bug, a deleted user, or a data-import error left a listen with
user_id = 99(no such user). INNER JOIN would silently drop that listen. RIGHT JOIN would surface it as a row withuser_id=99andname=NULL. That is exactly the kind of red flag a data-quality audit needs.
FULL OUTER JOIN
-- Keep every user AND every listen, padding with NULLs on either side
SELECT u.user_id, u.name, l.listen_id, l.rating
FROM Users u
FULL OUTER JOIN Listens l
ON u.user_id = l.user_id;
FULL OUTER JOIN is the merger of LEFT + RIGHT. Every row from both tables stays, with NULLs filling the blanks. For our Users + Listens example the output looks identical to the LEFT JOIN above (Pluto preserved, no extra rows), since every Listen already belongs to a User. SQLite doesn't support it; combine a LEFT JOIN with UNION and a RIGHT JOIN instead.
Multiple JOINs
-- Get user + song details for high-rated listens (rating > 4.0)
SELECT u.name, s.title, s.artist, l.rating
FROM Users u
INNER JOIN Listens l ON u.user_id = l.user_id
INNER JOIN Songs s ON l.song_id = s.song_id
WHERE l.rating > 4.0
ORDER BY u.name, l.rating DESC;
| name | title | artist | rating | Annotations | |
|---|---|---|---|---|---|
| Daffy | Willow | Taylor Swift | 4.9 | U3 → L8 → S2 | |
| Mickey | Evermore | Taylor Swift | 4.5 | U1 → L1 → S1 | |
| Mickey | Willow | Taylor Swift | 4.2 | U1 → L2 → S2 | |
| Minnie | Willow | Taylor Swift | 4.7 | U2 → L4 → S2 | |
| Minnie | Yellow Sub | The Beatles | 4.6 | U2 → L5 → S7 |
-
Connect Users, Listens, and Songs.
-
Apply WHERE after all JOINs.
-
Each JOIN is a stepping stone.
-
Ideal for complex data relationships.
JOIN ON: Beyond Simple Equality
The ON clause is more flexible than you think. It accepts any logical condition, not just =.
Key Point: Use AND/OR to combine conditions. The ON clause is evaluated for each potential row pair.
Key Rules
-
Join Condition: The ON clause defines the relationship (e.g.,
Users.user_id = Listens.user_id). -
Join Order: Multiple JOINs are processed left-to-right, building intermediate results.
-
NULL Handling: Unmatched rows in outer joins are filled with NULLs.
-
Cartesian Product: A missing ON clause results in every possible combination, which is usually a mistake.
-
Column Ambiguity: Table aliases clarify which table a column belongs to.
-
Result Ordering: Without ORDER BY, rows return in an undefined order. Use ORDER BY for consistent results. (As we'll see later in Section 3, ordering incurs a performance cost due to sorting.)
Common Patterns
-
One-to-Many: One user can have many listens (Users → Listens).
-
Many-to-Many: Users listen to songs via a junction table (Users → Listens → Songs).
-
Optional Relationships: LEFT JOIN keeps all customers, even those without orders.
-
Chaining: Connect 3+ tables by joining results with the next table.
Common Mistakes
⚠️ Cartesian Product Warning: Missing ON clause multiplies rows!
4 users × 9 listens = 36 rows (see example below)
Forgetting JOIN conditions
Problem: Missing ON clause creates a cartesian product.
Wrong JOIN type for requirement
Problem: INNER JOIN excludes non-matching rows.
Ambiguous column names
Problem: Both tables have a user_id column.