SELECT-FROM-WHERE: The SQL Foundation

Concept. SELECT lists the columns you want back; FROM names the table; WHERE keeps only the rows whose predicate is true.

Intuition. Start with the Listens table (FROM), drop the rows whose predicate is false (WHERE), keep only the columns you named (SELECT). The clauses execute in that order: FROM → WHERE → SELECT.

Example. Use WHERE to trim the table to just the rows you need (e.g. 9 rows down to 5). Then SELECT the columns you care about (e.g. from all columns to just 3).

Basic Query Structure: FloW...S (like water)

-- Get listens with high ratings (above 4.0)
SELECT user_id, song_id, rating    -- Step last: Choose only 3 columns for output
FROM Listens                       -- Step 1: Picks table with 9 rows
WHERE rating > 4.0                 -- Step 2: Filters 9 rows; keeps subset of rows

Execution Order: FROM → WHERE → SELECT (FWS order, not SFW. FloWS like water)
Row Filtering: WHERE evaluates each row individually
Column Selection: SELECT determines output columns

Debug Table (row by row)

listen_id	user_id	song_id	rating	Action	Show in Result
1	1	1	4.5	✓ TRUE, Keep	✓
2	1	2	4.2	✓ TRUE, Keep	✓
3	1	6	3.9	✗ FALSE, Remove	✗
4	2	2	4.7	✓ TRUE, Keep	✓
5	2	7	4.6	✓ TRUE, Keep	✓
6	2	8	3.9	✗ FALSE, Remove	✗
7	3	1	2.9	✗ FALSE, Remove	✗
8	3	2	4.9	✓ TRUE, Keep	✓
9	3	6	NULL	? UNKNOWN, Remove	✗

Output

user_id	song_id	rating
1	1	4.5
1	2	4.2
2	2	4.7
2	7	4.6
3	2	4.9

Why SQL FloWS Like This?

SQL is declarative. But to start, think of it like Python:

def high_ratings(Listens):
    output = []
    for row in Listens:              # FROM Listens
        if row.rating > 4.0:         # WHERE rating > 4.0  
            output += [row.user_id,  # SELECT user_id,
                       row.song_id,  #        song_id,
                       row.rating]   #        rating
    return output

Declarative (SQL): "Get me listens with rating > 4.0" → Database figures out where/how

Procedural (Python): "Here's my list data struct from this file on this machine. Loop through each row, check condition, build output. Parallelize with these step1, step2, step3." → You spell out each step

Key insight: SELECT is like the final mapping - it comes last in execution but defines what you return!

SELECT * vs Specific Columns

-- Get all columns for user 1's listening history
SELECT * 
FROM Listens 
WHERE user_id = 1

listen_id	user_id	song_id	rating	listen_time
1	1	1	4.5	2024-08-30 14:35:00
2	1	2	4.2	NULL
3	1	6	3.9	2024-08-29 10:15:00

-- Get only user_id and rating for user 1
SELECT user_id, rating 
FROM Listens 
WHERE user_id = 1

	user_id	rating
	1	4.5
	1	4.2
	1	3.9

WHERE with Multiple Conditions

-- Get user 1's high-rated songs (rating > 4.0)
SELECT user_id, song_id, rating
FROM Listens
WHERE user_id = 1 AND rating > 4.0

listen_id	user_id	rating	AND Result	Action
1	1	4.5	✓ TRUE	Keep
2	1	4.2	✓ TRUE	Keep
3	1	3.9	✗ FALSE	Remove
4	2	4.7	✗ FALSE	Remove
5-9	...	...	✗ FALSE	Remove

-- Get all user 1's listens OR any listen rated > 4.5
SELECT user_id, song_id, rating
FROM Listens
WHERE user_id = 1 OR rating > 4.5

listen_id	user_id	rating	Cond 1: user_id=1	Cond 2: rating>4.5	OR Result	Action
1	1	4.5	✓ TRUE	✗ FALSE	✓ TRUE	Keep
2	1	4.2	✓ TRUE	✗ FALSE	✓ TRUE	Keep
3	1	3.9	✓ TRUE	✗ FALSE	✓ TRUE	Keep
4	2	4.7	✗ FALSE	✓ TRUE	✓ TRUE	Keep
5	2	4.6	✗ FALSE	✓ TRUE	✓ TRUE	Keep
6	2	3.9	✗ FALSE	✗ FALSE	✗ FALSE	Remove
7	3	2.9	✗ FALSE	✗ FALSE	✗ FALSE	Remove
8	3	4.9	✗ FALSE	✓ TRUE	✓ TRUE	Keep
9	3	NULL	✗ FALSE	? UNKNOWN	? UNKNOWN	Remove

SELECT with Calculations

-- Calculate double rating and categorize by quality
SELECT 
    user_id,
    rating,
    rating * 2 AS double_rating,  -- AS means renaming columns
    CASE   -- CASE/WHEN: Creates conditional logic - like if-then-else in SQL
        WHEN rating >= 4.5 THEN 'Excellent'
        WHEN rating >= 4.0 THEN 'Good'
        ELSE 'Average'
    END AS category
FROM Listens
WHERE rating IS NOT NULL

user_id	rating	double_rating	Category	Annotations
1	4.5	9.0	Excellent	4.5 >= 4.5? Yes
1	4.2	8.4	Good	4.2 >= 4.5? No 4.2 >= 4.0? Yes
1	3.9	7.8	Average	3.9 >= 4.5? No 3.9 >= 4.0? No
2	4.7	9.4	Excellent	4.7 >= 4.5? Yes
2	4.6	9.2	Excellent	4.6 >= 4.5? Yes
2	3.9	7.8	Average	3.9 >= 4.5? No 3.9 >= 4.0? No
3	2.9	5.8	Average	2.9 >= 4.5? No 2.9 >= 4.0? No
3	4.9	9.8	Excellent	4.9 >= 4.5? Yes

Common Patterns

Pattern 1: Filter and Transform

-- Round ratings between 4.0 and 5.0
SELECT 
    user_id,
    ROUND(rating, 0) AS rounded_rating
FROM Listens
WHERE rating BETWEEN 4.0 AND 5.0

Step	Operation	Effect
1	FROM Listens	Load 9 rows
2	WHERE BETWEEN 4.0 AND 5.0	Filter to 5 rows
3	SELECT with ROUND	Transform rating values

Understanding Aliases (AS Keyword)

What is an Alias?

An alias creates a new name for a column in your output using the AS keyword:

-- FAILS: Alias 'double_rating' doesn't exist in WHERE
SELECT rating * 2 AS double_rating
FROM Listens
WHERE double_rating > 8  -- ERROR!

-- WORKS: Use original expression in WHERE
SELECT rating * 2 AS double_rating
FROM Listens
WHERE rating * 2 > 8

Why It Matters:

Clause	Execution Order	Can Reference	Cannot Reference
FROM	1st	Table names	SELECT aliases
WHERE	2nd	Table columns	SELECT aliases
SELECT	3rd	Table columns, expressions	Future aliases