Colab Practice: Problem Solving with SQL
Step 0: Play with PostgreSQL in Colab
Before you dive into the specific practice problems below, we recommend setting up your own interactive environment.
Open PostgreSQL Colab Playground
Use this notebook to:
-
Set up PostgreSQL: Run the shell commands to install and start a real Postgres instance in the cloud.
-
Explore the Schema: Visualize the
Users,Songs, andListenstables. -
Run Scratchpad Queries: Test your intuition by reading+writing ad-hoc SQL before you attempt the formal exercises.
Practice 1: Writing Debug Tables
Drawing a Debug Table is the single best way to catch logic errors before they hit production. It forces you to think like the database: FROM → JOIN → WHERE → GROUP BY → HAVING → SELECT.
Tip: Check your work: All answers and interactive logic traces are available in the Practice Solutions Colab.
1.1: Filtering Strategies (JOIN vs IN vs EXISTS)
Trace these four queries. Do they all return the same set of genres? Which one is the most efficient for checking existence in a large table?
-- Query 1: INNER JOIN
SELECT DISTINCT s.genre FROM Songs s
INNER JOIN Listens l ON s.song_id = l.song_id;
-- Query 2: INNER JOIN (No Distinct - what happens?)
SELECT s.genre FROM Songs s
INNER JOIN Listens l ON s.song_id = l.song_id;
-- Query 3: IN Clause
SELECT DISTINCT s.genre FROM Songs s
WHERE s.song_id IN (SELECT song_id FROM Listens);
-- Query 4: EXISTS Clause
SELECT DISTINCT s.genre FROM Songs s
WHERE EXISTS (SELECT 1 FROM Listens l WHERE l.song_id = s.song_id);
1.2: Set Operations (UNION vs. EXCEPT)
Mickey (User 1) has both history (Listens) and recommendations. Trace these queries to see how his listening habits overlap with our AI suggestions.
-- Query 5: Mickey's Listening History
SELECT s.genre FROM Songs s INNER JOIN Listens l ON s.song_id = l.song_id WHERE l.user_id = 1;
-- Query 6: Mickey's Recommendations
SELECT s.genre FROM Songs s INNER JOIN Recommendations r ON s.song_id = r.song_id WHERE r.user_id = 1;
-- Query 7: UNION (History OR Recommendations)
SELECT s.genre FROM Songs s INNER JOIN Listens l ON s.song_id = l.song_id WHERE l.user_id = 1
UNION
SELECT s.genre FROM Songs s INNER JOIN Recommendations r ON s.song_id = r.song_id WHERE r.user_id = 1;
-- Query 8: EXCEPT (History BUT NOT recommended)
SELECT s.genre FROM Songs s INNER JOIN Listens l ON s.song_id = l.song_id WHERE l.user_id = 1
EXCEPT
SELECT s.genre FROM Songs s INNER JOIN Recommendations r ON s.song_id = r.song_id WHERE r.user_id = 1;
1.3: The LEFT JOIN
Trace this query for User 1. Does it return more or fewer rows than a regular INNER JOIN? Why?
-- Query 9: All genres, matched against Mickey's listens
SELECT DISTINCT s.genre
FROM Songs s
LEFT JOIN Listens l ON s.song_id = l.song_id AND l.user_id = 1;
The Practice Dataset
For the following advanced challenges, ensure your manual traces account for these new records.
Added Users
| user_id | name | |
|---|---|---|
| 11 | Bob | bob@example.com |
| 12 | Alice | alice@example.com |
| 13 | Eve | eve@example.com |
| 14 | Charlie | charlie@example.com |
| 15 | David | david@example.com |
Added Songs
| song_id | title | artist | genre |
|---|---|---|---|
| 11 | Shake it off | Taylor Swift | Pop |
| 12 | Karma | Taylor Swift | Pop |
| 13 | Anti-hero | Taylor Swift | Rock |
| 14 | Enchanted | Taylor Swift | Pop |
| 15 | Bad blood | Taylor Swift | Rock |
| 16 | Let it be | Beatles | Classic |
| 17 | Here comes the sun | Beatles | Classic |
| 18 | Twist and shout | Beatles | Classic |
Practice 2: Building Intuition with Window Functions
Goal: Let's play with a few queries that "feel" related, so we gain some intuition on how they are different and produce different result sets.
2.1: The Base Count (qry_count_songs)
Find the total number of listens for each song, grouped by genre, song ID, and title. This is your baseline for all ranking queries.
2.2: Ranking by Genre (qry_top_songs_genre)
Find the top-k ranked songs per genre based on the number of listens (try k = 1, 2, and 3).
- How does the "competition" change when you rank within a specific genre vs. the whole dataset?
2.3: Ranking by Artist (qry_top_songs_artist)
Find the top-k ranked songs per artist based on the number of listens.
- If an artist has multiple songs, which one stays on top?
2.4: Finding the Global Peak (qry_max_listens)
Find the single song with the absolute maximum number of listens across the entire platform.
2.5: The Artist Champion (qry_max_listens_artist)
Find the song with the maximum number of listens for each artist.
- This is essentially a "Top-1 per Artist" query. Compare this result to 2.3.
2.6: Multi-Dimensional Ranking (qry_top_songs_genre_artist)
Find the top-k ranked songs per (genre, artist) based on the number of listens.
- Note how the partitions become much smaller. What happens to the rank values?
Challenge: Identical Results?
Run these two queries in the Colab. They produce the exact same result set.
-
Why does Rank=1 per Genre produce the same data as Rank=1 per (Genre, Artist) in our current dataset?
-
Are these queries equivalent?
-- Query A: Rank 1 per Genre
runSql('Query top_songs_genre', qry_top_songs_genre.format(k=1))
-- Query B: Rank 1 per (Genre, Artist)
runSql('Query top_songs_genre_artist', qry_top_songs_genre_artist.format(k=1))
- What data condition is required to make "Top Genre Song" different from "Top Song for that Artist in that Genre"?
Ready to see the results?
All queries, answers, and extended datasets are ready for you in the official Colab.
Open Practice Solutions Colab