Writing Debug Tables: Your Logic Verifier

Concept. Debug tables — also called handtraces — validate a query's logic by walking the input rows through each clause and writing down what survives at each step. They expose semantic errors that syntax-checking tools miss.

Intuition. When an LLM claims a query against Listens returns 4 users but a quick trace shows 1, build a debug table. List the 9 Listens rows, then for each clause (FROM, WHERE, GROUP BY, HAVING) cross out the rows that drop. The surviving rows are the real answer.

Example. An LLM claims a query returns 4 users; a manual trace through the 9-row Listens table shows just 1 — the debug table exposes the semantic error.

Why This Matters

LLMs and Semantics

Whiteboard Interviews and Tests

  • Filter out expensive LLM cut-and-pasters
  • "Is this query correct?" - common question
  • Logic = what gets you hired
  • See Learning Outcomes

Actually Faster

  • Avoid reprompting loop: 30 minutes
  • Drawing one debug table: 5 minutes
  • Build intuition

Example 1: Users with Multiple Genres

Step-by-Step Debug Table:

Step 1: FROM + JOIN (All listen records with song info)

user_idnamesong_idgenrerating
1Mickey1Pop4.5
1Mickey2Pop4.2
1Mickey6Classic3.9
2Minnie2Pop4.7
2Minnie7Classic4.6
2Minnie8Classic3.9
3Daffy1Pop2.9
3Daffy2Pop4.9
3Daffy6ClassicNULL

Step 2: GROUP BY user_id, genre (Create groups)

user_idgenreNotes: songs_in_groupNotes
1Pop2 rows✓ Keep (≥2)
1Classic1 row✗ Filter out
2Pop1 row✗ Filter out
2Classic2 rows✓ Keep (≥2)
3Pop2 rows✓ Keep (≥2)
3Classic1 row✗ Filter out

Step 3: HAVING COUNT(*) >= 2 (Final output)

user_idgenresong_count
1Pop2
2Classic2
3Pop2

Shortcuts for Big Data

Nobody Traces 100,000 Rows!

Instead of This:

Use This Shorthand:

Simple Notation Tricks

Scenario Full Trace Smart Shorthand
1000 rows after JOIN Show all 1000 "First 3... middle ties... last 2"
GROUP BY with counts List every row "Rock: ~500, Pop: ~300"
Window function ranks Show all ranks "Top 3, any ties, bottom 2"
Complex aggregation Calculate exactly "AVG ≈ 4.2" (close enough!)

Quick Verification Checklist

  • JOINs: Expected row count? (not accidentally cartesian? INNER? LEFT? OUTER?)
  • NULLs: Handled correctly? (IS NULL vs = NULL?)
  • Groups: Right granularity? (user vs user+genre?)
  • Aggregates: Makes sense? (COUNT(*) vs COUNT(column)?)
  • Windows: Partitions correct? (RANK vs ROW_NUMBER?)

Remember: It's YOUR Tool

  • Tool for correctness, not busy work - skip obvious parts, focus on confusing bits
  • Like multiplication tables in school - use them often early to build intuition
  • You'll naturally start skipping easy steps as your SQL brain develops
  • Use simple conventions that interviewers and graders can follow (for partial credit!)
  • Approximate liberally - "~500 rows" is perfectly fine

For Interviews, PSETs, Tests

❌ Cryptic Hieroglyphs

𓀀 → 𓂋 𓊖 = 2 𓆣
𓁶 → 𓂋 𓄿 = 1 ✗
𓅱 → 𓂋 𓊖 = 2 𓆣
∴ 𓆣 = {𓀀𓊖, 𓅱𓊖}

"Trust me, 𓀀 means Mickey and 𓆣 means keep..."

✓ Clear Convention

User 1, Pop: 2 songs → Keep ✓
User 2, Pop: 1 song → Filter ✗  
User 3, Pop: 2 songs → Keep ✓
Final: {(1,Pop,2), (3,Pop,2)}

Interviewer: "I can follow this!" (+partial credit)