Writing Debug Tables: Your Logic Verifier

Core Concept

Debug tables verify your SQL logic - essential for logic debugging and whiteboard interviews.

Two Futures: Debug Tables Save the Day Without Debug Tables 🤖 LLM Query "Syntax looks right!" Ship it! 🚀 💥 Production Disaster ❌ Returns 4 users instead of 1 ❌ Wrong business logic ❌ Angry customers 🔥 3am debugging session With Debug Tables 🤖 Same LLM Query Debug Table Trace Mickey → GROUP ✓ Keep Minnie → GROUP ✗ Filter Daffy → GROUP ✗ Filter Pluto → NULL ✗ Filter 💡 "Wait, this returns 4 not 1!" Fix the logic BEFORE deploy ✅ Correct Query Deployed Happy customers, peaceful nights

Why This Matters

🤖 LLMs and Semantics

  • Syntax: 95%+ accurate
  • Semantics: 16-82% accurate per BIRD benchmarks
  • Same question → 2 different queries → one returns 4 users, other returns 1
  • See LLM Debug

📝 Whiteboard Interviews and Tests

  • Filter out expensive LLM cut-and-pasters
  • "Debug this query" - common question
  • Semantics = what gets you hired
  • See Learning Outcomes

⚡ Actually Faster

  • Trial-and-error debug loop: 30 minutes
  • Drawing one debug table: 5 minutes
  • Have confidence in code: Instant
  • Building intuition: Priceless

See It In Action

Example: Users with Multiple Genres

-- Find users who listen to 2+ songs per genre
SELECT user_id, genre, COUNT(*) AS song_count
FROM Listens l
JOIN Songs s ON l.song_id = s.song_id  
WHERE genre IS NOT NULL
GROUP BY user_id, genre
HAVING COUNT(*) >= 2

Step-by-Step Debug Table:

Step 1: FROM + JOIN (All listen records with song info)

user_idnamesong_idgenrerating
1Mickey1Pop4.5
1Mickey2Pop4.2
1Mickey6Classic3.9
2Minnie2Pop4.7
2Minnie7Classic4.6
2Minnie8Classic3.9
3Daffy1Pop2.9
3Daffy2Pop4.9
3Daffy6ClassicNULL

Step 2: GROUP BY user_id, genre (Create groups)

user_idgenreNotes: songs_in_groupNotes
1Pop2 rows✓ Keep (≥2)
1Classic1 row✗ Filter out
2Pop1 row✗ Filter out
2Classic2 rows✓ Keep (≥2)
3Pop2 rows✓ Keep (≥2)
3Classic1 row✗ Filter out

Step 3: HAVING COUNT(*) >= 2 (Final output)

user_idgenresong_count
1Pop2
2Classic2
3Pop2

Shortcuts for Big Data

Nobody Traces 100,000 Rows!

Instead of This:

user_id | genre | rating
1       | Rock  | 4.5
1       | Rock  | 4.2
1       | Rock  | 3.9
... (497 more Rock rows)
1       | Pop   | 4.8
1       | Pop   | 4.1
... (300 more Pop rows)

Use This Shorthand:

GROUP BY Summary:
User 1, Rock: ~500 rows, AVG ≈ 4.2
User 1, Pop: ~300 rows, AVG ≈ 4.5
User 2, Rock: ~200 rows, AVG ≈ 3.8
...

Edge cases checked:
- MAX(rating) = 5.0 ✓
- MIN(rating) = 1.0 ✓  
- NULL ratings: 42 rows ✓

Simple Notation Tricks

Scenario Full Trace Smart Shorthand
1000 rows after JOIN Show all 1000 "First 3... middle ties... last 2"
GROUP BY with counts List every row "Rock: ~500, Pop: ~300"
Window function ranks Show all ranks "Top 3, any ties, bottom 2"
Complex aggregation Calculate exactly "AVG ≈ 4.2" (close enough!)

What Actually Matters

✓ Quick Verification Checklist

✓ JOINs: Expected row count? (not accidentally cartesian? INNER? LEFT? OUTER?)
✓ NULLs: Handled correctly? (IS NULL vs = NULL?)
✓ Groups: Right granularity? (user vs user+genre?)
✓ Aggregates: Makes sense? (COUNT(*) vs COUNT(column)?)
✓ Windows: Partitions correct? (RANK vs ROW_NUMBER?)

Remember: It's YOUR Tool

🎯 The Bottom Line

  • Tool for correctness, not busy work - skip obvious parts, focus on confusing bits
  • Like multiplication tables in school - use them often early to build intuition
  • You'll naturally start skipping easy steps as your SQL brain develops
  • Use simple conventions that interviewers and graders can follow (for partial credit!)
  • Approximate liberally - "~500 rows" is perfectly fine

Debug tables caught that JOIN bug? Great! Drew them with crayons? Sure. Hieroglyphs only you can understand? Let's reconsider...

For Interviews, PSETs, Tests

❌ Cryptic Hieroglyphs

𓀀 → 𓂋 𓊖 = 2 𓆣
𓁶 → 𓂋 𓄿 = 1 ✗
𓅱 → 𓂋 𓊖 = 2 𓆣
∴ 𓆣 = {𓀀𓊖, 𓅱𓊖}

"Trust me, 𓀀 means Mickey and 𓆣 means keep..."

✓ Clear Convention

User 1, Pop: 2 songs → Keep ✓
User 2, Pop: 1 song → Filter ✗  
User 3, Pop: 2 songs → Keep ✓
Final: {(1,Pop,2), (3,Pop,2)}

Interviewer: "I can follow this!" (+partial credit)