SQL Learning Outcomes

Where the Industry is Heading: Your Brains + Bots

🧠 + 🤖 Use AI Copilots

  • Engineers ship 10x faster with GPT/Claude
  • But companies don't need expensive copy-pasters
  • Need engineers who question + leverage what LLMs generate to ship correct logic faster

🧠 Semantic Interviews

  • 50% of data interview questions on SQL logic

  • Filter away copy-pasters with whiteboard sessions: write logic, no AI, pure reasoning
  • "Debug this query" - right or wrong? Expensive or cheap? How to fix?

☁️ Cloud Scale Reality

  • BigQuery, Snowflake, Redshift everywhere
  • Massively parallel and Distributed execution across 1000s of machines
  • Cost optimization through smart design

What You'll Do in CS145

🧠 Use Your Brain

  • Write SQL logic (on paper or whiteboard) without an IDE
  • Use Debug Tables to trace semantics

Prep for Whiteboard Interviews: Ace Test 1 on paper (50% SQL)

Example Anecdote 1:

  • Project 1: NULL values problem. Had some mismatches because of COUNT(rating), COUNT(*), and count(user_id).
  • Traced on paper: 50 NULL ratings. COUNT(column) skips NULLs, COUNT(*) counts all rows.

🤖 Experiment with LLMs

  • Explore subqueries, window functions, analysis
  • Project 1: Experiment what works; Learn when AI misleads

Prep for Interviews: "What's your AI workflow for data pipelines?"

Example Anecdote 2:

  • Project 2: Claude built my 3-level CTE in seconds. But level 2 referenced level 3 (doesn't exist yet). Also its LEFT JOIN + COUNT(*) counted NULLs.
  • Fixed: reordered CTEs, used COUNT(t2.id).
  • Saved 30 mins writing, took 10 mins debugging.

☁️ Cloud Databases

  • Query GBs of data on BigQuery
  • Understand parallel execution
  • Optimize production queries for cost & speed

Prep for Interviews: Create data portfolio on BigQuery+Colab

Example Anecdote 3:

  • Project 1: Query kept timing out, cost $10.50 per run.
  • Checked execution plan: scanning 2.1TB before filtering. Moved date filter before JOIN - partition pruning kicked in.
  • Now 73GB, $0.37, runs in 12 seconds. Here's my colab on [topic X].

Project 1: Practice All Three Skills

SQL Concepts You'll Master

1️⃣ Query Fundamentals

  • SELECT-FROM-WHERE logic
  • Multi-table JOINs and relationships
  • GROUP BY aggregation semantics
  • NULL handling mysteries

2️⃣ Intermediate Patterns

  • Subqueries with variable scoping
  • CTEs for step-by-step queries
  • Window functions for analytics
  • RANK vs ROW_NUMBER differences

3️⃣ Production Skills

  • Query debugging without execution
  • Parallel execution patterns
  • Cost optimization strategies
  • BigQuery at scale

Why these topics?

Exactly what you'll use in summer internships, technical interviews, and your full-time roles.