SQL Learning Outcomes
Where the Industry is Heading: Your Brains + Bots
🧠 + 🤖 Use AI Copilots
- Engineers ship 10x faster with GPT/Claude
- But companies don't need expensive copy-pasters
- Need engineers who question + leverage what LLMs generate to ship correct logic faster
🧠 Semantic Interviews
50% of data interview questions on SQL logic
- Filter away copy-pasters with whiteboard sessions: write logic, no AI, pure reasoning
- "Debug this query" - right or wrong? Expensive or cheap? How to fix?
☁️ Cloud Scale Reality
- BigQuery, Snowflake, Redshift everywhere
- Massively parallel and Distributed execution across 1000s of machines
- Cost optimization through smart design
What You'll Do in CS145
🧠 Use Your Brain
- Write SQL logic (on paper or whiteboard) without an IDE
- Use Debug Tables to trace semantics
Prep for Whiteboard Interviews: Ace Test 1 on paper (50% SQL)
Example Anecdote 1:
- Project 1: NULL values problem. Had some mismatches because of COUNT(rating), COUNT(*), and count(user_id).
- Traced on paper: 50 NULL ratings. COUNT(column) skips NULLs, COUNT(*) counts all rows.
🤖 Experiment with LLMs
- Explore subqueries, window functions, analysis
- Project 1: Experiment what works; Learn when AI misleads
Prep for Interviews: "What's your AI workflow for data pipelines?"
Example Anecdote 2:
- Project 2: Claude built my 3-level CTE in seconds. But level 2 referenced level 3 (doesn't exist yet). Also its LEFT JOIN + COUNT(*) counted NULLs.
- Fixed: reordered CTEs, used COUNT(t2.id).
- Saved 30 mins writing, took 10 mins debugging.
☁️ Cloud Databases
- Query GBs of data on BigQuery
- Understand parallel execution
- Optimize production queries for cost & speed
Prep for Interviews: Create data portfolio on BigQuery+Colab
Example Anecdote 3:
- Project 1: Query kept timing out, cost $10.50 per run.
- Checked execution plan: scanning 2.1TB before filtering. Moved date filter before JOIN - partition pruning kicked in.
- Now 73GB, $0.37, runs in 12 seconds. Here's my colab on [topic X].
Project 1: Practice All Three Skills
- Build your BigQuery portfolio while mastering brain + AI, and cloud skills
- Real datasets • Production queries • Interview-ready artifacts
SQL Concepts You'll Master
1️⃣ Query Fundamentals
- SELECT-FROM-WHERE logic
- Multi-table JOINs and relationships
- GROUP BY aggregation semantics
- NULL handling mysteries
2️⃣ Intermediate Patterns
- Subqueries with variable scoping
- CTEs for step-by-step queries
- Window functions for analytics
- RANK vs ROW_NUMBER differences
3️⃣ Production Skills
- Query debugging without execution
- Parallel execution patterns
- Cost optimization strategies
- BigQuery at scale
Why these topics?
Exactly what you'll use in summer internships, technical interviews, and your full-time roles.