Query Equivalence

SQL is a language of options. Most problems can be solved in multiple ways. Let's put these approaches side-by-side and see what shakes out.

Important: Query Equivalence vs Output Equivalence

Query Equivalence

Same logic, different syntax

These queries are semantically identical—they'll produce the same result for ANY valid input data.

WHERE NOT EXISTS (...)
LEFT JOIN + WHERE col IS NULL
→ Always same result!

Output Equivalence

Same result, different logic

Queries produce the same result for THIS data, but not for OTHER data.

WHERE rating > 4.5
WHERE rating >= 4.6
→ Same until we get a new row when someone rates 4.55!
This page shows query equivalence: Different syntactic approaches that are semantically identical.

Example 1: Users Above Their Personal Average

Problem: Find users who have at least one rating above their personal average

Correlated Subquery

CTE + JOIN

Window Function

All produce: Mickey, Minnie, Daffy


Example 2: Users with NO High Ratings (> 4.5)

Problem: Find users who have no ratings greater than 4.5

NOT EXISTS

LEFT JOIN + NULL

NOT IN

All produce: Mickey, Pluto

⚠️ Warning about NOT IN: If the subquery returns ANY NULL values, NOT IN returns no results! Always add "AND column IS NOT NULL" to the subquery. See the NULL in NOT IN Death Trap →

Example 3: Top 3 Songs per User

Problem: Find each user's top 3 highest-rated songs

Window Function (Modern)

Correlated Subquery (Classical)


Example 4: Users Who Listen to Multiple Genres

Problem: Find users who have listened to songs from at least 2 different genres

GROUP BY + HAVING

CTE for Clarity

All produce: Mickey, Minnie, Daffy


When to Use Each Approach?

Approach Best For Avoid When
Correlated Subquery • Conceptual clarity
• Row-specific logic
• Small datasets
• Large datasets
• Complex aggregations
CTEs • Multi-step logic
• Reusable calculations
• Debugging complex queries
• Team readability
• Simple one-off queries
• Older SQL versions (pre-2005)
Window Functions • Rankings (top-N per group)
• Running totals
• Peer comparisons
• Modern SQL environments
• Simple filters
• When debugging needed
• Older SQL versions
LEFT JOIN + NULL • Finding non-matches
• Performance critical
• Complex exclusion logic
• When readability important
• Team unfamiliar with pattern
GROUP BY + HAVING • Simple aggregations
• Clear logic
• Direct approach
• Need intermediate results
• Complex multi-step logic

Remember: Readability of code beats cleverness. Query optimizers optimize for performance.