Window Functions: Calculations Without Collapsing

Core Concept

Window Functions are GROUP BY's more powerful sibling, designed for those who want to keep all their data intact while still performing complex calculations. Imagine ranking songs per user without losing any rows. Using RANK() OVER PARTITION BY user_id, you keep all nine rows and simply add ranks 1 through 3.


GROUP BY vs PARTITION BY: The Key Difference

GROUP BY: Collapses Rows

GROUP BY is the hammer that turns your detailed data into summarized chunks. It’s efficient, but it also means you lose granularity. Here’s what happens:

user_idavg_ratingNotes
14.2AVG(4.5, 4.2, 3.9) = 12.6/3
24.4AVG(4.7, 4.6, 3.9) = 13.2/3
33.9AVG(2.9, 4.9, NULL) = 7.8/2 (NULL ignored)

Result: 9 rows → 3 rows

PARTITION BY: Keep All Rows

PARTITION BY is the scalpel that lets you slice your data into partitions while keeping every detail intact. You get the best of both worlds: detailed data with the power of aggregation.

user_idsong_idratinguser_avgNotes
114.54.2Partition 1: AVG(4.5, 4.2, 3.9)
124.24.2Partition 1: AVG(4.5, 4.2, 3.9)
163.94.2Partition 1: AVG(4.5, 4.2, 3.9)
224.74.4Partition 2: AVG(4.7, 4.6, 3.9)
274.64.4Partition 2: AVG(4.7, 4.6, 3.9)
283.94.4Partition 2: AVG(4.7, 4.6, 3.9)
312.93.9Partition 3: AVG(2.9, 4.9, NULL) = 3.9
324.93.9Partition 3: AVG(2.9, 4.9, NULL) = 3.9
36NULL3.9Partition 3: AVG ignores NULL

Result: 9 rows → 9 rows (all kept)


Ranking Functions: ROW_NUMBER, RANK, DENSE_RANK

Window functions let you rank data without collapsing it. Here's how it works:

Step 1: Partition by user_id

user_idsong_idratingNotes
114.5Partition 1
124.2Partition 1
163.9Partition 1
224.7Partition 2
274.6Partition 2
283.9Partition 2
312.9Partition 3
324.9Partition 3
36NULLPartition 3

Step 2: Order by rating DESC within each partition

user_idsong_idratingNotes
114.5Highest in partition 1
124.2Second in partition 1
163.9Third in partition 1
224.7Highest in partition 2
274.6Second in partition 2
283.9Third in partition 2
324.9Highest in partition 3
312.9Second in partition 3
36NULLNULL sorts last with DESC

Step 3: Apply ranking functions

user_idsong_idratingrow_numrankdense_rankNotes
114.5111Highest
124.2222Second
163.9333Third
224.7111Highest
274.6222Second
283.9333Third
324.9111Highest
312.9222Second
36NULL333NULL gets rank 3

Key Differences:


Key Rules

  1. No Row Reduction: Unlike GROUP BY, all rows survive.

  2. PARTITION BY: Defines groups (optional - omit for whole table).

  3. ORDER BY: Sets sequence within partitions (required for some functions).

  4. NULL Handling: NULLs group together in PARTITION BY, sort first/last in ORDER BY.


Common Patterns

Common Mistakes

The Missing PARTITION BY Bug

Without PARTITION BY, you get a global ranking, which might not be what you intended.

Wrong ORDER BY Direction

ASC vs DESC changes the ranking meaning, so double-check your order.

Forgetting NULLs in ORDER BY

NULL sorting varies by database (PostgreSQL: NULLs first, MySQL: NULLs last), which can lead to unexpected results.