Window Functions
15 Functions
Window functions perform a calculation across a set of table rows that are somehow related to the current row. Unlike regular aggregate functions, window functions do not group rows into a single output row; instead, they retain the identity of each row while adding an extra column of calculated data.
Table of Contents
ROW_NUMBER()
Assigns a unique, sequential integer to rows within a partition, starting at 1.
ROW_NUMBER() OVER (PARTITION BY col ORDER BY col)Creating a simple numbered list of users based on their registration date.
SELECT username, ROW_NUMBER() OVER (ORDER BY joined_at) FROM users;This is the best function for pagination or finding the "first" or "latest" record in a specific group.
RANK()
Assigns a rank to each row within a partition. If values are tied, they get the same rank, and the next rank is skipped.
RANK() OVER (ORDER BY col DESC)Ranking users in the Leaderboard. If two users have 1000 XP, both are rank 1, and the next person is rank 3.
SELECT username, xp, RANK() OVER (ORDER BY xp DESC) FROM users;Use RANK() when you want to reflect "gaps" in the standings due to ties.
DENSE_RANK()
Similar to RANK(), but it does not skip ranks after a tie.
DENSE_RANK() OVER (ORDER BY col DESC)Ranking users in a league where you want ranks to be continuous (1, 1, 2, 3...).
SELECT username, xp, DENSE_RANK() OVER (ORDER BY xp DESC) FROM users;If you want to know how many "unique" score levels exist above a user, DENSE_RANK() is your tool.
PERCENT_RANK()
Calculates the relative rank of a row as a percentage (from 0 to 1).
PERCENT_RANK() OVER (ORDER BY col)Determining which percentile a student's test score falls into.
SELECT score, PERCENT_RANK() OVER (ORDER BY score) FROM tests;The formula used is (rank - 1) / (total_rows - 1). Itβs great for creating "Top 10%" badges.
CUME_DIST()
Calculates the cumulative distribution of a value within a set (probability).
CUME_DIST() OVER (ORDER BY col)Finding the proportion of users who have a score less than or equal to the current user.
SELECT score, CUME_DIST() OVER (ORDER BY score) FROM tests;Calculates the relative position of a value in a group of values.
NTILE(n)
Divides rows into n roughly equal groups (buckets) and assigns a bucket number.
NTILE(groups) OVER (ORDER BY col)Dividing the user base into four "Quartiles" (e.g., Bronze, Silver, Gold, Platinum players).
SELECT username, NTILE(4) OVER (ORDER BY xp DESC) AS quartile FROM users;If the total number of rows isn't perfectly divisible by n, the first few buckets will have one extra row.
LAG()
Accesses data from a previous row in the same result set.
LAG(column, offset, default) OVER (...)Comparing a user's current XP gain with their XP gain from the previous day.
SELECT day, xp, LAG(xp) OVER (ORDER BY day) AS prev_day_xp FROM user_stats;This is essential for calculating growth rates or trends over time.
LEAD()
Accesses data from a subsequent (following) row in the same result set.
LEAD(column, offset, default) OVER (...)Predicting the "next level" target by looking ahead at the next row in a levels table.
SELECT level, xp, LEAD(xp) OVER (ORDER BY level) AS next_level_xp FROM levels;LEAD() is just the forward-looking version of LAG().
FIRST_VALUE()
Returns the value from the first row in the window frame.
FIRST_VALUE(col) OVER (...)Showing the name of the highest-scoring player in every row of a leaderboard report.
SELECT username, score, FIRST_VALUE(username) OVER (ORDER BY score DESC) AS champion FROM users;The ORDER BY clause determines which row is considered 'first'.
LAST_VALUE()
Returns the value from the last row in the window frame.
LAST_VALUE(col) OVER (...)Comparing current row to the very last row in the partition.
SELECT username, score, LAST_VALUE(score) OVER (ORDER BY score RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lowest_score FROM users;Be careful with the default window frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW). You usually need to add ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to get the true last value of the set.
NTH_VALUE()
Returns the value of the n-th row in the window frame.
NTH_VALUE(col, n) OVER (...)Finding the "Runner Up" (second place) for every department in a company report.
SELECT dept, NTH_VALUE(employee, 2) OVER (PARTITION BY dept ORDER BY salary DESC) FROM employees;If the n-th row doesn't exist in the window frame, it returns NULL.
SUM() OVER
Calculates a running total or a partition-wide sum.
SUM(col) OVER (ORDER BY col)Calculating a running total of XP earned by a user over many quests.
SELECT quest_date, xp, SUM(xp) OVER (ORDER BY quest_date) AS running_total FROM user_quests;Adding ORDER BY creates a running total; omitting it sums the whole partition.
AVG() OVER
Calculates a moving average or a partition-wide average.
AVG(col) OVER (ORDER BY col ROWS BETWEEN ...)Calculating a "7-day moving average" of user activity.
SELECT date, active_users, AVG(active_users) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) FROM daily_stats;Great for smoothing out volatile data trends to see the bigger picture.
COUNT() OVER
Counts rows within a window without collapsing them.
COUNT(*) OVER (PARTITION BY col)Showing how many people are in the same League as the current user.
SELECT username, league, COUNT(*) OVER (PARTITION BY league) AS league_size FROM users;Can be used to calculate percentages (e.g., value / COUNT(*) OVER()).
MIN() / MAX() OVER
Finds the minimum or maximum value in a window.
MIN(col) OVER (PARTITION BY col)Comparing a user's current score against the all-time high score in their category.
SELECT username, score, MAX(score) OVER () AS global_max FROM users;Useful for finding outliers or comparing individual performance against the best/worst in the group.