Window Functions

15 Functions

Window functions perform a calculation across a set of table rows that are somehow related to the current row. Unlike regular aggregate functions, window functions do not group rows into a single output row; instead, they retain the identity of each row while adding an extra column of calculated data.

1.

ROW_NUMBER()

Description

Assigns a unique, sequential integer to rows within a partition, starting at 1.

Syntax
ROW_NUMBER() OVER (PARTITION BY col ORDER BY col)
When to Use

Creating a simple numbered list of users based on their registration date.

Example Query
SELECT username, ROW_NUMBER() OVER (ORDER BY joined_at) FROM users;
πŸ’‘ Pro Tip

This is the best function for pagination or finding the "first" or "latest" record in a specific group.

2.

RANK()

Description

Assigns a rank to each row within a partition. If values are tied, they get the same rank, and the next rank is skipped.

Syntax
RANK() OVER (ORDER BY col DESC)
When to Use

Ranking users in the Leaderboard. If two users have 1000 XP, both are rank 1, and the next person is rank 3.

Example Query
SELECT username, xp, RANK() OVER (ORDER BY xp DESC) FROM users;
πŸ’‘ Pro Tip

Use RANK() when you want to reflect "gaps" in the standings due to ties.

3.

DENSE_RANK()

Description

Similar to RANK(), but it does not skip ranks after a tie.

Syntax
DENSE_RANK() OVER (ORDER BY col DESC)
When to Use

Ranking users in a league where you want ranks to be continuous (1, 1, 2, 3...).

Example Query
SELECT username, xp, DENSE_RANK() OVER (ORDER BY xp DESC) FROM users;
πŸ’‘ Pro Tip

If you want to know how many "unique" score levels exist above a user, DENSE_RANK() is your tool.

4.

PERCENT_RANK()

Description

Calculates the relative rank of a row as a percentage (from 0 to 1).

Syntax
PERCENT_RANK() OVER (ORDER BY col)
When to Use

Determining which percentile a student's test score falls into.

Example Query
SELECT score, PERCENT_RANK() OVER (ORDER BY score) FROM tests;
πŸ’‘ Pro Tip

The formula used is (rank - 1) / (total_rows - 1). It’s great for creating "Top 10%" badges.

5.

CUME_DIST()

Description

Calculates the cumulative distribution of a value within a set (probability).

Syntax
CUME_DIST() OVER (ORDER BY col)
When to Use

Finding the proportion of users who have a score less than or equal to the current user.

Example Query
SELECT score, CUME_DIST() OVER (ORDER BY score) FROM tests;
πŸ’‘ Pro Tip

Calculates the relative position of a value in a group of values.

6.

NTILE(n)

Description

Divides rows into n roughly equal groups (buckets) and assigns a bucket number.

Syntax
NTILE(groups) OVER (ORDER BY col)
When to Use

Dividing the user base into four "Quartiles" (e.g., Bronze, Silver, Gold, Platinum players).

Example Query
SELECT username, NTILE(4) OVER (ORDER BY xp DESC) AS quartile FROM users;
πŸ’‘ Pro Tip

If the total number of rows isn't perfectly divisible by n, the first few buckets will have one extra row.

7.

LAG()

Description

Accesses data from a previous row in the same result set.

Syntax
LAG(column, offset, default) OVER (...)
When to Use

Comparing a user's current XP gain with their XP gain from the previous day.

Example Query
SELECT day, xp, LAG(xp) OVER (ORDER BY day) AS prev_day_xp FROM user_stats;
πŸ’‘ Pro Tip

This is essential for calculating growth rates or trends over time.

8.

LEAD()

Description

Accesses data from a subsequent (following) row in the same result set.

Syntax
LEAD(column, offset, default) OVER (...)
When to Use

Predicting the "next level" target by looking ahead at the next row in a levels table.

Example Query
SELECT level, xp, LEAD(xp) OVER (ORDER BY level) AS next_level_xp FROM levels;
πŸ’‘ Pro Tip

LEAD() is just the forward-looking version of LAG().

9.

FIRST_VALUE()

Description

Returns the value from the first row in the window frame.

Syntax
FIRST_VALUE(col) OVER (...)
When to Use

Showing the name of the highest-scoring player in every row of a leaderboard report.

Example Query
SELECT username, score, FIRST_VALUE(username) OVER (ORDER BY score DESC) AS champion FROM users;
πŸ’‘ Pro Tip

The ORDER BY clause determines which row is considered 'first'.

10.

LAST_VALUE()

Description

Returns the value from the last row in the window frame.

Syntax
LAST_VALUE(col) OVER (...)
When to Use

Comparing current row to the very last row in the partition.

Example Query
SELECT username, score, LAST_VALUE(score) OVER (ORDER BY score RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lowest_score FROM users;
πŸ’‘ Pro Tip

Be careful with the default window frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW). You usually need to add ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to get the true last value of the set.

11.

NTH_VALUE()

Description

Returns the value of the n-th row in the window frame.

Syntax
NTH_VALUE(col, n) OVER (...)
When to Use

Finding the "Runner Up" (second place) for every department in a company report.

Example Query
SELECT dept, NTH_VALUE(employee, 2) OVER (PARTITION BY dept ORDER BY salary DESC) FROM employees;
πŸ’‘ Pro Tip

If the n-th row doesn't exist in the window frame, it returns NULL.

12.

SUM() OVER

Description

Calculates a running total or a partition-wide sum.

Syntax
SUM(col) OVER (ORDER BY col)
When to Use

Calculating a running total of XP earned by a user over many quests.

Example Query
SELECT quest_date, xp, SUM(xp) OVER (ORDER BY quest_date) AS running_total FROM user_quests;
πŸ’‘ Pro Tip

Adding ORDER BY creates a running total; omitting it sums the whole partition.

13.

AVG() OVER

Description

Calculates a moving average or a partition-wide average.

Syntax
AVG(col) OVER (ORDER BY col ROWS BETWEEN ...)
When to Use

Calculating a "7-day moving average" of user activity.

Example Query
SELECT date, active_users, AVG(active_users) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) FROM daily_stats;
πŸ’‘ Pro Tip

Great for smoothing out volatile data trends to see the bigger picture.

14.

COUNT() OVER

Description

Counts rows within a window without collapsing them.

Syntax
COUNT(*) OVER (PARTITION BY col)
When to Use

Showing how many people are in the same League as the current user.

Example Query
SELECT username, league, COUNT(*) OVER (PARTITION BY league) AS league_size FROM users;
πŸ’‘ Pro Tip

Can be used to calculate percentages (e.g., value / COUNT(*) OVER()).

15.

MIN() / MAX() OVER

Description

Finds the minimum or maximum value in a window.

Syntax
MIN(col) OVER (PARTITION BY col)
When to Use

Comparing a user's current score against the all-time high score in their category.

Example Query
SELECT username, score, MAX(score) OVER () AS global_max FROM users;
πŸ’‘ Pro Tip

Useful for finding outliers or comparing individual performance against the best/worst in the group.