WINDOW FUNCTION IN SQL
Ranking Functions
ROW_NUMBER()
RANK()
DENSE_RANK()
Distribution Functions
PERCENT_RANK()
CUME_DIST()
NTILE(n)
Aggregate Functions
SUM()
Rupal Agarwal
AVG():
COUNT():
MIN():
MAX()
Offset Functions
LAG()
LEAD()
FIRST_VALUE()
LAST_VALUE()
WINDOW FUNCTIONS
A window function performs a calculation across a set of table rows that are somehow related
to the current row. This is comparable to the type of calculation that can be done with an
aggregate function. But unlike regular aggregate functions, use of a window function does not
cause rows to become grouped into a single output row — the rows retain their separate
identities.
Let's consider the example of a sales table and show both the input data and the output results
for various window function queries.
1. ROW_NUMBER()
Assigns a unique sequential integer to rows within a partition of a result set, starting with 1 for
the first row in each partition.
SELECT salesperson_id, order_date, revenue,
ROW_NUMBER() OVER (PARTITION BY salesperson_id ORDER BY revenue DESC) AS row_num
FROM sales;
👉Follow Rupal Agarwal for more insightful content!
2. RANK()
Assigns a rank to each row within a partition of a result set. The rank of a row is one plus the
number of ranks that come before it.
SELECT salesperson_id, order_date, revenue,
RANK() OVER (PARTITION BY salesperson_id ORDER BY revenue DESC) AS rank
FROM sales;
3. DENSE_RANK()
Similar to RANK(), but does not skip ranks if there are ties.
SELECT salesperson_id, order_date, revenue,
DENSE_RANK() OVER (PARTITION BY salesperson_id ORDER BY revenue DESC) AS dense_rank
FROM sales;
👉Follow Rupal Agarwal for more insightful content!
4. NTILE()
Divides the rows in an ordered partition into a specified number of approximately equal groups, and
assigns a number to each group.
SELECT salesperson_id,order_date,revenue,
NTILE(3) OVER (ORDER BY revenue DESC) AS ntile
FROM sales;
5. LAG()
Provides access to a row at a given physical offset that comes before the current row.
SELECT salesperson_id, order_date, revenue,
LAG(revenue, 1) OVER (PARTITION BY salesperson_id ORDER BY order_date) AS prev_revenue
FROM sales;
👉Follow Rupal Agarwal for more insightful content!
6. LEAD()
Provides access to a row at a given physical offset that comes after the current row.
SELECT salesperson_id, order_date, revenue,
LEAD(revenue, 1) OVER (PARTITION BY salesperson_id ORDER BY order_date) AS next_revenue
FROM sales;
7. FIRST_VALUE()
Returns the first value in an ordered set of values.
SELECT salesperson_id, order_date, revenue,
FIRST_VALUE(revenue) OVER (PARTITION BY salesperson_id ORDER BY revenue DESC) AS
first_revenue
FROM sales;
👉Follow Rupal Agarwal for more insightful content!
8. LAST_VALUE()
Returns the last value in an ordered set of values.
SELECT salesperson_id, order_date, revenue,
LAST_VALUE(revenue) OVER (PARTITION BY salesperson_id ORDER BY revenue DESC ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_revenue
FROM sales;
9. SUM(), AVG(), MIN(), MAX() as Window Functions
You can also use aggregate functions as window functions to calculate running totals, moving averages,
etc.
SELECT salesperson_id, order_date, revenue,
SUM(revenue) OVER (Partition by salesperson_id) AS total,
AVG(revenue) OVER (Partition by salesperson_id) AS average,
Max(revenue) over(Partition by salesperson_id) AS maximum,
Min(revenue) over(Partition by salesperson_id) AS minimum
from sales;
👉Follow Rupal Agarwal for more insightful content!
Please note: The following SQL query demonstrates the use of different window functions to calculate
various sums:
SELECT salesperson_id, order_date, revenue,
sum(revenue) over (partition by salesperson_id) as sum_partition_by_id,
sum(revenue) over() as total_revenue,
sum(revenue) over(partition by salesperson_id order by revenue desc) as sum_partition_and_orderby
from sales;
SUM(revenue) OVER (PARTITION BY salesperson_id) AS sum_partition_by_id: Calculates the sum of
revenue for each salesperson_id, resulting in the same total for all rows within each partition
(salesperson).
👉Follow Rupal Agarwal for more insightful content!
SUM(revenue) OVER () AS total_revenue: Calculates the total sum of revenue for all rows in the
table, returning the same total for every row.
SUM(revenue) OVER (PARTITION BY salesperson_id ORDER BY revenue DESC) AS
sum_partition_and_orderby: Calculates a running total (cumulative sum) of revenue for each
salesperson_id, ordered by revenue in descending order within each partition.
10. Percent_rank
It calculates the relative rank of a row within a group of rows. This rank is expressed as a percentage
between 0 and 1, representing the position of the row in relation to other rows in the result set.
/* formula=current_row-1/Total_no_of_rows-1 */
Lets Consider the sales table:
SELECT salesperson_id, order_date,revenue,
PERCENT_RANK() OVER (PARTITION BY salesperson_id ORDER BY revenue) AS percent_rank,
round(percent_rank() over(partition by salesperson_id order by revenue)::numeric*100,2) as
percentage
from sales
👉Follow Rupal Agarwal for more insightful content!
11. Cume_dist (Cumulative Distribution)
It calculates the cumulative distribution of a value in a set of values. It returns the proportion of rows
that have a value less than or equal to the current row's value, relative to the total number of rows. The
result is a value between 0 and 1.
/* formula = current_row number/total_no_of_rows */
Lets Consider the same sales table:
👉Follow Rupal Agarwal for more insightful content!
SELECT salesperson_id, order_date,revenue,
cume_dist() OVER (PARTITION BY salesperson_id ORDER BY revenue) AS cumulative_dist,
round(cume_dist() over(partition by salesperson_id order by revenue)::numeric*100,2) as
cumulative_dist_per
from sales
👉Follow Rupal Agarwal for more insightful content!