Oracle Analytic Functions
Learning Series
Session 1
Introduction
Enhance the result set
Simple solution
"Reporting" Or "Windowing" Functions”
Unfamiliarity , cryptic syntax & uncertainty
Reinvent
General Syntax
Function(arg1,..., argn)
OVER ( [PARTITION BY <...>]
[ORDER BY <....>]
[<window_clause>]
)
SELECT empno, deptno, hiredate,
ROW_NUMBER( ) OVER
(
PARTITION BY deptno
ORDER BY hiredate NULLS LAST
) SRLNO
FROM emp
WHERE deptno IN (10, 20)
ORDER BY deptno, SRLNO;
How are analytic functions different from group or aggregate functions?
DEPTNO DEPT_COUNT
---------------------- ----------------------
20 5
30 6
EMPNO DEPTNO DEPT_COUNT
---------- ---------- ----------
7369 20 5
7566 20 5
7788 20 5
7902 20 5
7876 20 5
7499 30 6
7900 30 6
7844 30 6
7698 30 6
7654 30 6
7521 30 6
SELECT empno, deptno,
COUNT(*) OVER ( PARTITION BY
deptno ) DEPT_COUNT
FROM emp
WHERE deptno IN (20, 30);
How to break the result set in groups or partitions?
Ans: PARTITION BY
Different analytic functions
SUM, COUNT, AVG, MIN, MAX
LEAD, LAG, RANK, DENSE_RANK, ROW_NUMBER, FIRST, FIRST VALUE, LAST, LAST
VALUE
How to specify the order of the records in the partition?
"ORDER BY" clause inside the OVER( ) clause
Different from the ORDER BY clause of the main query which comes
after WHERE
ORDER BY <sql_expr> [ASC or DESC] NULLS [FIRST or LAST]
OVER (
PARTITION BY deptno
ORDER BY hiredate
NULLS LAST
)
Today We are Discussing..
ROW_NUMBER, RANK and DENSE_RANK
LEAD and LAG
FIRST VALUE and LAST VALUE function
FIRST and LAST function
ROW_NUMBER( )
SELECT empno, deptno, hiredate,
ROW_NUMBER( ) OVER (PARTITION BY
deptno ORDER BY hiredate
NULLS LAST) SRLNO
FROM emp
WHERE deptno IN (10, 20)
ORDER BY deptno, SRLNO;
EMPNO DEPTNO HIREDATE SRLNO
------ ------ --------- ----------
7782 10 09-JUN-81 1
7839 10 17-NOV-81 2
7934 10 23-JAN-82 3
7369 20 17-DEC-80 1
7566 20 02-APR-81 2
7902 20 03-DEC-81 3
7788 20 09-DEC-82 4
7876 20 12-JAN-83 5
Eg: where different partitions have their own serial numbers
RANK and DENSE_RANK
SELECT empno, deptno, sal,
RANK()
OVER (PARTITION BY deptno
ORDER BY sal DESC NULLS LAST ) RANK,
DENSE_RANK()
OVER (PARTITION BY deptno
ORDER BY sal DESC NULLS LAST ) DENSE_RANK
FROM emp
WHERE deptno IN (10, 20)
ORDER BY 2, RANK;
EMPNO DEPTNO SAL RANK DENSE_RANK
------ ------- ----- ----- ----------
7839 10 5000 1 1
7782 10 2450 2 2
7934 10 1300 3 3
7788 20 3000 1 1
7902 20 3000 1 1
7566 20 2975 3 2
7876 20 1100 4 3
7369 20 800 5 4
Eg: Joint toppers.. RANK skips the next value that is 2
LEAD and LAG
expression on the next rows to current row
LEAD (<sql_expr>, <offset>, <default>) OVER (<analytic_clause>)
<sql_expr> is the expression to compute from the leading row.
<offset> is the index of the leading row relative to the current row, is a positive integer with default 1.
<default> is the value to return if the <offset> points to a row outside the partition range.
SELECT deptno, empno, sal,
LEAD(sal, 1, 0)
OVER (PARTITION BY dept
ORDER BY sal DESC NULLS LAST) NEXT_LOWER_SAL,
LAG(sal, 1, 0)
OVER (PARTITION BY dept
ORDER BY sal DESC NULLS LAST) PREV_HIGHER_SAL
FROM emp
WHERE deptno IN (10, 20)
ORDER BY deptno, sal DESC;
LEAD and LAG(cont…)
DEPTNO EMPNO SAL NEXT_LOWER_SAL PREV_HIGHER_SAL
------- ------ ----- -------------- ---------------
10 7839 5000 2450 0
10 7782 2450 1300 5000
10 7934 1300 0 2450
20 7788 3000 3000 0
20 7902 3000 2975 3000
20 7566 2975 1100 3000
20 7876 1100 800 2975
20 7369 800 0 1100
FIRST VALUE and LAST VALUE function
FIRST_VALUE(<sql_expr>) OVER (<analytic_clause>)
Picks the first record from the partition after doing the ORDER BY
How many days after the first hire of each department were the next employees hired?
SELECT empno, deptno,
hiredate - FIRST_VALUE(hiredate)
OVER ( PARTITION BY deptno
ORDER BY hiredate ) DAY_GAP
FROM emp
WHERE deptno IN (20, 30)
ORDER BY deptno, DAY_GAP;
EMPNO DEPTNO DAY_GAP
---------- ---------- ----------
7369 20 0
7566 20 106
7902 20 351
7788 20 722
7876 20 756
7499 30 0
7521 30 2
7698 30 70
7844 30 200
7654 30 220
7900 30 286
FIRST and LAST function
Function( ) KEEP (DENSE_RANK FIRST ORDER BY <expr>) OVER <partitioning_clause>)
Deviation in syntax
do not have the ORDER BY inside the OVER clause
always DENSE_RANK
How each employee's salary compare with the average salary of the first year hires of their department?
SELECT empno, deptno, TO_CHAR(hiredate,'YYYY') HIRE_YR, sal,
TRUNC(
AVG(sal) KEEP (DENSE_RANK FIRST
ORDER BY TO_CHAR(hiredate,'YYYY'))
OVER (PARTITION BY deptno)
) AVG_SAL_YR1_HIRE
FROM emp
WHERE deptno IN (20, 10)
ORDER BY deptno, empno, HIRE_YR;
EMPNO DEPTNO HIRE SAL AVG_SAL_YR1_HIRE
--------- --------- ------ ------- --------
7782 10 1981 2450 3725
7839 10 1981 5000 3725
7934 10 1982 1300 3725
7369 20 1980 800 800
7566 20 1981 2975 800
7788 20 1982 3000 800
7876 20 1983 1100 800
7902 20 1981 3000 800
Summary
ROW_NUMBER, RANK and DENSE_RANK
Function() OVER ( [PARTITION BY <...>] [ORDER BY <....>]
LEAD and LAG
LEAD (<sql_expr>, <offset>, <default>) OVER (<analytic_clause>)
FIRST VALUE and LAST VALUE function
FIRST_VALUE(<sql_expr>) OVER (<analytic_clause>)
FIRST and LAST function
Function( )
KEEP (DENSE_RANK FIRST ORDER BY <expr>)
OVER <partitioning_clause>)
Speaking of Advantages
Can be achieved using sub-queries and joins.
But:
Rewrite on every new requirement
Testing & Debug
Tune & Optimise(difference)
Faster(design-a)
Thank you!