KEMBAR78
Analytic Functions in Oracle | PDF | Business | Sql
0% found this document useful (0 votes)
733 views14 pages

Analytic Functions in Oracle

Analytic functions allow computations across sets of rows that are grouped using a window or partition. They return multiple rows per group. The window determines the range of rows used in calculations for each row. Analytic functions are computed after joins, where clauses, groups and having clauses. They are commonly used to calculate cumulative, moving and centered aggregates. Examples provided demonstrate using analytic functions like rank, dense_rank, row_number, lead, lag, first_value and last_value.

Uploaded by

Vasant Hello
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
733 views14 pages

Analytic Functions in Oracle

Analytic functions allow computations across sets of rows that are grouped using a window or partition. They return multiple rows per group. The window determines the range of rows used in calculations for each row. Analytic functions are computed after joins, where clauses, groups and having clauses. They are commonly used to calculate cumulative, moving and centered aggregates. Examples provided demonstrate using analytic functions like rank, dense_rank, row_number, lead, lag, first_value and last_value.

Uploaded by

Vasant Hello
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 14

Analytic Functions in Oracle

By,
Ravi.G
Analytic functions compute an aggregate value based on a group of rows.

They differ from aggregate functions in that they return multiple rows for each group.

The group of rows is called a window and is defined by the analytic clause. For each row, a sliding window of rows is
defined. The window determines the range of rows used to perform the calculations for the current row. Window sizes can be
based on either a physical number of rows or a logical interval such as time.

Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and
all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed. Therefore,
analytic functions can appear only in the select list or ORDER BY clause.

Analytic functions are commonly used to compute cumulative, moving, centered, and reporting aggregates.

 Data will stay in the database

 No extra time spent on data extraction

 No need of flat files

 Better data security since data stay in the database

 Similar syntax to traditional SQL functions

 Easy to use

 Efficient: removes a lot of procedural code and complex or inefficient queries that would have taken a long tome to
develop, to achieve the same result
Difference between Analytic and SQL Functions
Difference 1: For SQL functions, non-”group by” column is not allowed in the select clause
• The following query will not work:
select dept,name, count(1) from emp where dept in ('128','130') group by dept order by dept
• Analytic function allows the non-"group by" column to present in select clause
select dept,name, count(1) over(partition by dept) empNum from emp where dept in ('128','130') order
by dept
• Analytic functions calculate the aggregation without grouping rows

Difference 2: Analytic function can only appear in the SELECT clause and in the main ORDER BY clause of a query,
because analytic functions are computed after all JOIN, WHERE, GROUP BY and HAVING are computed on the
query

-- correct query
select dept,count(1) from emp where dept in (166,168) group by dept having count(1)>20

-- wrong query
select dept,count(1) over() from emp where dept in (166,168) group by dept having count(1) over()>20
List of Analytic-Functions

AVG, CORR, COVAR_POP, COVAR_SAMP, COUNT, CUME_DIST, DENSE_RANK, FIRST, FIRST_VALUE,


LAG, LAST, LAST_VALUE, LEAD, MAX, MIN, NTILE, PERCENT_RANK, PERCENTILE_CONT,
PERCENTILE_DISC, RANK, RATIO_TO_REPORT, STDDEV, STDDEV_POP, STDDEV_SAMP, SUM,
VAR_POP, VAR_SAMP, VARIANCE.CUBE,ROLLUP,LEG,LEAD,ROW_NUMBER etc..
Syntax

Function(arg1,..., argn) OVER ( [PARTITION BY <...>] [ORDER BY <....>] [<window_clause>] )

• Argument: 0 -3 arguments
• Query-partition-clause: break result set into groups
• Order-by-clause: specifies how the data is sorted within each group
• Windowing-clause: define a sliding or anchored window of data within a group

 Partition clause breaks a result set into groups


 If no PARTITION inside the OVER( ) portion, the analytic function acts on the entire result set returned by the where clause.

-- the following query does not partition the result set


select dept,count(*) over () total_emp from emp where dept between '100' and
'300' order by dept;

-- partition the result set based on dept

select dept,count(*) over (partition by dept) from emp where dept between '100' and '300' order by dept;
1. SELECT *FROM EMP
2. SELECT DEPTNO,COUNT(*) FROM EMP GROUP BY DEPTNO

3. SELECT DEPTNO,COUNT(*) OVER (PARTITION BY DEPTNO)


DEPT_COUNT FROM EMP

Finding the Oracle build-in functions

select distinct object_name from all_arguments where package_name = 'STANDARD' and object_name like
'CUBE';
Analytic functions for descriptive statistics include the following : Maximum, Minimum, Average, Median, Count
The functions SUM, COUNT, AVG, MIN, MAX are the common analytic functions the result of which does not
depend on the order of the records.
SUM() Function : MAX() Function :
SELECT DEPTNO,SUM(SAL) OVER (PARTITION BY SELECT DEPTNO,MAX(SAL) OVER (PARTITION BY DEPTNO)
DEPTNO) DEPT_SUM FROM EMP DEPT_MAX FROM EMP

AVG() Function : MIN() Function :


SELECT DEPTNO,AVG(SAL) OVER (PARTITION BY
SELECT DEPTNO,MIN(SAL) OVER (PARTITION BY DEPTNO)
DEPTNO) DEPT_AVG FROM EMP
DEPT_MIN FROM EMP
Row_Number()Function:
ROW_NUMBER( ) gives a running serial number to a partition of records. It is very useful in reporting, especially in
places where different partitions have their own serial numbers.

Note :Order by clause is must define for ROW_NUMBER() function.


SELECT DEPTNO,EMPNO,ROW_NUMBER() OVER (PARTITION BY DEPTNO ORDER BY SAL) CUST_ROW_NUM
FROM EMP

Top-N Query using Analytic Function


SELECT deptno,EMPNO,SAL FROM (SELECT
SELECT deptno,EMPNO,SAL FROM (SELECT deptno,EMPNO,sal,ROW_NUMBER() OVER (PARTITION BY DEPTNO ORDE
deptno,EMPNO,sal,ROW_NUMBER() OVER (PARTITION BY DEPTNO BY SAL DESC ) CUS_NUM FROM EMP where deptno in (10,20)) WHERE
ORDER BY SAL DESC ) CUS_NUM FROM EMP) WHERE CUS_NUM=2 CUS_NUM=2
RANK and DENSE RANK Functions
RANK and DENSE_RANK both provide rank to the records based on some column value or expression. In case of a tie of 2 records at
position N, RANK declares 2 positions N and skips position N+1 and gives position N+2 to the next record. While DENSE_RANK
declares 2 positions N but does not skip position N+1.

SELECT EMPNO,DEPTNO,SAL,RANK() OVER (PARTITION BY DEPTNO ORDER BY SAL DESC) RANK,DENSE_RANK() OVER
(PARTITION BY DEPTNO ORDER BY SAL DESC) DENSE_RANK FROM EMP
LEAD and LAG
LEAD has the ability to compute an expression on the next rows (rows which are going to come after the current row)
Functions
and return the value to the current row. The general syntax of LEAD is shown below:

LEAD (<sql_expr>, <offset>, <default>) OVER (<analytic_clause>)


<sql_expr> is the expression to compute from the leading row.
<offset> is the index of the leading row relative to the current row.
<offset> is a positive integer with default 1.
<default> is the value to return if the <offset> points to a row outside the partition range.
The syntax of LAG is similar except that the offset for LAG goes into the previous rows.

SELECT DEPTNO, EMPNO, SAL,


LEAD(SAL, 1, 0) OVER (PARTITION BY DEPTNO ORDER BY SAL DESC NULLS LAST) NEXT_LOWER_SAL,
LAG(SAL, 1, 0) OVER (PARTITION BY DEPTNO ORDER BY SAL DESC NULLS LAST) PREV_HIGHER_SAL
FROM EMP
WHERE DEPTNO IN (10, 20)
ORDER BY DEPTNO, SAL DESC;
FIRST VALUE and LAST VALUE functions
The general syntax is:
FIRST_VALUE(<sql_expr>) OVER (<analytic_clause>)
The FIRST_VALUE analytic function picks the first record from the partition after doing the ORDER BY. The
<sql_expr> is computed on the columns of this first record and results are returned. The LAST_VALUE function is
used in similar context except that it acts on the last record of the partition.
-- How many days after the first hire of each department
were the next -- employees hired?

SELECT EMPNO, DEPTNO, HIREDATE - FIRST_VALUE(HIREDATE)


OVER (PARTITION BY DEPTNO ORDER BY HIREDATE) DAY_GAP
FROM EMP
WHERE DEPTNO IN (20, 30)
ORDER BY DEPTNO, DAY_GAP;
FIRST and LAST functions
The FIRST function (or more properly KEEP FIRST function) is used in a very special situation. Suppose we rank a group of record and found several
records in the first rank. Now we want to apply an aggregate function on the records of the first rank. KEEP FIRST enables that.

The general syntax is:

Function( ) KEEP (DENSE_RANK FIRST ORDER BY <expr>) OVER (<partitioning_clause>)

-- How each employee's salary compare with the average salary of the first -- year
hires of their department?

SELECT empno, deptno, TO_CHAR(hiredate,'YYYY') HIRE_YR, sal,


TRUNC(
AVG(sal) KEEP (DENSE_RANK FIRST
ORDER BY TO_CHAR(hiredate,'YYYY') )
OVER (PARTITION BY deptno)
) AVG_SAL_YR1_HIRE
FROM emp
WHERE deptno IN (20, 10)
ORDER BY deptno, empno, HIRE_YR;
Cube function Rollup function
select deptno,job,count(*) from emp group by select deptno,job,count(*) from emp group by
cube(deptno,job) rollup(deptno,job)
Order of computation and performance tips

Defining the PARTITOIN BY and ORDER BY clauses on indexed columns (ordered in accordance with the
PARTITION CLAUSE and then the ORDER BY clause in analytic function) will provide optimum performance.

Even in absence of indexes analytic functions provide acceptable performance but need to do sorting for computing
partition and order by clause.

If the query contains multiple analytic functions, sorting and partitioning on two different columns should be avoided
if they are both not indexed.

C:\Documents and
Settings\ravi.g\Desktop

Queries
THANKS

You might also like