Find Nth maximum value in SQL Server
By Guest Authors on 11 April 2004 | 13 Comments | Tags: SELECT
This aritlce is written by Hariharan Velayuthan. He writes "There are several methods to find out the
Nth maximum/minimum value using SQL. This article discusses on such method to find Nth
maximum value from a desired table. This article is aimed at users who are in the beginner or
intermediate level in SQL Server."
[Note: This article assumes that the reader is familiar with the T-SQL, joins and queries]
I have taken utmost effort to make this article easy to understand, but incase you are not clear with
the concept, please raise up your concern and Ill be more than happy to attend your doubts. All the
examples discussed in this article uses Employee table. If you do not have this table, please use the
following script to create it.
Use Pubs
Go
Create table Employee
(
Eid int,
Name varchar(10),
Salary money
)
Go
Insert into Employee values (1,'harry',3500)
Insert into Employee values (2,'jack',2500)
Insert into Employee values (3,'john',2500)
Insert into Employee values (4,'xavier',5500)
Insert into Employee values (5,'steven',7500)
Insert into Employee values (6,'susana',2400)
Go
A simple query that can find the employee with the maximum salary, would be:
Select * from Employee where salary = (Select max(Salary) from Employee)
How does this query work?
The SQL Engine evaluates the inner most query and then moves to the next level (outer query). So,
in the above example inner query i.e. Select max(Salary) from Employee is evaluated first. This
query will return a value of 7500 (based on the sample data shown as above). This value is
substituted in the outer query and it is evaluated as:
Select * from Employee where salary = (7500)
Returns:
Eid
Name
Salary
steven
7500
If the same syntax is applied to find out the 2nd or 3rd or 4th level of salary, the query would
become bit complex to understand. See the example below:
Select * from Employee where salary =
(Select max(Salary) from Employee where salary
< (Select max(Salary) from Employee where
Salary < (Select max(Salary) from Employee where
Salary < N
The above query would go on and on, depending on the level of salary that is to be determined. As
mentioned earlier, the SQL Engine evaluates the inner most query first and moves the next outer
level. One wouldnt want to write such a big query just to find out this simple information.
The same result can be achieved with a simple syntax and easily understandable logic, by using a
CORRELATED SUBQUERY. This article doesnt explain about correlated sub-query as it is out of scope
of this article. (You may want to take a quick look on CORRELATED SUBQUERY.) As a "Rule of
Thumb" keep these points in mind, when you use a correlated sub-query
1. Correlated sub-query is a performance overhead to the database server and so, you have to
use it only if it is required
2. Avoid using Correlated subquery on large tables, as the inner query is evaluated for each
row of the outer query
Having said that, lets look at the query that captures the Nth maximum value:
Select * From Employee E1 Where
(N-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where
E2.Salary > E1.Salary)
(Where N is the level of Salary to be determined)
In the above example, the inner query uses a value of the outer query in its filter condition
meaning; the inner query cannot be evaluated before evaluating the outer query. So each row in the
outer query is evaluated first and the inner query is run for that row. Lets look into the background
process of this query, by substituting a value for N i.e. 4,(Idea is to find the 4th maximum salary):
Select * From Employee E1 Where
(4-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where
E2.Salary > E1.Salary)
Since the outer querys value is referred in the inner query, the operation is done row-by-row. Based
on the sample data as shown above, the process starts with the following record:
Employee E1
---------------------------------Eid
Name
Salary
harry
3500
The salary of this record is substituted in the inner query and evaluated as:
Select Count(Distinct(E2.Salary)) From Employee E2
Where E2.Salary > 3500
Above query returns 2 (as there are only 2 salaries greater than 3500). This value is substituted in
the outer query and will be evaluated as:
Select * From Employee E1 Where (4-1) = (2)
The "where" condition evaluates to FALSE and so, this record is NOT fetched in the result.
Next the SQL Engine processes the 2nd record which is:
Employee E1
---------------------------------Eid
Name
Salary
jack
2500
Now the inner query is evaluated as:
Select Count(Distinct(E2.Salary)) From Employee E2
Where E2.Salary > 2500
This query returns a value of 3 (as there are 3 salaries greater than 2500). The value is substituted
in the outer query and evaluated as:
Select * From Employee E1 Where (4-1) = (3)
The "where" condition evaluates to TRUE and so, this record IS fetched in the result. This operation
continues for all the remaining records. Finally the result shows these 2 records:
Eid
Name
Salary
jack
2500
john
2500
The above query works in the same manner in Oracle and Sybase as well. Applying the same logic,
to find out the first maximum salary the query would be:
Select * From Employee E1 Where
(1-1) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where
E2.Salary > E1.Salary)
If you are able to understand this functionality, you can workout various other queries in the same
manner. The bottom line is, the query should be efficient and NOT resource hungry.
create database Practice
Create table Employee
(
Eid int,
Name varchar(10),
Salary money
)
Go
Insert
Insert
Insert
Insert
Insert
Insert
Go
into
into
into
into
into
into
Employee
Employee
Employee
Employee
Employee
Employee
values
values
values
values
values
values
(1,'harry',3500)
(2,'jack',2500)
(3,'john',2500)
(4,'xavier',5500)
(5,'steven',7500)
(6,'susana',2400)
select * from Employee
select * from Employee where
select * from Employee where
where
where
where
Salary=(select max(Salary) from Employee)
Salary=(select max(Salary) from Employee
salary < (select max(Salary) from Employee
Salary < (select max(Salary) from Employee
Salary < (select max(Salary) from Employee))))
select * from Employee where Salary=(select max(Salary) from Employee
where salary < (select max(Salary) from
Employee))
Correlated subqueries
In the subqueries previously discussed, SQL evaluates the subquery once, substitutes the result of
the subquery in the search condition, and evaluates the outer-level SELECT based on the value
of the search condition. You can also write a subquery that SQL may need to re-evaluate as it
examines each new row (WHERE clause) or group of rows (HAVING clause) in the outer-level
SELECT. This is called a correlated subquery.
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?
topic=/sqlp/rbafycorrs.htm
Reference Link
http://www.sqlteam.com/article/find-nth-maximum-value-in-sql-server
http://mssqlserver.wordpress.com/2006/11/14/how-do-you-find-thesecond-highest-salary/
http://hemantg.blogspot.com/2007/06/retrieving-nth-salary-in-sqlserver.html