SQL – Part I
SQL
• Structured Query Language is a standard relational database language.
• Originally developed by IBM and later became official standard- ANSI-SQL.
• SQL language can be split into:
DML: Querying and manipulating data
DDL: Creation and deletion of schemas
Triggers: Actions executed based on some conditions
Embedded SQL
Security and Transaction management
3
SQL Queries
• Queries fetch the information from the database tables based on the
condition specified in the SQL.
• We shall work with the relations given below
Student (sid: integer, sname : string, age: integer)
Course (cid: integer, cname: string, duration: integer)
Enrollment (sid: integer, cid: integer, edate: date)
4
Student
Sample Data
SID SNAME AGE
----------------------------------------
1 Mary Molle 20
2 John July 21
3 Lilly Lolly 20
4 Bill Will 22
Enrollment
SID CID EDATE
------------------------------
Course 1 2 10-JUL-06
CID CNAME DURATION 2 2 10-JUL-06
---------------------------------------- 3 1 01-SEP-06
1 Java 1.5 52 4 1 01-APR-06
2 RDBMS 30 1 1 01-SEP-06
3 DS 35
5
BASICS
6
Syntax
• Syntax of SELECT clause
• SELECT field1 [,field2,…,fieldn]
• FROM table1 [,table2,…, tablem]
• [WHERE conditions]
• [GROUP BY field1 [,field2,…,fieldk]]
• [HAVING condition]
• [ORDER BY field1 [,field2,…,fieldl]]
• SQL is not case sensitive. By convention the keywords are in
upper case.
7
Basic select statement
• Query 1: Find the ids and names of all students .
SELECT sid, sname FROM Student
• Query 2: Get all the details of the courses whose duration is more than
50 hours.
SELECT * FROM Course WHERE duration >50
• Query 3: Find sid of Bill Will.
SELECT sid FROM Student WHERE sname=‘Bill Will’
• Query 4: For number of 7-hour day it will take to finish the individual
courses.
SELECT cid,duration/7 FROM Course
8
Simple Join
•Query 5: Find sids and names of the students have enrolled for at
least one course.
SELECT Student.sid, sname FROM Student, Enrollment
WHERE Student.sid= Enrollment.sid
Join condition
• Using alias
SELECT s.sid, sname FROM Student s, Enrollment e
WHERE s.sid= e.sid
9
DISTINCT/UNIQUE
• DISTINCT is used to eliminate duplicate rows.
• Query 5 Refined: Find sids and names of the students have
enrolled for at least one course.
SELECT DISTINCT s.sid, sname FROM Student s,
Enrollment e WHERE s.sid= e.sid
Or
SELECT UNIQUE s.sid, sname FROM Student s,
Enrollment e WHERE s.sid= e.sid
10
Joining three tables
• Query 6: Find the names of the students and the names of the
course which they have enrolled for.
SELECT DISTINCT sname,cname FROM Student s,
Enrollment e, Course c WHERE s.sid= e.sid and
c.cid=e.cid
2 join conditions
11
Patterns, Ranges & Logical
Operators
12
Patterns: LIKE
•LIKE operator is used with wild card characters to search for patterns.
• Wild cards:
% : any number of characters
_ : one character
•Query 7: Find all the students who name end with the letter ‘y’.
SELECT sname FROM student WHERE sname LIKE '%y'
•Query 8: Find all the enrollments that happened in july.
select * from enrollment where edate like '%JUL%'
13
More LIKE Queries
• Query 9: Find the 2 lettered course name that ends with a ‘S’
SELECT * FROM course WHERE cname LIKE '_S'
• Query 10: Find all the students whose name as two characters
before ‘ll’.
SELECT sname FROM student WHERE sname LIKE '__ll%'
14
Ranges: BETWEEN
• Allows querying within ranges.
• Query 11: Find all enrollments which has taken place from July till
September.
SELECT * FROM enrollment WHERE edate BETWEEN '01-
JUL-2006' AND '30-SEP-2006'
• Query 12: Find all courses in the range of 30 to 35 hours.
SELECT * FROM course WHERE duration BETWEEN 30 AND
35
15
Logical Operators
• AND , OR , NOT
• Query 13: Find all courses not in the range of 30 to 35 hours.
SELECT * FROM course WHERE duration NOT BETWEEN 30
AND 35
• Query 14: Find names of all students who have enrolled for some
courses and whose age is 20.
SELECT sname FROM student s, enrollment e WHERE
s.sid=e.sid AND s.age=20
16
Set Manipulation Queries and
Nested Queries
17
Set Manipulation Constructs
• UNION
• INTERSECT eliminates the
• EXCEPT (OR MINUS) duplicate rows
• UNION ALL from the final set.
• IN
• Op ANY and Op ALL
Nested and Correlated
• EXISTS
Subqueries
18
UNION
• Query 15: Find the sids of all the students who have taken Java or
RDBMS courses.
(SELECT sid FROM ENROLLMENT e, COURSE c WHERE
CNAME LIKE 'Java%' AND e.cid=c.cid)
UNION
(SELECT sid FROM ENROLLMENT e, COURSE c WHERE
CNAME LIKE 'RDBMS%' AND e.cid=c.cid)
Use UNION ALL if duplicates are to be reatined
19
INTERSECT
• Query 16: Find the sids of all the students who have taken both
Java and RDBMS courses.
(SELECT sid FROM ENROLLMENT e, COURSE
c WHERE CNAME LIKE 'Java%' AND
e.cid=c.cid)
INTERSECT
(SELECT sid FROM ENROLLMENT e, COURSE
c WHERE CNAME LIKE 'RDBMS%' AND
e.cid=c.cid)
20
Would this work?
• Query 16 revisit: For the same question in the previous slide, would
the query below work?
SELECT sid FROM ENROLLMENT e, COURSE c WHERE
CNAME LIKE 'Java%' AND CNAME LIKE 'RDBMS%' AND
e.cid=c.cid
21
EXCEPT or MINUS
• Query 17: Find the sids of all the students who have taken Java
but not RDBMS courses.
(SELECT sid FROM ENROLLMENT e, COURSE c WHERE
CNAME LIKE 'Java%' AND e.cid=c.cid)
MINUS
(SELECT sid FROM ENROLLMENT e, COURSE c WHERE
CNAME LIKE 'RDBMS%' AND e.cid=c.cid)
Set Difference
22
Nested Query
• A nested query is a query which has another query embedded in it.
This embedded query is called subquery.
• The result of the top-level query depends on the result of the sub-
queries.
• For each row of the top-level query, subquery is evaluated.
23
Set-Comparison
• Query 18: Find all the course whose duration is more any other
course.
SELECT cname FROM course WHERE duration >=
ALL(SELECT duration FROM course )
• Query 19: Find all the students whose age is below the age of any of
the students whose name end with ‘y’.
SELECT * FROM STUDENT WHERE age < ANY (SELECT age
FROM STUDENT WHERE sname LIKE '%y')
24
IN
• The IN operator allows us to test whether a given value is in a given
set of elements.
• Query 20: Find the names of all students who have taken Java
course.
SELECT sname FROM Student s where sid IN (SELECT
sid FROM Enrollment e, Course c WHERE cname LIKE
'Java%' AND e.cid=c.cid)
25
More examples for IN
• Query 21: Get all the student’s name whose sids are 2 and 4.
SELECT sname FROM Student WHERE sid IN(2,4)
• Query 22: Get all the student’s name whose sids are not in 2 or 4.
SELECT sname FROM Student WHERE sid NOT IN(2,4)
26
Correlated subqueries
• In the nested subqueries the inner query’s result is passed to outer
query and the outer query executes.
• In cases of correlated subqueries, the inner query is executed once for
each row in the outer query.
• Query 23: Find the names of the students who have enrolled for a
course on 1 September 2006.
SELECT sname FROM Student s where s.sid IN (SELECT
e.sid FROM Enrollment e WHERE e.sid=s.sid AND
e.edate='1-SEP-2006')
27
EXISTS
• EXISTS is same like IN operator with the difference that the top-level
query executes based on the existence of a record of the subquery.
• Query 23 revisited: Find the names of the students who have
enrolled for a course on 1 September 2006.
SELECT sname FROM Student s where EXISTS
Same as (SELECT
the previous query *
FROM Enrollment e WHERE e.sid=s.sid AND e.edate='1-
SEP-2006')
28
DIVISION
• Division operation finds the subset of items in one set that are
related to all items in another set.
• There is no straight division operator as such in SQL.
• But the there is a way in which such operation are performed using
correlated subqueries.
• Lets us look at the example to understand the strategy.
29
Division by example
• Find the students who have taken all the courses.
SELECT sid FROM Student s WHERE not exists (
SELECT cid FROM course
MINUS
SELECT cid FROM enrollment e WHERE e.sid=s.sid)
A
B Analysis…..
30
SID SNAME CID CNAME SID CID
1 Bill Will 1 Java 1 1
2 Nora M 2 RDBMS 2 1
1 2
Round 1: sid =1 .
A: SELECT cid FROM course : (1,2)
B: SELECT cid FROM enrollment e WHERE
e.sid=s.sid : (1,2)
A-B=().
Outer query selects sid=1
31
Round 1: sid =2 .
A: SELECT cid FROM course : (1,2)
B: SELECT cid FROM enrollment e WHERE
e.sid=s.sid : (1)
A-B=(2). Since A-B is not empty,
outer query does not return anything.
32
Aggregate Operators
33
Aggregate Operators
• COUNT([DISTINCT] a)
• SUM([DISTINCT] a)
• AVG([DISTINCT] a)
• MAX(a)
• MIN(a)
• where ‘a’ is an attribute of the table.
• Note that aggregate values are not allowed in the WHERE
clause.
34
COUNT
• Query 24: Find the total number of courses.
SELECT count(*) FROM Course
• Query 25: Find the total number of unique students who have
enrolled for a course.
SELECT count(distinct sid) FROM Enrollment
35
MAX, MIN
• Query 26: What is the age of the youngest student?
SELECT MIN(age) FROM student
• Query 18 revisit: Find all the courses whose duration is more than
any other course duration.
SELECT cname FROM Course WHERE duration =(SELECT
MAX(duration) FROM Course)
36
SUM, AVG
• Query 27: How many hours at the minimum would it take for
anybody to finish if this person takes all the courses?
SELECT SUM(duration) FROM Course
• Query 28: What is the average age of a student?
SELECT AVG(age) FROM Student
37
Challenge Questions
• Query 29: What is/are name(s) of the youngest student(s)
enrolled for Java course?
select sname from Student where
age=(select min(age) from Student s,
Enrollment e, Course c where
s.sid=e.sid and e.cid=c.cid and
cname like 'Java%')
38
Challenge Questions
• Query 30: Find the names of the students whose age is next to
the oldest student(s).
select sname,age from student s
where 1=(select count(DISTINCT
age) from student s1 where
s.age<s1.age)
39
Order By
• ORDER BY causes the result of the query to be displayed in a specific
order (ascending or descending) based on the attribute.
• Display the student records in sorted order by their names.
SELECT * FROM student ORDER BY sname ASC
• Display the course records displaying from highest to lowest w.r.t.
course duration and then sorted by course name.
SELECT * FROM course ORDER BY duration DESC, cname
By default it is always ASC
40
GROUP BY and HAVING clauses
41
GROUP BY
• GROUP BY allows us to query based on groups of rows which
have same value for some attribute(s).
• Query 31: Find the number of students enrolled for each course.
SELECT cid, count(*) FROM enrollment GROUP BY cid
CID COUNT(*)
--------------------------------
1 3
2 2
42
Challenge Question
• Query 32: Display the course name and number of students
enrolled for each course.
select cname, count(*) from course
c,enrollment e where c.cid=e.cid
group by cname
43
HAVING
• HAVING clause is used to query the result of the row that is
generated by a GROUP BY clause.
• Query 33: Display the course name and number of students enrolled
for each course for courses which have more than 2 students.
SELECT cname, count(*) FROM course c,enrollment e
WHERE c.cid=e.cid GROUP BY cname HAVING count(*)>2
44
Challenge Question
• Find all the sids who have taken a course which sid=1 has
taken.
select sid from enrollment where cid
in(select cid from enrollment where
sid=1) group by sid having
count(*)=(select count(*) from
enrollment where sid=1)
45
NULL
• A column with an unknown value has the value NULL.
• A null value is not < or > or = to anything. Any arithmetic operation
on null value yields null. But with aggregate functions this not the
case.
• To compare with NULL value the comparison operator IS NULL or
IS NOT NULL is used.
• Query 34: Find all the students whose ages have not been given.
SELECT * FROM student WHERE age IS NULL
46
JOINS
47
Joins
• Joining is a technique using in which two or more tables are
combined to generate results.
• Cartesian or Cross Join
• Equality Join :
Natural Join
Inner Join
Join USING…
Outer Join
48
Cartesian or Cross Join
• Result of such joining yields the cross product of combining each
row of the table with the other.
• Query 35: Find all the records of all the students who have taken
courses.
SELECT * FROM Student, Course
49
Natural Join
• Join based on equality of some common fields in the related tables.
• Query 5 Again: Find sids and names of the students who have enrolled
for at least one course.
SELECT DISTINCT Student.sid, sname FROM
Student, Enrollment WHERE Student.sid=
Enrollment.sid
Or
SELECT DISTINCT sid, sname FROM Student
NATURAL JOIN Enrollment
50
USING and ON
• In cases where there are more than one common key, if we need
to join based on only one key the USING keyword can be used.
SELECT DISTINCT sid, sname FROM Student JOIN
Enrollment USING(sid)
Or
SELECT DISTINCT Student.sid, sname FROM Student
JOIN Enrollment ON Student.sid= Enrollment.sid
51
Outer Join- Left
• In the outer join, the records that do not match the criteria also
appear.
• Query 36: List out all the students. Also list out the courses, if any,
that each student has enrolled for.
SELECT * FROM student,enrollment WHERE
student.sid=enrollment.sid(+)
Or
select * from student LEFT OUTER JOIN enrollment
on student.sid=enrollment.sid
52
Right Outer Join
• Query 37 : List out all the courses. Also list out the students, if any,
who have enrolled for each of the courses.
SELECT c.cid, cname, sid FROM enrollment e , course
c where e.cid(+)=c.cid
Or
SELECT c.cid, cname, sid FROM enrollment e RIGHT
OUTER JOIN course c on c.cid=e.cid
53
Full Outer Join
• Query 38: List all the courses and all the students. Link the
students and courses whenever appropriate.
SELECT c.cid, cname, sid, sname FROM enrollment e
FULL OUTER JOIN course c on c.cid=e.cid FULL
OUTER JOIN Student s on e.sid=s.sid
54
Self Join
• A table that needs to get results based on the values of its two
columns uses self–join.
• Assume that we have a table called customer.
CustID Name ReferredBy
1 Neeta Shyam
2 Dolly Dilly 1
3 Meena 2
Kumari
55
Self Join example
• Query 39: Find the names of all customers who have referred other
customers.
SELECT c1.name FROM Customer c1 JOIN Customer c2 ON
c1.custid=c2. ReferredBy