SQL and Relational Algebra
What is SQL
• SQL = Structured Query Language (pronounced “sequel”).
• Language for defining as well as querying data in an RDBMS.
• Primary mechanism for querying and modifying the data in an
RDBMS.
• SQL is declarative:
– Say what you want to accomplish, without specifying how.
– One of the main reasons for the commercial success of RDMBSs.
• SQL has many standards and implementations:
– ANSI SQL
– SQL-92/SQL2 (null operations, outerjoins)
– SQL-99/SQL3 (recursion, triggers, objects)
– Vendor-specific variations.
Relational Algebra
• Relational algebra is a notation for specifying queries about
the contents of relations.
• Notation of relational algebra eases the task of reasoning
about queries.
• Operations in relational algebra have counterparts in SQL.
What is an Algebra?
• An algebra is a set of operators and operands.
– Arithmetic: operands are variables and constants, operators are
+,−,×,÷, /, etc.
– Set algebra: operands are sets and operators are ∩, U, -
• An algebra allows us to construct expressions by combining
operands and expression using operators and has rules for
reasoning about expressions.
• a² + 2 × a × b + 2b, (a + b)².
• R − (R − S), R ∩ S.
Basics of Relational Algebra
• Operands are relations, thought of as sets of tuples.
• Think of operands as variables, whose tuples are unknown.
• Results of operations are also sets of tuples.
• Think of expressions in relational algebra as queries, which
construct new relations from given relations.
• Four types of operators:
– Select/Show parts of a single relation: projection and selection.
– Usual set operations (union, intersection, difference).
– Combine the tuples of two relations, such as cartesian product
and joins.
– Renaming.
Projection
• The projection operator produces from a relation R a new
relation containing only some of R’s columns.
• “Delete” (i.e. not show) attributes not in projection list.
• Duplicates eliminated
• To obtain a relation containing only the columns A1,A2, . . . An
of R
RA: π A1,A2, . . . An (R)
SQL: SELECT A1,A2, . . . An FROM R;
Projection Example
sid sname rating age
S2
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
sname,rating( S 2 ) age(S2)
sname rating
age
yuppy 9 35.0
lubber 8 55.5
guppy 5
rusty 10
Selection
• The selection operator applied to a relation R produces a new
relation with a subset of R’s tuples.
• The tuples in the resulting relation satisfy some condition C
that involves the attributes of R.
– with duplicate removal
RA: σc (R)
SQL: SELECT *FROM R WHERE C;
• The WHERE clause of a SQL command corresponds to σ( ).
Selection: Syntax of Conditional
• Syntax of conditional (C): similar to conditionals in
programming languages.
• Values compared are constants and attributes of the relations
mentioned in the FROM clause.
• We may apply usual arithmetic operators to numeric values
before comparing them.
RA Compare values using standard arithmetic operators.
SQL Compare values using =, <>, <, >, <=, >=.
Selection Example
sid sname rating age
S2
28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
rating 8(S2)
sid sname rating age
28 yuppy 9 35.0
58 rusty 10 35.0
sname,rating( rating 8(S2))
sname rating
yuppy 9
rusty 10
Combining Operators
Union
• The union of two relations R and S is the set of tuples that are
in R or in S or in both.
– R and S must have identical sets of attributes and the types of
the attributes must be the same.
– The attributes of R and S must occur in the same order.
• What is the schema of the result ?
RA: R U S
SQL: (SELECT * FROM R)
UNION
(SELECT * FROM S);
Union
sid sname rating age
22 dustin 7 45.0
S1 31 lubber 8 55.5
58 rusty 10 35.0
sid sname rating age
S2 28 yuppy 9 35.0
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
S1S 2
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.5
58 rusty 10 35.0
44 guppy 5 35.0
28 yuppy 9 35.0
Intersection
• The intersection of two relations R and S is the set of tuples
that are in both R and S.
• Same conditions hold on R and S as for the union operator.
– R and S must have identical sets of attributes and the types of the
attributes must be the same.
– The attributes of R and S must occur in the same order.
RA: R ∩ S
SQL: (SELECT * FROM R)
INTERSECT
(SELECT * FROM S);
Intersection
S1 S2
sid sname rating age sid sname rating age
28 yuppy 9 35.0
22 dustin 7 45.0
31 lubber 8 55.5
31 lubber 8 55.5
44 guppy 5 35.0
58 rusty 10 35.0
58 rusty 10 35.0
S1 S2
sid sname rating age
31 lubber 8 55.5
58 rusty 10 35.0
Difference
• The difference of two relations R and S is the set of tuples that
are in R but not in S.
• Same conditions hold on R and S as for the union operator.
– R and S must have identical sets of attributes and the types of the
attributes must be the same.
– The attributes of R and S must occur in the same order.
RA: R - S
SQL: (SELECT * FROM R)
EXCEPT
(SELECT * FROM S);
• R – (R – S) = R ∩ S
Difference
S1 S2
sid sname rating age sid sname rating age
22 dustin 7 45.0 28 yuppy 9 35.0
31 lubber 8 55.5 31 lubber 8 55.5
58 rusty 10 35.0 44 guppy 5 35.0
58 rusty 10 35.0
S1 S2
sid sname rating age
22 dustin 7 45.0
Cartesian Product
• The Cartesian product (or cross-product or product) of two
relations R and S is a the set of pairs that can be formed by
pairing each tuple of R with each tuple of S.
– The result is a relation whose schema is the schema for R
followed by the schema for S.
RA: R X S
SQL: SELECT * FROM R , S ;
Cartesian Product
S1 R1
sid sname rating age sid bid day
22 dustin 7 45.0
22 101 10/10/96
31 lubber 8 55.5
58 103 11/12/96
58 rusty 10 35.0
(sid) sname rating
S1 x R1
age (sid) bid day ?
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96
We rename attributes to avoid ambiguity or we prefix
attribute with the name of the relation it belongs to.
Theta-Join
• The theta-join of two relations R and S is the set of tuples in
the Cartesian product of R and S that satisfy some condition C.
RA: R ∞ S
C
SQL: SELECT *
FROM R , S
WHERE C;
• R∞
C S =
σ
C
(R x S)
Theta-Join
S1 R1
sid sname rating age sid bid day
22 dustin 7 45.0 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0
S1 R1
S1.sid R1.sid
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 58 103 11/12/96
R c S c ( R S)
Natural Join
• The natural join of two relations R and S is a set of pairs of
tuples, one from R and one from S, that agree on whatever
attributes are common to the schemas of R and S.
• The schema for the result contains the union of the attributes
of R and S.
• Assume the schemas R(A,B, C) and S(B, C,D)
RA: R ∞ S
SQL: SELECT *
FROM R , S
WHERE R.B = S.B AND R.C = S.C;
Operators Covered So far
Renaming
• If two relations have the same attribute, disambiguate the
attributes by prefixing the attribute with the name of the
relation it belongs to.
• How do we answer the query “Name pairs of students who
live at the same address”? Students(Name, Address)
– We need to take the cross-product of Students with itself?
– How do we refer to the two “copies” of Students?
– Use the rename operator.
RA: S (A1,A2, . . . An)(R) : give R the name S; R has n attributes, which are
called A1,A2, . . . ,An in S
SQL: Use the AS keyword in the FROM clause: Students AS Students1
renames Students to Students1.
SQL: Use the AS keyword in the SELECT clause to rename attributes.
Renaming
• Are these correct ?
• No !!! the result includes tuples where a student is paired
with himself/herself
• Solution: Add the condition S1.name <> S2.name.
Practicing Relational Algebra
Q1: Find names of sailors who have
reserved boat #103
Reserves(sid, bid, day)
Sailors(sid, sname, rating, age)
• Solution 1:
πsname(σbid = 103 (Reserves ∞ Sailors))
• Solution 2 (more efficient)
πsname((σbid = 103 Reserves) ∞ Sailors)
• Solution 3 (using rename operator)
P(Temp1 (σbid = 103 Reserves))
P(Temp2 (Temp1 ∞ Sailors))
πsname(Temp2)
Q2: Find names of sailors who have
reserved a red boat
Reserves(sid, bid, day) Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
• Solution 1:
πsname((σcolor = ‘red’ Boats) ∞ Reserves ∞ Sailors )
• Solution 2 (more efficient)
πsname(πsid ((πbidσcolor = ‘red’ Boats)∞ Reserves )∞ Sailors )
Q3: Find the colors of boats
reserved by Lubber
Reserves(sid, bid, day) Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
• Solution:
πcolor((σsname = ‘Lubber’ Sailor)∞ Reserves ∞ Boats )
Q4: Find the names of sailors who
have reserved at least one boat
Reserves(sid, bid, day) Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
• Solution:
πsname(Sailor∞ Reserves)
Q5: Find the names of sailors who
have reserved a red or a green boat
Reserves(sid, bid, day) Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
• Solution:
πsname(σcolor=‘red’ or color = ‘green’ Boats ∞ Reserves ∞ Sailors)
Q6: Find the names of sailors who
have reserved a red and a green boat
Reserves(sid, bid, day) Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
• Solution:
πsname(σcolor=‘red’ and color = ‘green’ Boats ∞ Reserves ∞ Sailors)
A ship cannot have TWO colors at the same time
πsname(σcolor=‘red’ Boats ∞ Reserves ∞ Sailors)
∩
πsname(σcolor = ‘green’ Boats ∞ Reserves ∞ Sailors)
Q7: Find the sids of sailors with age over
20 who have not reserved a red boat
Reserves(sid, bid, day) Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Strategy ? ? ?
Find all sailors (sids) with age over 20
Find all sailors (sids) who have reserved a red boat
Take their set difference
• Solution:
πsid (σage>20 Sailors) – πsid ((σcolor=‘red’ Boats) ∞ Reserves)