Databases - Relational Algebra
Jianxin Li
School of Computer Science & Software Engineering
University of Western Australia
Jianxin Li (UWA) Relational Algebra 1 / 28
Before we start
Recall that a relation (or table), consists of:
A schema which is an ordered list of attributes, with each attribute
composed of name and a domain (or type).
You can think of this as the header row for the table
A set of tuples (or rows), where each entry is a value of the appropriate
type or the special value NULL.
Jianxin Li (UWA) Relational Algebra 2 / 28
Example relations I
Consider a relation Student with three attributes
id of type CHAR(8)
name of type VARCHAR(64)
gender of type ENUM("M", "F", "X")
and a relation Grade also with three attributes
id of type CHAR(8)
unit of type CHAR(8)
grade of type INT
Jianxin Li (UWA) Relational Algebra 3 / 28
Example relations II
id name gender
12345678 Ebenezer Scrooge M
12345682 Jane Austen F
12345689 JMartin Chuzzlewit M
id unit grade
12345678 CITS1402 88
12345678 CITS2211 75
12345682 CITS1402 91
12345682 CITS2211 71
12345689 CITS1402 55
Jianxin Li (UWA) Relational Algebra 4 / 28
Relational Algebra
Relational algebra is the mathematical language describing the manipulation
of relations while SQL is an approximation to relational algebra.
There are two fundamental operators
Selection denoted by (sigma)
This operator selects a subset of the rows satisfying some condition
Projection denoted by (pi)
This operator projects onto a desired subset of the columns
Jianxin Li (UWA) Relational Algebra 5 / 28
Selection
Consider the expression
grade>80 Grade
This checks each row and keeps only the rows that satisfy the condition.
The result is another relation with the same schema, but fewer rows.
id unit grade
12345678 CITS1402 88
12345682 CITS1402 91
Jianxin Li (UWA) Relational Algebra 6 / 28
Projection
Consider the expression
id Student
This goes through each row, and only keeps the specified columns.
The result is another relation with fewer columns but in this case the
same number of rows.
id
12345678
12345682
12345689
Jianxin Li (UWA) Relational Algebra 7 / 28
MySQL - CREATE TABLE
First we create the (empty) tables:
CREATE TABLE Student (id CHAR(8),
name VARCHAR(64),
gender ENUM("M","F","X"));
CREATE TABLE Grade (id CHAR(8),
unit CHAR(8),
grade INT);
Jianxin Li (UWA) Relational Algebra 8 / 28
MySQL - INSERT INTO
Next we insert the initial data:
INSERT INTO Student VALUES(12345678, Ebenezer Scrooge, M);
INSERT INTO Student VALUES(12345682, Jane Austen, F);
INSERT INTO Student VALUES(12345689, Martin Chuzzlewit, M);
INSERT INTO Grade VALUES(12345678, CITS1402, 88);
INSERT INTO Grade VALUES(12345678, CITS2211, 75);
INSERT INTO Grade VALUES(12345682, CITS1402, 91);
INSERT INTO Grade VALUES(12345682, CITS2211, 71);
INSERT INTO Grade VALUES(12345689, CITS1402, 55);
Jianxin Li (UWA) Relational Algebra 9 / 28
MySQL - SELECT *
In relational algebra, an entire relation can be referred to just by its name:
Grade
In MySQL this is not a legal expression, and we must explicitly state that we
want all the columns from a table.
mysql> SELECT * from Grade;
+----------+----------+-------+
| id | unit | grade |
+----------+----------+-------+
| 12345678 | CITS1402 | 88 |
| 12345678 | CITS2211 | 75 |
| 12345682 | CITS1402 | 91 |
| 12345682 | CITS2211 | 71 |
| 12345689 | CITS1402 | 55 |
+----------+----------+-------+
5 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 10 / 28
Selection in MySQL
In MySQL a selection is accomplished by adding a WHERE clause containing
the conditions.
mysql> SELECT * FROM Grade WHERE grade > 80;
+----------+----------+-------+
| id | unit | grade |
+----------+----------+-------+
| 12345678 | CITS1402 | 88 |
| 12345682 | CITS1402 | 91 |
+----------+----------+-------+
2 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 11 / 28
Projection in MySQL
In MySQL a projection is accomplished by explicitly listing the columns you
want to keep.
mysql> SELECT id FROM Student;
+----------+
| id |
+----------+
| 12345678 |
| 12345682 |
| 12345689 |
+----------+
3 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 12 / 28
Select and Project in MySQL
In relational algebra we can combine queries
id (grade>80 Grade)
This first selects only the rows with grade > 80 and then projects onto the
id column only.
mysql> SELECT id FROM Grade WHERE grade > 80;
+----------+
| id |
+----------+
| 12345678 |
| 12345682 |
+----------+
2 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 13 / 28
Relations are sets . . .
While MySQL approximates relational algebra, it doesnt do it perfectly.
id Grade
should produce
id
12345678
12345682
12345689
because a relation is defined to be a set of tuples, so repeats are not allowed.
Jianxin Li (UWA) Relational Algebra 14 / 28
. . . but not in MySQL . . .
mysql> SELECT id FROM Grade;
+----------+
| id |
+----------+
| 12345678 |
| 12345678 |
| 12345682 |
| 12345682 |
| 12345689 |
+----------+
5 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 15 / 28
. . . unless you force it
mysql> SELECT DISTINCT id FROM Grade;
+----------+
| id |
+----------+
| 12345678 |
| 12345682 |
| 12345689 |
+----------+
3 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 16 / 28
More conditions
In relational algebra, the selection criteria can be arbitrarily complicated.
grade>80(grade>70unit=0 CITS22110 ) Grade
Jianxin Li (UWA) Relational Algebra 17 / 28
More conditions
In relational algebra, the selection criteria can be arbitrarily complicated.
grade>80(grade>70unit=0 CITS22110 ) Grade
mysql> SELECT * FROM Grade WHERE
(grade > 80) OR
(grade > 70 AND unit = CITS2211);
+----------+----------+-------+
| id | unit | grade |
+----------+----------+-------+
| 12345678 | CITS1402 | 88 |
| 12345678 | CITS2211 | 75 |
| 12345682 | CITS1402 | 91 |
| 12345682 | CITS2211 | 71 |
+----------+----------+-------+
4 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 17 / 28
More columns
In relational algebra, the projection can pick out any number of columns
id,name Student
mysql> SELECT id, name FROM Student;
+----------+-------------------+
| id | name |
+----------+-------------------+
| 12345678 | Ebenezer Scrooge |
| 12345682 | Jane Austen |
| 12345689 | Martin Chuzzlewit |
+----------+-------------------+
3 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 18 / 28
Products and Joins
The Cartesian product of relational algebra
Student Grade
creates a new relation with 6 attributes, namely
id, name, gender, id, unit, grade
and with 3 5 = 15 rows obtained by gluing together a tuple from Student
and a tuple from Grade in every possible way.
Jianxin Li (UWA) Relational Algebra 19 / 28
Cartesian product in MySQL
mysql> SELECT * FROM Student, Grade;
+----------+-------------------+--------+----------+----------+-------+
| id | name | gender | id | unit | grade |
+----------+-------------------+--------+----------+----------+-------+
| 12345678 | Ebenezer Scrooge | M | 12345678 | CITS1402 | 88 |
| 12345682 | Jane Austen | F | 12345678 | CITS1402 | 88 |
| 12345689 | Martin Chuzzlewit | M | 12345678 | CITS1402 | 88 |
| 12345678 | Ebenezer Scrooge | M | 12345678 | CITS2211 | 75 |
| 12345682 | Jane Austen | F | 12345678 | CITS2211 | 75 |
| 12345689 | Martin Chuzzlewit | M | 12345678 | CITS2211 | 75 |
| 12345678 | Ebenezer Scrooge | M | 12345682 | CITS1402 | 91 |
| 12345682 | Jane Austen | F | 12345682 | CITS1402 | 91 |
| 12345689 | Martin Chuzzlewit | M | 12345682 | CITS1402 | 91 |
| 12345678 | Ebenezer Scrooge | M | 12345682 | CITS2211 | 71 |
| 12345682 | Jane Austen | F | 12345682 | CITS2211 | 71 |
| 12345689 | Martin Chuzzlewit | M | 12345682 | CITS2211 | 71 |
| 12345678 | Ebenezer Scrooge | M | 12345689 | CITS1402 | 55 |
| 12345682 | Jane Austen | F | 12345689 | CITS1402 | 55 |
| 12345689 | Martin Chuzzlewit | M | 12345689 | CITS1402 | 55 |
+----------+-------------------+--------+----------+----------+-------+
15 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 20 / 28
Matching them up
What we really want is for each row to combine the Student information
and the Grade information for the same student.
In relational algebra
Student.id=Grade.id Student Grade
This forms the Cartesian product, and then selects only the rows where the
two occurrences of id match.
Jianxin Li (UWA) Relational Algebra 21 / 28
Matching in MySQL
mysql> SELECT * FROM Student, Grade WHERE Student.id = Grade.id;
+----------+-------------------+--------+----------+----------+-------+
| id | name | gender | id | unit | grade |
+----------+-------------------+--------+----------+----------+-------+
| 12345678 | Ebenezer Scrooge | M | 12345678 | CITS1402 | 88 |
| 12345678 | Ebenezer Scrooge | M | 12345678 | CITS2211 | 75 |
| 12345682 | Jane Austen | F | 12345682 | CITS1402 | 91 |
| 12345682 | Jane Austen | F | 12345682 | CITS2211 | 71 |
| 12345689 | Martin Chuzzlewit | M | 12345689 | CITS1402 | 55 |
+----------+-------------------+--------+----------+----------+-------+
5 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 22 / 28
Natural Join
In relational algebra, the natural join automatically matches all columns with
the same name, and then removes one of each duplicate pair.
In relational algebra
Student ./ Grade
will give a relation with five columns and the five rows corresponding to each
student/unit combination.
Jianxin Li (UWA) Relational Algebra 23 / 28
In MySQL
mysql> SELECT * FROM Student NATURAL JOIN Grade;
+----------+-------------------+--------+----------+-------+
| id | name | gender | unit | grade |
+----------+-------------------+--------+----------+-------+
| 12345678 | Ebenezer Scrooge | M | CITS1402 | 88 |
| 12345678 | Ebenezer Scrooge | M | CITS2211 | 75 |
| 12345682 | Jane Austen | F | CITS1402 | 91 |
| 12345682 | Jane Austen | F | CITS2211 | 71 |
| 12345689 | Martin Chuzzlewit | M | CITS1402 | 55 |
+----------+-------------------+--------+----------+-------+
5 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 24 / 28
Set Expressions
Relational algebra has a full complement of set expressions such as
R1 R2 R1 R2 R1 R2
that apply to relations with the same schema.
In the SQL standard, these are specified with the keywords UNION,
INTERSECT and EXCEPT.
But MySQL does not implement the full SQL standard, and in particular does
not have INTERSECT or EXCEPT.
Jianxin Li (UWA) Relational Algebra 25 / 28
MySQL UNION
mysql> SELECT id FROM Student UNION SELECT id FROM Grade;
+----------+
| id |
+----------+
| 12345678 |
| 12345682 |
| 12345689 |
+----------+
3 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 26 / 28
MySQL UNION ALL
mysql> SELECT id FROM Student UNION ALL SELECT id FROM Grade;
+----------+
| id |
+----------+
| 12345678 |
| 12345682 |
| 12345689 |
| 12345678 |
| 12345678 |
| 12345682 |
| 12345682 |
| 12345689 |
+----------+
8 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 27 / 28
MySQL UNION
mysql> SELECT id FROM Student UNION SELECT grade FROM Grade;
+----------+
| id |
+----------+
| 12345678 |
| 12345682 |
| 12345689 |
| 88 |
| 75 |
| 91 |
| 71 |
| 55 |
+----------+
8 rows in set (0.00 sec)
Jianxin Li (UWA) Relational Algebra 28 / 28