Database Management Systems
INSY 5335
Zhuojun Gu
Assistant Professor
Information Systems and Operations Management
Today’s Agenda
• The logical view of data as organized in the relational model
– How the relational database’s basic data components fit into a logical construct, known
as a table.
• Keys
– How tables within a database can be related to one another using keys.
• Relational database operators
– These operators determines how we manipulate the database records using
SQL commands.
A Logical View of Data
• Entities, Attributes, and Entity Sets
– An Entity is a person, place, event, or thing for which we intend to
collect data.
• Example:
– An Attribute describes a characteristic of an entity.
• Example:
– A group of occurrences (instances) of the same type of entity is known
as an Entity Set.
• Example:
3
A Logical View of Data
• Table
– A Table contains a group of related entity occurrences
– i.e. an entity set.
– It is also called a relation (Term borrowed from Set theory).
• Characteristics of a Relational Table
– A table is composed of rows and columns.
– Each row (tuple) represents a single entity within the entity set.
– Each column represents an attribute and is identified by a distinct name.
4
A Logical View of Data
• Characteristics of a Relational Table
– Each row/column intersection represents a single data value (atomic).
– The number of tuples in a table is called its cardinality.
– The number of columns is known as its degree.
– All values in a column must conform to the same data format (type) and
must be within a specified range, known as the attribute domain.
– Changing the order of the rows and/or columns does not change the
table.
5
Example – Relational Table
6
Keys
• Functional Dependence
– Attribute B is functionally dependent on attribute A (attribute A
determines attribute B: A -> B) if all the rows in a table that agree
in value for attribute A must also agree in value for attribute B.
• Example:
• A key consists of one or more attributes that determines
other attributes within an entity.
– Example:
7
Keys
• A key that is composed of more than one attributes is known as a
composite key.
– Example:
• If attribute B is functionally dependent on a composite key A but not
any subset of A then B is fully functionally dependent on A.
– Example:
8
Keys
• Superkey: An attribute or a combination of attributes that uniquely
identifies each row in a table.
– Example:
• Candidate Key: A minimal superkey, i.e., it does not contain a
subset of attributes that is itself a superkey.
– Example:
9
Keys
• Primary Key
– A candidate key selected to uniquely identify an entity.
– Cannot have null values (A null value is no value, it is NOT equal to a
zero or a blank space).
– Enforces Entity Integrity (Guarantees that each entity is uniquely
identified by a non-null primary key value)
– A primary key is a superkey as well as a candidate key.
• Example:
10
Keys
• Foreign Key
– An attribute (or a combination of attributes) in one table whose values
must either match the primary key values in a designated table or be
null.
– Used to logically link one table with another (compare with the physical
pointers in Hierarchical and Network models).
– Enforces Referential Integrity (Guarantees valid references to another
table, i.e., cannot delete a tuple from a table that is referenced by in
another table through a foreign key).
11
Example – Integrity Constraints
12
Example – Integrity Constraints
13
Secondary Key
• Secondary Key
– Used for data retrieval purpose.
– May consist of a single attribute or a combination of attributes.
– The DBMS maintains indexes on secondary keys for faster
search and retrieval of data.
– May have duplicate values.
• Example:
14
Relational Database Operators
• These operators are based on relational algebra theory.
• They define functions to manipulate data in one or more tables (relations).
• Application of a relational operator to one or more tables results in another
table.
• The eight relational operators are: UNION, INTERSECT, DIFFERENCE,
PRODUCT, SELECT, PROJECT, JOIN, and DIVIDE.
• To be considered minimally relational, a DBMS must support SELECT,
PROJECT and JOIN
15
Relational Database Operators
• Union Compatibility
– Two tables are said to be union-compatible when they have the
same degree (number of attributes), say n, and the jth attributes
(j in the range of 1 to n) of the two tables are drawn from the
same domain (they need not have the same name).
• Example:
16
Union
• The tables must be union compatible
• A Union B results in C that contains all tuples from both A and B
with no duplicates.
– Example:
A (All Insy) B (All Fina)
Name Major Gpa Name Major Gpa
John Insy 3.4 Jack Fina
4.0
Joe Insy 3.7 Jeb
Fina 2.5
C (A Union B)
Name Major Gpa
John Insy 3.4
Joe Insy 3.7
Jack Fina 4.0 17
Jeb Fina 2.5
Intersect
• The tables must be union compatible
• A Intersect B results in C that contains tuples that are common to
both A and B.
– Example:
A (All Insy) B (High
Achievers)
Name Major Gpa Name Major Gpa
John Insy 3.4 Jack
Fina 4.0
Joe Insy 3.7 Jill
Insy 4.0
Jill Insy 4.0 Jeb
Mktg 3.98
Bob Insy 2.7
C (A INTERSECT B)
Name Major Gpa 18
Jill Insy 4.0
Difference
• The tables must be union compatible
• A MINUS B results in C that contains the tuples that appear in A but
not in B.
– Example:
A (All Insy) B (High
Achievers)
Name Major Gpa Name Major Gpa
John Insy 3.4 Jack
Fina 4.0
Joe Insy 3.7 Jill
Insy 4.0
Jill Insy 4.0 Jeb
Mktg 3.98
Bob Insy 2.7
C (A Minus B)
Name Major Gpa
19
John Insy 3.4
Product
● PRODUCT produces a list of all possible pairs of rows from two
tables.
● If table A has 5 rows and B has 10, A product B will yield a table
with 50 rows.
‒ Example:
A (Item) B (Supplier)
Inumber IName SName
SCity
101 Sweat Shirt WalMart
Dallas
205 Trousers Kmart
Phoenix
C (A Product B)
Inumber IName SName
SCity
101 Sweat Shirt WalMart
20
Dallas
Select
● SELECT yields all attributes of selected tuples that
satisfy a specified condition.
● It produces a horizontal subset of a table.
21
Project
● PROJECT produces a list of all values for selected
attributes.
● It yields a vertical subset of a table.
Join
● JOIN allows us to combine information from two or more tables.
● The tables participating in the join operation must have attributes
defined over a common domain.
● EquiJoin: Compares specified columns of two tables based on
equality condition. The result is a wider table where each row is
formed by concatenating two rows, one from each table, such that
the two rows have the same values in these two columns.
‒ Performed by a Product followed by a Select.
23
JOIN - Example
24
JOIN - Example
25
JOIN - Example
26
JOIN - Example
27
Join
● Natural join: EquiJoin with the duplicate column removed.
Performed by a Project on the result of equijoin. (aka Inner Join)
– When the term Join is mentioned without any prefix, it is implied to be
Natural Join.
● Outer Join: Unmatched rows from the participating tables are
retained in the result table with unmatched attributes left blank or
null.
– Left Outer join keeps all tuples from the left relation
– Right Outer Join keeps all tuples from the right relation
● Theta Join: EquiJoin with the equality operator replaced by any
other comparison operator, such as greater than, less than, etc.
28
Left Outer Join - Example
29
Right Outer Join - Example
30
Divide
• Consider dividing a relation A with two attributes X and Y by a
relation B with a single attribute Y.
• Note that attribute Y is common to both A and B.
• A can be thought of as a set of pairs of values <x,y> and B
as a set of single values <y>.
• The result of dividing A by B is C, a set of values of x such
that the pair <x,y> appears in A for all values of y appearing
in B.
• In general relation A can be of degree m+n, and relation B
can be of degree n.
31
Divide
• Examples of Divide
A (Supplier-Part) B1 (Part) C1(A Divided by B1)
Sup# Part# Part# Sup#
S1 P1 P1 S1
S1 P2 S2
S1 P3
S1 P4 B2 (Part) C2 (A Divided by
B2)
S1 P5 Part# Sup#
S1 P6 P2 ?
S2 P1 P4
S2 P2
S3 P2 B3 (Part) C3 (A Divided
by B3)
S4 P2 Part# Sup#
S4 P4 P1 ?
S4 P5 P2
32
P3
The Data Dictionary
• Data dictionary contains metadata that describes the data stored in
the database.
• It stores:
– the names of the data items in the database
– the types and sizes of the data items
– the constraints on each data item
– the names of authorized users, the data items that each user can
access, and the types of access allowed.
33
Example - Data Dictionary
34
Data Redundancy Revisited
• The relational database model does not completely eliminate data
redundancy but uses controlled data redundancy.
• Foreign keys create redundant data, but serve the useful purpose of
maintaining referential integrity.
• They are also used in Join operations.
• Sometimes user requirements demand storing apparently redundant
data, such as storing the sale price in Invoice Line Item. A deeper
analysis reveals no redundancy (See the Invoice example).
35
Data Redundancy Revisited
Data Redundancy Revisited
• In this example Product Price is copied from the PRODUCT table to the LINE table.
Does this example actually carry redundant data?