4/1/2015
Outline
• Definition
• Update, insertion, and deletion Anomalies
Normalised Databases • Functional Dependencies
• Normalization Process
– First normal form(1NF)
– 2NF
Some of the slides are adopted from – 3NF
Ramakrishnan and Gehrke 2003
– BCNF, etc.
Normalization Normalization
• Normalization is a technique for producing a set • Suitable set of relations will be:
of suitable relations that support the data – easier for the user to access and maintain the
requirements of an enterprise. data;
• Relations should have: – take up minimal storage space on the computer.
– minimal number of attributes necessary to support • Normalization tries to minimize data
the data requirements of the enterprise;
redundancies
– attributes with a close logical relationship in the same
relation; • Redundancy is at the root of several problems
– minimal redundancy with each attribute represented associated with relational schemas:
only once with the important exception of attributes – redundant storage, insert/ delete/ update
that form all or part of foreign keys. anomalies
3 4
Anomalies in updating, insertion, and
Normalization
deletion
• Integrity constraints, in particular functional
dependencies, can be used to identify schemas • Consider the relation:
with such problems and to suggest refinements. EMP_PROJ ( EmpNo, ProjNo, Ename, Pname, No_hours)
• Normalization - Removing undesirable functional dependency
• Update Anomaly: Changing the name of project
• Main refinement technique: decomposition number P1 from “Billing” to “Customer-Accounting”
(replacing ABCD with, say, AB and BCD, or ACD may cause this update to be made for all 100
and ABD). employees working on project P1.
– Decomposition should be used judiciously:
• Is there reason to decompose a relation?
• What problems (if any) does the decomposition cause?
5 6
1
4/1/2015
Anomalies in updating, insertion, and
Video tape
deletion
• Insert Anomaly: Cannot insert a project unless an VideoID Purchas Title Genre Length
employee is assigned to . {PK} ed
– Inversely - Cannot insert an employee unless he/ she 1 25/1/99 Sooty Kids 1
is assigned to a project.
• Delete Anomaly: When a project is deleted, it will 2 26/2/99 Star Drama 2
result in deleting all the employees who work on Wars
that project. Alternately, if an employee is the
3 1/4/02 CM205 Adult 12
sole employee on a project, deleting that
employee would result in deleting the 4 2/5/02 Sooty Kids 1
corresponding project.
7 8
Anomalies Normalization...
• Insertion • Two important properties of decomposition.
– If we add a new copy of “Sooty” we must make – Lossless-join property enables us to find any
sure the details are consistent instance of the original relation from
• Update corresponding instances in the smaller relations.
– Change length of Sooty must be performed – Dependency preservation property enables us to
consistently on each copy enforce a constraint on the original relation by
• Delete enforcing some constraint on each of the smaller
– Deleting the last copy throws out the information relations.
on “Sooty”
9 10
Functional Dependencies
• Functional dependency describes relationship Functional Dependencies ctd.
between attributes.
– For example, if A and B are attributes of relation R, • Example:
B is functionally dependent on A (denoted AB), The following functional dependencies
if each value of A in R is associated with exactly appear to hold.
one value of B in R. – staffNo sName
• The determinant of a functional dependency – sName staffNo - may not hold all the time
refers to the attribute or group of attributes
on the left-hand side of the arrow, B is called
the dependent.
• The functional dependency is said to be trivial
if Y is a subset of X. 11 12
2
4/1/2015
Functional Dependencies ctd. Functional Dependencies ctd.
• Main characteristics of functional
• Full functional dependency indicates that if A dependencies used in normalization:
and B are attributes of a relation, B is fully
– There is a one-to-one relationship between the
functionally dependent on A, if B is attribute(s) on the left-hand side (determinant)
functionally dependent on A, but not on any and those on the right-hand side of a functional
proper subset of A. dependency.
– Holds for all time.
– The determinant has the minimal number of
attributes necessary to maintain the dependency
with the attribute(s) on the right hand-side.
13 14
Functional Dependencies PK Using Functional Dependencies
• Identifying Functional Dependencies • Main purpose of identifying a set of functional
– Identifying all functional dependencies between a dependencies for a relation is to specify the set of
set of attributes is relatively simple if the meaning integrity constraints that must hold on a relation.
of each attribute and the relationships between • An important integrity constraint to consider first
the attributes are well understood. is the identification of candidate keys, one of
– This information should be provided by the which is selected to be the primary key for the
enterprise in the form of discussions with users relation.
and/or documentation such as the
• All attributes that are not part of a candidate key
users’requirements specification.
should be functionally dependent on the key.
15 16
The Process of Normalization
The Process of Normalization
17 University of Botswana. CSI262 Lecture Notes 2011-2012 18
3
4/1/2015
Common Normal Forms UNF to 1NF
Unnormalized Form (UNF) • The domain of an attribute must contain only
• A table that contains one or more repeating atomic (indivisible values)
groups. • Disallows a relation within a relation
First Normal Form (1NF) • Remove the repeating group by
• A relation in which the intersection of each row – Entering appropriate data into the empty columns of
and column contains one and only one value. rows containing the repeating data (‘flattening’ the
table).
• To convert UNF to 1NF
• OR
– Nominate an attribute or group of attributes to act as
the key for the unnormalized table. – Placing the repeating data along with a copy of the
original key attribute(s) into a separate relation.
19 20
Name Children
Noah {Shem, Ham, Japeth} The problem is functional
dependency
Isaac {Jacob, Esau}
Noah Shem
Normalised
Table Noah Ham Identify functional dependency
And redesign tables
Noah Japeth
Isaac Jacob
Isaac Esau
21 22
Functional dependency Functional dependency
VideoID Purchas Title Genre Length
• In the functional dependency the set X is {PK} ed
called the determinant and the set Y is called 1 25/1/99 Sooty Kids 1
the dependent.
2 26/2/99 Star Drama 2
• The functional dependency is said to be trivial Wars
if Y is a subset of X. 3 1/4/02 CM205 Adult 12
4 2/5/02 Sooty Kids 1
videoid purchased , title, genre, length
title genre, length
23 24
4
4/1/2015
1NF to 2NF 1NF to 2NF
• For relations where primary key contains multiple
attributes, no non key attribute should be • Identify the primary key for the 1NF relation.
functionally dependent on part of the primary key • Identify the functional dependencies in the
• Decompose and set up a new table for each partial relation.
key with its dependent attributes. Make sure to keep • If partial dependencies exist on the primary
a table with the original primary key and attributes
that are fully functionally dependent on it.
key remove them by placing them in a new
relation along with a copy of their
• Based on the concept of full functional dependency.
determinant.
• A relation that is in 1NF and every nonprimary key
attribute is fully functionally dependent on the
primary key.
25 26
StudentID Course Title Mark Instructor StudentID Course Title Mark Instructor
1 Cm205 Database 45 Kst 1 Cm205 Database 45 Kst
2 Cm204 model 46 Apb 2 Cm204 model 46 Apb
3 Cm205 Database 55 Kst 3 Cm205 Database 55 Kst
1 Cm204 Model 66 Apb 1 Cm204 Model 66 apb
primary key studentid , course
course Title, instructor StudentID Course Mark
Course Title Instructor 1 Cm205 45
Functional dependence on part of primary key. Cm204 Model Apb
2 Cm204 46
Not in 2NF cm205 databases kst
3 Cm205 55
1 Cm204 66
27 28
Third Normal Form (3NF) 2NF to 3NF
• Based on the concept of transitive
• Second NF • Decompose and set up dependency.
• Relation should not have a new relation that
includes the nonkey • Transitive Dependency is a condition where A,
a nonkey attribute
functionally determined attribute(s) that B and C are attributes of a relation such that if
by another nonkey functionally determine AB and BC, then C is transitively
attribute the other attributes(s) dependent on A through B. (Provided that A is
• No transitive • ***Dependency not functionally dependent on B or C).
dependency preservation through
• A relation that is in 1NF and 2NF and in which
the “transitive”
determinant no non-primary-key attribute is transitively
dependent on the primary key.
29 30
5
4/1/2015
VideoID Purchased Title Genre Length
2NF to 3NF {PK}
1 25/1/99 Sooty Kids 1
• Identify the primary key in the 2NF 2
3
26/2/99
1/4/02
Star Wars
CM205
Drama
Adult
2
12
relation. 4 2/5/02 Sooty Kids 1
• Identify functional dependencies in the
relation. F unctional dependency
• If transitive dependencies exist on the title genre, length
primary key remove them by placing
Transitive Dependency?
them in a new relation along with a
copy of their dominant.
31 32
VideoID Purchased Title Genre Length
{PK}
1 25/1/99 Sooty Kids 1
General Definitions of 2NF and 3NF
2 26/2/99 Star Wars Drama 2
3 1/4/02 CM205 Adult 12 • Second normal form (2NF) - A relation that is
4 2/5/02 Sooty Kids 1
in first normal form and every non-primary-
key attribute is fully functionally dependent
VideoID Purchased Title
on any candidate key.
Title Genre Length
1 25/1/99 Sooty • Third normal form (3NF) - A relation that is in
2 26/2/99 Star Wars Sooty Kids 1
3 1/4/02 CM205
first and second normal form and in which no
CM205 Adult 12
4 2/5/02 Sooty non-primary-key attribute is transitively
dependent on any candidate key.
33 34
BOYCE-CODD NORMAL FORM 3NF to BCNF
35 36
6
4/1/2015
3NF to BCNF – Example Another Example
37 38
Remarks Summary
• BCNF is stronger than 3NF • Relations with data redundancy suffer from
• Fourth NF removes multi valued dependency update anomalies
• Functional dependency identified as the cause
• Fifth NF concerns lossless join dependency of the problem
• We will not cover 4NF, 5NF • Normalisation (2NF, 3NF, BCNF) characterises
various kinds of functional dependency
• Decompose into smaller tables removing the
functional dependency
39 40