SESSION 5:
NORMALISATION
By
Hans Seegobin
Objectives
• This session covers the following topics:
– What normalization is and what role it plays in the
database design process.
– Data Anomaly
– About the normal forms 1NF, 2NF, 3NF
– How normal forms can be transformed from lower normal
forms to higher normal forms
– That normalization and ER modeling are used concurrently
to produce a good database design
2
Database Tables and Normalization
• Normalization
– Process for evaluating and correcting table
structures to minimize data redundancies
• Reduces data anomalies
– Works through a series of stages called normal
forms:
• First normal form (1NF)
• Second normal form (2NF)
• Third normal form (3NF)
3
Problems With Unnormalised tables
• Deletion Anomaly
• Update Anomaly
• Insertion Anomaly
4
Un-normalise table
Modules
ModuleName Staff No Staff Name StudentNO StudentName ass ass
Grade type
Relational Database Systems 234 Davies T 34698 Smith S B3 cwk1
Relational Database Systems 234 Davies T 34698 Smith S B1 cwk2
Relational Database Systems 234 Davies T 37798 Jones S B2 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B1 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B3 cwk2
Relational Database Design 234 Davies T 34698 Smith S B2 cwk1
Relational Database Design 234 Davies T 34698 Smith S B3 cwk2
Deductive Database 345 Evans R 34668 Smith J A1 exam
5
Deletion Anomaly
• What if we wish to delete student 34668?
Modules
Other Information Deleted
ModuleName Staff No Staff Name StudentNO StudentName ass ass
Grade type
Relational Database Systems 234 Davies T 34698 Smith S B3 cwk1
Relational Database Systems 234 Davies T 34698 Smith S B1 cwk2
Relational Database Systems 234 Davies T 37798 Jones S B2 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B1 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B3 cwk2
Relational Database Design 234 Davies T 34698 Smith S B2 cwk1
Relational Database Design 234 Davies T 34698 Smith S B3 cwk2
Deductive Database 345 Evans R 34668 Smith J A1 exam
6
Update Anomaly
What if we change the lecturer of Relational database
system?
Need to change all these fields
Modules
ModuleName Staff No Staff Name StudentNO StudentName ass ass
Grade type
Relational Database Systems 234 Davies T 34698 Smith S B3 cwk1
Relational Database Systems 234 Davies T 34698 Smith S B1 cwk2
Relational Database Systems 234 Davies T 37798 Jones S B2 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B1 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B3 cwk2
Relational Database Design 234 Davies T 34698 Smith S B2 cwk1
Relational Database Design 234 Davies T 34698 Smith S B3 cwk2
Deductive Database 345 Evans R 34668 Smith J A1 exam
7
Insertion Anomaly
What if we admit a new student on to a module?
Modules
Cannot insert record without knowing these fields
ModuleName Staff No Staff Name StudentNO StudentName ass ass
Grad type
e
Relational Database Systems 234 Davies T 34698 Smith S B3 cwk1
Relational Database Systems 234 Davies T 34698 Smith S B1 cwk2
Relational Database Systems 234 Davies T 37798 Jones S B2 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B1 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B3 cwk2
Relational Database Design 234 Davies T 34698 Smith S B2 cwk1
Relational Database Design 234 Davies T 34698 Smith S B3 cwk2
Deductive Database 345 Evans R 34668 Smith J A1 exam
8
The Need for Normalization
• Example: Company that manages building
projects
– Charges its clients by billing hours spent on each
contract
– Hourly billing rate is dependent on employee’s
position
– Periodically, report is generated that contains
information displayed in Table 5.1
9
The Need for Normalization
10
The Need for Normalization
11
The Normalization Process
• Each table represents a single subject
• No data item will be unnecessarily stored in
more than one table
• All attributes in a table are dependent on the
primary key
12
The Normalization Process
(continued)
13
Conversion to First Normal Form
• Repeating group
– Derives its name from the fact that a group of
multiple entries of same type can exist for any
single key attribute occurrence
• Relational table must not contain repeating
groups
• Normalizing table structure will reduce data
redundancies
• Normalization is three-step procedure
14
Conversion to First Normal Form
(continued)
• Step 1: Eliminate the Repeating Groups
– Present data in tabular format, where each cell
has single value and there are no repeating groups
– Eliminate repeating groups, eliminate nulls by
making sure that each repeating group attribute
contains an appropriate data value
15
Conversion to First Normal Form (continued)
16
Conversion to First Normal Form (continued)
• Step 2: Identify the Primary Key
– Primary key must uniquely identify attribute value
– New key must be composed
17
Conversion to First Normal Form (continued)
• Step 3: Identify All Dependencies
– Dependencies can be depicted with help of a
diagram
– Dependency diagram:
• Depicts all dependencies found within given table
structure
• Helpful in getting bird’s-eye view of all relationships
among table’s attributes
• Makes it less likely that will overlook an important
dependency
18
Conversion to First Normal Form (continued)
19
Conversion to First Normal Form
(continued)
20
Conversion to First Normal Form
(continued)
• First normal form describes tabular format in which:
– All key attributes are defined
– There are no repeating groups in the table
– All attributes are dependent on primary key
• All relational tables satisfy 1NF requirements
• Some tables contain partial dependencies
– Dependencies based on only part of the primary key
– Sometimes used for performance reasons, but should be
used with caution
– Still subject to data redundancies
21
Conversion to Second Normal Form
• Relational database design can be improved
by converting the database into second
normal form (2NF)
• Two steps
22
Conversion to Second Normal Form
(continued)
• Step 1: Write Each Key Component
on a Separate Line
– Write each key component on separate line, then
write original (composite) key on last line
– Each component will become key in new table
23
Conversion to Second Normal Form
(continued)
• Step 2: Assign Corresponding Dependent
Attributes
– Determine those attributes that are dependent on
other attributes
– At this point, most anomalies have been
eliminated
24
Conversion to Second Normal Form (continued)
25
Conversion to Second Normal Form (continued)
• Table is in second normal form (2NF) when:
– It is in 1NF and
– It includes no partial dependencies:
• No attribute is dependent on only portion of primary
key
26
Conversion to Third Normal Form
• Data anomalies created are easily eliminated
by completing three steps
• Step 1: Identify Each New Determinant
– For every transitive dependency, write its
determinant as PK for new table
• Determinant
– Any attribute whose value determines other values within a
row
27
Conversion to Third Normal Form (continued)
• Step 2: Identify the Dependent Attributes
– Identify attributes dependent on each
determinant identified in Step 1 and identify
dependency
– Name table to reflect its contents and function
28
Conversion to Third Normal Form (continued)
• Step 3: Remove the Dependent Attributes
from Transitive Dependencies
– Eliminate all dependent attributes in transitive
relationship(s) from each of the tables that have
such a transitive relationship
– Draw new dependency diagram to show all tables
defined in Steps 1–3
– Check new tables as well as tables modified in
Step 3 to make sure that each table has
determinant and that no table contains
inappropriate dependencies
29
Conversion to Third Normal Form
(continued)
30
Conversion to Third Normal Form
(continued)
• A table is in third normal form (3NF) when
both of the following are true:
– It is in 2NF
– It contains no transitive dependencies
31
The Bracketing Notation
• To represent the relational schema in an implementation-
independent form we use a notation sometimes known as the
bracketing notation.
• List a suitable mnemonic name for the table first.
• Followed by a list of data items or column names delimited by
commas.
• List the primary key first and underline this item.
• If primary key is made up of two or more attributes, then
underline all the component data items.
• Example:
32
Improving the Design
• Table structures are cleaned up to eliminate
troublesome initial partial and transitive
dependencies
• Normalization cannot, by itself, be relied on to
make good designs
• It is valuable because its use helps eliminate
data redundancies
33
Improving the Design (continued)
• Issues to address in order to produce a good
normalized set of tables:
– Evaluate PK Assignments
– Evaluate Naming Conventions
– Refine Attribute Atomicity
– Identify New Attributes
– Identify New Relationships
– Refine Primary Keys as Required for Data Granularity
– Maintain Historical Accuracy
– Evaluate Using Derived Attributes
34
Question & Answer?
35
Exercise 5- 1
• Business Example:
– Identify the functional dependencies in the given table
– Identify the Partial dependencies
– Identify the Transitive dependency
– Normalise the table to 3NF
– Draw the ERD of the system
36
Unnormalised table
Film Film Cinema Cinema Town Population Man. Man. Film
No Name Code Name No Name Takings
25 Star Wars BX Rex Cardiff 300000 01 Jones 900
25 Star Wars KT Rialto Swansea 200000 03 Thomas 350
25 Star Wars DJ Odeon Newport 250000 01 Jones 800
50 Jaws BX Rex Cardiff 300000 01 Jones 1200
50 Jaws DJ Odeon Newport 250000 01 Jones 400
50 Jaws TL Rex Bridgend 150000 02 Davies 300
50 Jaws RP Grand Bristol 350000 04 Smith 1500
50 Jaws HF State Bristol 350000 04 Smith 1000
30 Star Trek BX Rex Cardiff 300000 01 Jones 850
30 Star Trek TL Rex Bridgend 150000 02 Davies 500
40 ET KT Rialto Swansea 200000 03 Thomas 1200
40 ET RP Grand Bristol 350000 04 Smith 2000
37
Exercise 5- 2
• Business Example:
– Identify the functional dependencies in the given table
– Identify the Partial dependencies
– Identify the Transitive dependency
– Normalise the table to 3NF
– Draw the ERD of the system
38
Un-normalise table
Modules
ModuleName Staff No Staff Name StudentNO StudentName ass ass
Grade type
Relational Database Systems 234 Davies T 34698 Smith S B3 cwk1
Relational Database Systems 234 Davies T 34698 Smith S B1 cwk2
Relational Database Systems 234 Davies T 37798 Jones S B2 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B1 cwk1
Relational Database Systems 234 Davies T 34888 Patel P B3 cwk2
Relational Database Design 234 Davies T 34698 Smith S B2 cwk1
Relational Database Design 234 Davies T 34698 Smith S B3 cwk2
Deductive Database 345 Evans R 34668 Smith J A1 exam
39