1 nrnm-UiTM Perak
Objectives
2
In this chapter, students will learn:
What normalization is and what role it plays in the
database design process
About the normal forms 1NF, 2NF, and 3NF
How normal forms can be transformed from lower
normal forms to higher normal forms
That normalization and ER modeling are used
concurrently to produce a good database design
That some situations require denormalization to
generate information efficiently
nrnm-UiTM Perak
Database Tables and Normalization
3
Normalization
Process for evaluating and correcting table structures to
minimize data redundancies
◼ Reduces data anomalies
Series of stages called normal forms:
◼ Firstnormal form (1NF)
◼ Second normal form (2NF)
◼ Third normal form (3NF)
nrnm-UiTM Perak
Database Tables and Normalization
4
(cont’d.)
Normalization (continued)
2NF is better than 1NF; 3NF is better than 2NF
For most business database design purposes, 3NF is as
high as needed in normalization
Highest level of normalization is not always most
desirable
Denormalization produces a lower normal form
Increased performance but greater data redundancy
nrnm-UiTM Perak
The Need for Normalization
5
Example: company that manages building projects
Charges its clients by billing hours spent on each
contract
Hourly billing rate is dependent on employee’s position
Periodically, report is generated that contains
information such as displayed in Table 6.1
nrnm-UiTM Perak
6
nrnm-UiTM Perak
7
nrnm-UiTM Perak
The Need for Normalization (cont’d.)
8
Structure of data set in Figure 6.1 does not handle
data very well
Report may yield different results depending on
what data anomaly has occurred
Relational database environment is suited to help
designer avoid data integrity problems
nrnm-UiTM Perak
Normalization Process
Objective is to ensure that each table conforms to
the concept of well-formed relations
Each table represents a single subject
No data item will be unnecessarily stored in more than
one table
All nonprime attributes in a table are dependent on the
primary key
Each table is void of insertion, update, and deletion
anomalies
Normalization Process
Ensures that all tables are in at least 3NF
Higher forms are not likely to be encountered in
business environment
Normalization works one relation at a time
Starts by:
Identifying the dependencies of a relation (table)
Progressively breaking the relation into new set of
relations
11 nrnm-UiTM Perak
12 nrnm-UiTM Perak
Types of Functional Dependencies
Partial dependency: Functional dependence in
which dependent on only a portion of the
primary key
Easy to identify
Transitive dependency: A nonkey/nonprime
attribute functionally depends on another
nonkey attribute
Unnormalized form
14
Contain Repeating group
A group of multiple (related) entries can exist for
any single key attribute occurrence
Existence proves the presence of data redundancies
In table structure- use symbol { }
PROJECT(Proj_num,proj_name,{Emp_num,emp_na
me,job_class,Chg_Hour,Hours})
Relational table must not contain repeating groups
nrnm-UiTM Perak
Conversion to First Normal Form
15
Step 1: Eliminate the Repeating Groups
Eliminate
nulls: each repeating group attribute contains
an appropriate data value
Step 2: Identify the Primary Key
Mustuniquely identify attribute value
New key must be composed
Step 3: Identify All Dependencies
Dependencies are depicted with a diagram
nrnm-UiTM Perak
16
PROJECT(Proj_num,proj_name, Emp_num,emp_name,job_class,Chg_Hour,Hours)
nrnm-UiTM Perak
Conversion to First Normal Form
17
(cont’d.)
Dependency diagram:
Depicts all dependencies found within given table
structure
Helpful in getting bird’s-eye view of all relationships
among table’s attributes
Makes it less likely that you will overlook an important
dependency
nrnm-UiTM Perak
18
nrnm-UiTM Perak
Conversion to First Normal Form
19
(cont’d.)
First normal form describes tabular format:
All key attributes are defined
No repeating groups in the table
All attributes are dependent on primary key
All relational tables satisfy 1NF requirements
Some tables contain partial dependencies
Dependencies are based on part of the primary key
Should be used with caution
nrnm-UiTM Perak
Conversion to Second Normal Form
20
Step 1: Make New Tables to Eliminate Partial
Dependencies
Write each key component on separate line, then write
original (composite) key on last line
Each component will become key in new table
Step 2: Assign Corresponding Dependent Attributes
Determine attributes that are dependent on other
attributes
At this point, most anomalies have been eliminated
nrnm-UiTM Perak
21
nrnm-UiTM Perak
Conversion to Second Normal Form
22
(cont’d.)
Table is in second normal form (2NF) when:
It is in 1NF and
It includes no partial dependencies:
◼ No attribute is dependent on only portion of primary key
nrnm-UiTM Perak
Conversion to Third Normal Form
23
Step 1: Make New Tables to Eliminate Transitive
Dependencies
For every transitive dependency, write its determinant
as PK for new table
Determinant: any attribute whose value determines
other values within a row
nrnm-UiTM Perak
Conversion to Third Normal Form
24
(cont’d.)
Step 2: Reassign Corresponding Dependent
Attributes
Identify attributes dependent on each determinant
identified in Step 1
◼ Identify dependency
Name table to reflect its contents and function
nrnm-UiTM Perak
25
nrnm-UiTM Perak
Conversion to Third Normal Form
26
(cont’d.)
A table is in third normal form (3NF) when both of
the following are true:
It is in 2NF
It contains no transitive dependencies
nrnm-UiTM Perak
Improving the Design
27
Table structures should be cleaned up to eliminate
initial partial and transitive dependencies
Normalization cannot, by itself, be relied on to
make good designs
Valuable because it helps eliminate data
redundancies
nrnm-UiTM Perak
Improving the Design
Evaluate PK assignments and naming conventions
Refine attribute atomicity
Atomic attribute: Cannot be further subdivided
Atomicity: Characteristic of an atomic attribute
Identify new attributes and new relationships
Refine primary keys as required for data granularity
Granularity: Level of detail represented by the values
stored in a table’s row
Maintain historical accuracy and evaluate using derived
attributes
Figure 6.6 - The Completed Database
30
nrnm-UiTM Perak
Surrogate Key Considerations
Used by designers when the primary key is
considered to be unsuitable
System-defined attribute
Created an managed via the DBMS
Have a numeric value which is automatically
incremented for each new row
Surrogate Key Considerations
32
Data entries in Table 6.4 are inappropriate
because they duplicate existing records
No violation of entity or referential integrity
nrnm-UiTM Perak
Normalization and Database Design
33
Normalization should be part of the design process
Make sure that proposed entities meet required
normal form before table structures are created
Many real-world databases have been improperly
designed or burdened with anomalies
You may be asked to redesign and modify existing
databases
nrnm-UiTM Perak
Normalization and Database Design
34
(cont’d.)
ER diagram
Identify relevant entities, their attributes, and their
relationships
Identify additional entities and attributes
Normalization procedures
Focus on characteristics of specific entities
Micro view of entities within ER diagram
Difficult to separate normalization process from ER
modeling process
nrnm-UiTM Perak
35
nrnm-UiTM Perak
36
nrnm-UiTM Perak
37
nrnm-UiTM Perak
38
nrnm-UiTM Perak
39
nrnm-UiTM Perak
Denormalization
Design goals
Creation of normalized relations
Processing requirements and speed
Number of database tables expands when tables are
decomposed to conform to normalization requirements
Joining a larger number of tables:
Takes additional input/output (I/O) operations and
processing logic
Reduces system speed
Denormalization (cont’d.)
41
Conflicts are often resolved through compromises
that may include denormalization
Defects of unnormalized tables:
Data updates are less efficient because tables are
larger
Indexing is more cumbersome
No simple strategies for creating virtual tables known
as views
nrnm-UiTM Perak
Data-Modeling Checklist
42
Data modeling translates specific real-world
environment into data model
Represents real-world data, users, processes,
interactions
Data-modeling checklist helps ensure that data-
modeling tasks are successfully performed
Based on concepts and tools learned in Part II
nrnm-UiTM Perak
Table 6.7 - Data-Modeling Checklist
Table 6.7 - Data-Modeling Checklist
(cont’d.)
Table 6.7 - Data-Modeling Checklist
(cont’d.)
Table 6.7 - Data-Modeling Checklist
(cont’d.)
Summary
47
Normalization minimizes data redundancies
First three normal forms (1NF, 2NF, and 3NF) are
most commonly encountered
Table is in 1NF when:
All key attributes are defined
All remaining attributes are dependent on primary key
nrnm-UiTM Perak
Summary (cont’d.)
48
Table is in 2NF when it is in 1NF and contains no partial
dependencies
Table is in 3NF when it is in 2NF and contains no transitive
dependencies
Table that is not in 3NF may be split into new tables until
all of the tables meet 3NF requirements
Normalization is important part—but only part—of the
design process
nrnm-UiTM Perak
Summary (cont’d.)
49
Table in 3NF may contain multivalued dependencies
Numerous null values or redundant data
Convert 3NF table to 4NF by:
Splitting table to remove multivalued dependencies
Tables are sometimes denormalized to yield less
I/O, which increases processing speed
nrnm-UiTM Perak
50
1. Given the dependency diagram shown, answer:
a. Identify and discuss each of the indicated dependencies
b. Create a database whose tables are at least in 3NF,
showing the dependency diagrams for each table
nrnm-UiTM Perak
51
The dependency diagram in Figure indicates that authors are paid royalties for
each book that they write for a publisher. The amount of the royalty can vary by
author, by book, and by edition of the book.
a. Create a database whose tables are at least in 3NF, showing the dependency
diagram for each table.
nrnm-UiTM Perak
52
nrnm-UiTM Perak
53
nrnm-UiTM Perak
Exercise
54
(Mac 2013)
nrnm-UiTM Perak
Assignment
55
Pizza SunShine has come up with the following relation to
keep track of customers who order pizza:
PIZZA _ORDER (Cust_Number, Cust_Name, Cust_Phone,
{Pizza_Code, Pizza_Name, Pizza_Size, Pizza_ Price, Date,
Time, Quantity})
a. In what normal form the relational schema is in? Give
your reasons. (2 marks)
b. Identify appropriate primary key(s) (2 marks)
c. Normalize the table into 3NF. Show all your work for
each normalization stage. ( 6 marks)
d) Draw ERD nrnm-UiTM Perak (5 marks)
56
nrnm-UiTM Perak
a. 0NF, contain repeating grp and PK not define yet
b. Primary key : Subject_Code + Birth Certificate Num
57 c.
1NF:
RESULT(Teacher_Code,Teacher_Name,Subject_Code,Subject_Name,
BirthCertificateNo,StudentName,Class_Code, Class,Mark)
2NF
SUBJECT (Subject_Code,Subject_Name,Teacher_Code,Teacher_Name,
STUDENT (BirthCertificateNo,StudentName,Class_Code, Class)
RESULT (Subject_Code, BirthCertificateNo ,Mark)
3NF
SUBJECT (Subject_Code,Subject_Name,Teacher_Code*)
STUDENT (BirthCertificateNo,StudentName,Class_Code*)
RESULT (Subject_Code*, BirthCertificateNo*,Mark)
CLASS (Class_Code,Class)
TEACHER (Teacher_Code,Teacher_Name)
nrnm-UiTM Perak
58
SUBJECT produce RESULT STUDENT
PK Subject_Code get
PK,FK2 Subject_Code PK BirthCertificateNo
PK,FK1 BirthCertificateNo
Subject_Name StudentName
FK1 Teacher_Code Mark FK1 Class_Code
Teach
take
TEACHER
CLASS
PK Teacher_Code
PK Class_Code
Teacher_Name
Class
nrnm-UiTM Perak