Chapter 5
Normalization of
Database Tables
BY PN.SRI YUSMAWATI MOHD
YUNUS
FSKM UiTMCNS
Learning Objectives
l e a r n
u wi l l
Yo
What normalization is and what role it plays in the database design
process
About the normal forms 1NF, 2NF, 3NF, BCNF,
and 4NF
How normal forms can be transformed from lower normal
forms to higher normal forms
That normalization and ER modeling are used concurrently to
produce a good database design
That some situations require denormalization to generate
information efficiently
Normalization
Evaluating and correcting table structures to minimize data redundancies
Reduces data anomalies
Assigns attributes to tables based on determination
Normal forms
First normal form (1NF) Second normal form (2NF) Third normal form (3NF)
Structural point of view of normal forms
Higher normal forms are better than lower normal forms
Properly designed 3NF structures meet the requirement of fourth normal
form (4NF)
Denormalization: Produces a lower normal form
Results in increased performance and greater data redundancy
?
Used while designing a new database structure
• Analyzes the relationship among the attributes within each
entity
• Determines if the structure can be improved
Improves the existing data structure and creates an
appropriate database design
Objective:
To ensure that each table conforms to the concept of
well-formed relations
Normalization
Process
Each table represents a single subject
No data item will be unnecessarily stored in more than one table
All nonprime attributes in a table are dependent on the primary key
Each table is void of insertion, update, and deletion anomalies
Ensures that all tables are in at least 3NF
Higher forms are not likely to be encountered in business environment
Works one relation at a time
Starts by:
Identifying the dependencies of a relation Progressively breaking the relation into new set
(table) of relations
Normal Forms
Functional
Dependence Concepts
Concept Definition
Functional dependence The attribute B is fully functionally dependent on the
attribute A if each value of A determines one and only one
value of B.
Functional dependence Attribute A determines attribute B if all of the rows in the
(Generalized definition) table that agree in value for attribute A also agree in value
for attribute B.
Fully functional dependence If attribute B is functionally dependent on a composite
(composite key) key A but not on any
Subset of that composite key, the attribute B is fully
functionally dependent on A.
Types of Functional
Dependencies
Partial dependency: Functional
dependence in which the determinant
is only part of the primary key
• Assumption - One candidate key
• Straight forward
• Easy to identify
Transitive dependency: An attribute
functionally depends on another
nonkey attribute
Conversion to First
Normal Form
Repeating group: Group of multiple entries of same type can exist for any single key attribute occurrence
Enable reducing data redundancies
Steps
Eliminate the repeating groups Identify the primary key Identify all dependencies
Helps to get anDependency
overview ofdiagram: Depicts among
all relationships all dependencies
Makesfound within
it less likelygiven table
that an structuredependency will
important
table’s attributes be overlooked
1NF describes
There tabular format
are no repeating in in
groups which:
the All attributes are dependent on the
All key attributes are defined
table primary key
All relational tables satisfy 1NF requirements
Some tables contain partial dependencies
Subject to data redundancies and various anomalies
Conversion to Second
Normal Form
Steps Table is in 2NF when it:
Make new tables to eliminate
Is in 1NF
partial dependencies
Reassign corresponding Includes no partial
dependent attributes dependencies
Conversion to Third
Normal Form
Steps Table is in 3NF when it:
Make new tables to eliminate transitive
Is in 2NF
dependencies
Reassign corresponding dependent
Contains no transitive dependencies
attributes
Requirements for Good
Normalized Set of Tables
Evaluate PK assignments and naming conventions
Refine attribute atomicity
• Atomic attribute: Cannot be further subdivided
• Atomicity: Characteristic of an atomic attribute
Identify new attributes and new relationships
Refine primary keys as required for data granularity
• Granularity: Level of detail represented by the values stored in a
table’s row
Maintain historical accuracy and evaluate using
derived attributes
The Completed Database
1
2
3
4
Surrogate Keys
Used by designers when the primary key is
considered to be unsuitable
System-defined attribute
Created an managed via the DBMS
Have a numeric value which is automatically
incremented for each new row
The Boyce-Codd
Normal Form (BCNF)
Every determinant in the table should be a candidate key
• Candidate key - Same characteristics as primary key but not chosen to be the primary
key
Equivalent to 3NF when the table contains only one
candidate key
Violated only when the table contains more than one
candidate key
Considered to be a special case of 3NF
A Table That is in 3NF and not in BCNF
Sample Data for a BCNF Conversion
Decomposition
to BCNF
Forth Normal Form
(4NF)
Table is in 4NF when it:
• Is in 3NF
• Has no multivalued dependencies
Rules
• All attributes must be dependent on the primary key, but they must be
independent of each other
• No row may contain two or more multivalued facts about an entity 4NF
Tables with Multivalued Dependencies
A Set of Tables in 4NF
Normalization and
Database Design
Principles and
normalization
procedures to be
Proposed entities understood to
Normalization should must meet required redesign and modify
be part of the design the normal form databases
process before table • ERD is created through an
structures are created iterative process
• Normalization focuses on
the characteristics of
specific entities
Initial ERD
Modified ERD
Incorrect M:N
Final ERD
The Implemented Database
Denormalization
Design goals
Creation of normalized relations Processing requirements and speed
Number of database tables expands when tables are decomposed to conform to
normalization requirements
Joining a larger number of tables:
Takes additional input/output (I/O) operations
Reduces system speed
and processing logic
Defects in unnormalized tables
No simple strategies for
Data updates are less efficient
Indexing is more cumbersome creating virtual tables known as
because tables are larger
views
Data-Modeling Checklist