Topic: Algorithms and Normalization
1. Introduction to Normalization
Definition and Purpose
Normalization is a systematic approach in database design to organize data in a way that:
Eliminates redundancy (duplicate data).
Prevents anomalies during insertion, deletion, or updates.
Improves data integrity and maintains consistency.
Key Problems in Non-Normalized Data
1. Redundancy: Repeated data increases storage requirements.
2. Insertion Anomaly: Inability to add data without existing related data.
o Example: Adding a new department without assigning an employee to it.
3. Deletion Anomaly: Removing data inadvertently deletes necessary related data.
o Example: Removing the last employee in a department deletes the department
itself.
4. Update Anomaly: Changing data inconsistently across multiple records.
o Example: Updating an instructor's email in one course but not in another.
Example Scenario:
StudentID Name Course Instructor InstructorEmail
1 Alice Math Dr. Smith smith@school.edu
2 Bob Math Dr. Smith smith@school.edu
Issues:
Redundancy: Instructor details are repeated.
Update anomaly: Changing the email requires multiple updates.
2. Steps of Normalization (2 Hours)
First Normal Form (1NF)
Definition:
A table is in 1NF if:
1. Data is atomic (indivisible).
2. Each column contains values of a single type.
3. Each row is unique, often identified by a primary key.
Steps to Achieve 1NF:
1. Identify repeating groups or multi-valued attributes.
2. Break down the table into rows with atomic values.
Example:
StudentI Name Subjects
D
1 Alice Math, Physics
Solution (1NF):
StudentI Name Subject
D
1 Alice Math
1 Alice Physics
Second Normal Form (2NF)
Definition:
A table is in 2NF if:
1. It is in 1NF.
2. All non-primary key columns depend on the entire primary key.
(Eliminates partial dependency).
Steps to Achieve 2NF:
1. Identify composite primary keys (keys composed of more than one column).
2. Ensure every non-key attribute is fully dependent on the composite key.
3. Separate partially dependent attributes into new tables.
Example:
| OrderID | ProductID | ProductName | Quantity | Price |
Problem:
ProductName and Price only depend on ProductID.
Partial dependency exists.
Solution (2NF):
OrderID ProductID Quantity
ProductID ProductName Price
Third Normal Form (3NF)
Definition:
A table is in 3NF if:
1. It is in 2NF.
2. There are no transitive dependencies (non-key attributes depending on other non-
key attributes).
Steps to Achieve 3NF:
1. Identify attributes dependent on non-key attributes.
2. Create separate tables for independent data.
Example:
| StudentID | Course | Instructor | InstructorEmail |
Problem:
InstructorEmail depends on Instructor, not directly on StudentID.
Solution (3NF):
StudentI Course Instructor
D
| Instructor | InstructorEmail |
3. Algorithms for Normalization (30 minutes)
Algorithm for 1NF:
1. Identify multi-valued columns.
2. Break down into atomic values.
3. Ensure each row has a unique identifier.
Algorithm for 2NF:
1. Ensure the table is in 1NF.
2. Identify partial dependencies (non-key attributes dependent on part of a composite key).
3. Create new tables for attributes partially dependent on composite keys.
Algorithm for 3NF:
1. Ensure the table is in 2NF.
2. Identify transitive dependencies (non-key attributes dependent on other non-key attributes).
3. Create new tables for attributes causing transitive dependency.
4. Practice and Activities (30 minutes)
Activity 1: Practice Normalizing a Database
Unnormalized Table:
EmpI Name Dept Project Hours
D
101 Alice HR Payroll 20
101 Alice HR Recruitment 15
102 Bob IT Developmen 30
t
Tasks:
1. Convert the table to 1NF.
2. Normalize to 2NF.
3. Normalize to 3NF.
Expected Solution:
1NF: Separate multi-valued columns into rows.
2NF: Split tables based on partial dependencies (e.g., Dept-Project).
3NF: Separate transitive dependencies (e.g., Dept details).
Activity 2: Quiz: Identify Non-Normalized Data
Present small tables with anomalies.
Ask students to:
o Identify issues (redundancy, anomalies).
o Suggest steps for normalization.
5. Summary and Wrap-Up (10 minutes)
Recap:
o Benefits of normalization: reduced redundancy, improved integrity, and consistency.
o Key steps and forms of normalization (1NF → 3NF).
Homework: Normalize a provided dataset to 3NF and write steps taken.