University of Sulaymaniyah
College of basic education
Computer Science department
3nd Stage
Comprehensive Report
on Database Normalization
Prepared By:
Osan Salah Jamil
Zhwan Othman Majid
Supervised By:
T. Payman rahem
(2024 – 2025)
Table of Contents
Table of Contents ................................................................................................ 1
Introduction ......................................................................................................... 3
Key Principles of Normalization ........................................................................... 4
1. Atomicity:.................................................................................................. 4
2. Reduction of Redundancy: ....................................................................... 4
3. Data Integrity: ........................................................................................... 4
Types of Anomalies Addressed ........................................................................... 4
1. Insertion Anomaly:.................................................................................... 4
2. Update Anomaly: ...................................................................................... 4
3. Deletion Anomaly: .................................................................................... 4
Normalization Forms ........................................................................................... 5
First Normal Form (1NF) ................................................................................... 5
Second Normal Form (2NF).............................................................................. 5
Third Normal Form (3NF).................................................................................. 5
Boyce-Codd Normal Form (BCNF) ................................................................... 5
Fourth and Fifth Normal Forms (4NF, 5NF) ...................................................... 5
Real-World Applications ...................................................................................... 6
1. Banking Systems...................................................................................... 6
2. E-Commerce ............................................................................................ 6
3. Healthcare ................................................................................................ 6
Advantages and Challenges ................................................................................ 7
Advantages....................................................................................................... 7
• Storage Efficiency .................................................................................... 7
1
• Improved Query Performance .................................................................. 7
• Data Integrity ............................................................................................ 7
Challenges ....................................................................................................... 7
• Complexity ............................................................................................... 7
• Performance Trade-offs............................................................................ 7
Conclusion .......................................................................................................... 8
References .......................................................................................................... 9
2
Introduction
Database normalization is a fundamental concept in relational database design,
introduced by Edgar Codd in the 1970s. It aims to reduce data redundancy,
prevent data anomalies, and ensure the consistency and integrity of the data.
For an in-depth understanding, reference books such as Database System
Concepts by Abraham Silberschatz, An Introduction to Database Systems by C.J.
Date, and Fundamentals of Database Systems by Ramez Elmasri and Shamkant
B. Navathe offer extensive insights.
3
Key Principles of Normalization
Normalization works based on a few key principles:
1. Atomicity: Ensures each column contains indivisible values (i.e., no
repeating groups).
2. Reduction of Redundancy: Prevents storing the same data multiple
times.
3. Data Integrity: Ensures consistency and correctness of data through
constraints and relationships.
These principles are closely tied to the concept of functional dependencies,
which are explained in Fundamentals of Database Systems by Elmasri and
Navathe.
Types of Anomalies Addressed
Normalization addresses three primary types of anomalies, all of which stem from
redundancy in data storage:
1. Insertion Anomaly: Occurs when certain attributes can't be added
without adding unrelated data. For instance, adding a new employee
requires adding an associated department, even if the department doesn't
change.
2. Update Anomaly: Inconsistent updates due to duplicated data. If a
department name is stored in multiple rows, an update to one row might not
propagate to all rows, leading to discrepancies.
3. Deletion Anomaly: Loss of useful data during deletion. For example, if
an employee is the only person in a department, deleting the employee
record would result in the loss of the department’s information.
Redundancy—the unnecessary repetition of data—underlies all these anomalies.
By eliminating redundancy through normalization, these issues are minimized,
and data integrity is preserved.
4
Repeating Groups and Atomicity
In First Normal Form (1NF), one of the main goals is to eliminate repeating
groups. A repeating group occurs when multiple values for a single attribute
(column) are stored in one record. This violates the principle of atomicity, which
dictates that each column should hold a single, indivisible value.
Example: Consider a table where each student can have multiple subjects.
Storing all subjects in one cell (e.g., "Math, Science, English") violates 1NF.
Instead, we split this into separate rows for each subject, maintaining atomicity
and ensuring the data is properly structured.
Normalization Forms
First Normal Form (1NF)
• Ensures that every field contains atomic data (no multiple values or
repeating groups).
o Example: A table containing multiple subjects per student should be
split into separate rows for each subject.
Second Normal Form (2NF)
• Removes partial dependencies, where non-key attributes depend on only
part of a composite key.
o Example: A table that combines employee and project details should
be split into two tables.
Third Normal Form (3NF)
• Removes transitive dependencies, where non-key attributes depend on
other non-key attributes.
o Example: Splitting a department location from employee data.
Boyce-Codd Normal Form (BCNF)
• A stricter version of 3NF, ensuring that all determinants are candidate keys.
o Example: Handling overlapping candidate keys in complex data.
Fourth and Fifth Normal Forms (4NF, 5NF)
• Deal with multi-valued and join dependencies, respectively, ensuring even
more refined data structures.
5
Functional Dependency
Functional dependency refers to a relationship where one attribute (or set of
attributes) uniquely determines another. For example, in a table where each
employee has a unique ID, the employee’s ID determines their name, position,
and salary.
• In 2NF, we ensure that non-key attributes depend on the entire primary key.
If a non-key attribute depends only on part of a composite key, it violates
2NF.
• In 3NF, we remove transitive dependencies, where non-key attributes
depend on other non-key attributes.
Understanding and properly applying functional dependencies is essential for
breaking down tables in normalization and ensuring that data dependencies are
logically and efficiently represented.
Real-World Applications
Normalization is crucial in various industries:
1. Banking Systems: Prevents redundancy between account details,
transactions, and customer information.
2. E-Commerce: Reduces repetition in product catalogs and order details.
3. Healthcare: Ensures atomic records for patients and their medical
histories.
6
Advantages and Challenges
Advantages
• Storage Efficiency: Reduces unnecessary data storage by eliminating
redundancy.
• Improved Query Performance: Simplifies data retrieval, especially in large
databases.
• Data Integrity: Prevents errors caused by inconsistent data.
Challenges
• Complexity: Over-normalization can create a highly fragmented database
structure.
• Performance Trade-offs: Denormalization might be needed for query
performance at scale, which can reintroduce redundancy.
7
Conclusion
Database normalization is essential for organizing relational databases by
minimizing redundancy, preventing anomalies, and ensuring data integrity. By
applying normalization forms such as 1NF, 2NF, 3NF, and BCNF, we optimize the
database structure, which is crucial for the smooth functioning of systems in
sectors like banking, e-commerce, and healthcare.
However, normalization also comes with challenges, particularly in the form of
over-normalization, which can complicate queries and affect performance. While
normalization is fundamental, denormalization might be necessary in specific
scenarios where speed and query performance are more critical than minimizing
redundancy.
As database systems evolve, newer models like NoSQL and NewSQL challenge
traditional normalization practices, but understanding these techniques remains
key for database administrators and developers. Ultimately, normalization ensures
efficient, consistent, and reliable data storage, making it a cornerstone of modern
database design.
8
References
1. Database System Concepts by Abraham Silberschatz, Henry Korth, and S.
Sudarshan. This book provides comprehensive coverage of relational
database theory and includes extensive examples on normalization.
2. An Introduction to Database Systems by C.J. Date. Date's book is
considered a classic in the field of databases and covers the theory and
practice of normalization in great detail.
3. Fundamentals of Database Systems by Ramez Elmasri and Shamkant B.
Navathe. This textbook provides a detailed treatment of database design
principles and is widely used in academic courses on databases.
4. Database Systems: A Practical Approach to Design, Implementation, and
Management by Thomas Connolly and Carolyn Begg. This book includes
practical approaches to database normalization and real-world
implementation advice.
5. Online resources such as DataCamp’s guide to SQL normalization and
Guru99's normalization tutorial also provide hands-on examples and further
reading.