DATABASE NORMALIZATION
Database Normalization is a technique of organizing the data in the database.
Normalization is a systematic approach of decomposing tables to eliminate data redundancy
and undesirable characteristics such as insertion, update and deletion anomalies. It is a
multi-step process that puts data into tabular form by removing duplicated data from the
relation tables.
Normalization is used mainly for two purpose: Eliminating Redundant data and Ensuring
data dependencies i.e data is logically stored.
PROBLEM WITHOUT NORMALIZATION
If a table is not properly normalized and has data redundancy then it will not only eat up
extra memory space but will also make it difficult to handle and update the database,
without facing data loss. Insertion, Updating and Deletion Anomalies are very frequent if
database is not normalized.
NORMALIZATION RULES
1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce- Codd Normal Form (4NF)
FIRST NORMAL FORM
The inventor of the relational model Edgar Codd proposed the theory of database
normalization to reduce data redundancy and improve data integrity and consistency. With
the introduction of first normal form, he continued to extend the theory with second and
third normal form. Later on, Raymond Boyce also joined to develop the theory of Boyce-
Codd Normal form.
In other to understand this normalization rules, the following Student table will be used.
Assuming a Student table is created to store information or records of student, without any
normalization, all information is stored in one table as shown below:
StudNames StudAddress CoursesOffered Salutation CourseStatus
Janet Jackson Tanke CSC 717, CSC Miss. Core, Optional
718
Ifeoluwa Daniel GRA CSC 719, CSC Mr. Optional, Core
720
Ifeoluwa Daniel Oke-Odo CSC 718 Mr. Optional
In this table, courses offered has multiple values. In other for this table to be in first normal
form, the rules of first normal form must be applied.
First normal form rules
- Each table cell should contain a single value
- Each record needs to be unique.
Following these rules, the table becomes:
StudNames StudAddress CoursesOffered Salutation CourseStatus
Janet Jackson Tanke CSC 717 Miss. Core
Janet Jackson Tanke CSC 718 Miss. Optional
Ifeoluwa Daniel GRA CSC 719 Mr. Optional
Ifeoluwa Daniel GRA CSC 720 Mr. Core
Ifeoluwa Daniel Oke-Odo CSC 718 Mr. Optional
Note: This Student table above is now in first normal form
Primary Key: A primary key is a single column value used to identify a database record
uniquely.
It has following attributes
A primary key cannot be NULL
A primary key value must be unique
The primary key values cannot be changed
The primary key must be given a value when a new record is inserted.
Composite Key: Is a primary key that is composed of multiple columns used to identify a
record uniquely. In the Student table above, there are two records with the same names but
different address, therefore both Names and Address will be required to identify a record
uniquely. This is a composite key.
Foreign Key: It references the primary key of another Table. It helps to connect Tables in a
database. Its attributes are:
A foreign key can have a different name from its primary key
It can only have values present in the primary key
It ensures rows in one table have corresponding rows in another
Unlike the Primary key, they do not have to be unique. Most often they aren't
Foreign keys can be null even though primary keys can not
Second Normal Form Rules
- It must be in first normal form
- Have single column primary key
StudNum StudNames StudAddress Salutation
001 Janet Jackson Tanke Miss.
002 Ifeoluwa Daniel GRA Mr.
003 Ifeoluwa Daniel Okeodo Mr.
Table 1
StudNum CoursesOffered
001 CSC 717
001 CSC 718
002 CSC 719
002 CSC720
003 CSC 718
Table 2
The first normal table has been divided into two tables. Table 1 contains Student
information and Table 2 contains information on courses offered. A new column called
StudNum has been introduced which is the primary key for table 1. Therefore, records can
be uniquely identified.
StudNum in table 2 serve as a foreign key to the StudNum in Table 1.
Transitive Functional dependency: This occurs when changing a non- key column might
cause any of the other non-key columns to change. For instance in table 1, changing the
non-key column StudNames may change the salutation
Third normal form
- It must be in second normal form
- Has no transitive functional dependencies
StudNum StudNames StudAddress Salutation
001 Janet Jackson Tanke 2
002 Ifeoluwa Daniel GRA 1
003 Ifeoluwa Daniel Okeodo 1
Table 1
StudNum CoursesOffered
001 CSC 717
001 CSC 718
002 CSC 719
002 CSC720
003 CSC 718
Table 2
SalutationNum Salutations
1 Mr.
2 Miss.
3 Mrs.
Table 3
The table has been divided again and created a new table which stores Salutations.
There are no transitive functional dependencies, and hence the table is in 3NF. In Table 3
Salutation ID is primary key, and in Table 1 Salutation ID is foreign to primary key in Table 3.