Normalization
Example
Normalization - Student Example The following table depicts the set of attributes found in a University database:
Student -No 001 StudentName Smith CourseCode A203 CourseLength (yrs) 3 UnitCode U45 U87 U86 U45 U25 U12 U46 U12 U86
Unit-Name
Databases II Programming Algorithms Databases II Business I Business II Databases I Business II Algorithms
Lecturer
Brown Green Purple Brown Red Pink Orange Pink Purple
003
Soap
A104
007
Who
A203
010
Lemon
A323
Notes: A student attends one course and can take any units during the course. A unit may be presented as part of any course and is always given by one particular lecturer. You are required to show the first, second and third normal forms.
First Normal Form
"Remove repeating groups" The attribute chosen as the primary key is student-no. The set of attributes which repeat for each value of student-no are unit-code, unit-name, and lecturer. Removing these attributes from the full attribute set produces the relation: (student-no, student-name, coursecode, course-length)
Student -No 001 001 003 003 003 007 007 010 010
StudentName Smith Smith Soap Soap Soap Who Who Lemon Lemon
CourseCode A203 A203 A104 A104 A104 A203 A203 A323 A323
CourseLength (yrs)
3 3 4 4 4 3 3 2 2
UnitCode U45 U87 U86 U45 U25 U12 U46 U12 U86
Unit-Name Databases II Programming Algorithms Databases II Business I Business II Databases I Business II Algorithms
Lecturer Brown Green Purple Brown Red Pink Orange Pink Purple
BACK
First Normal Form
The primary key is student-no. (student-no, student-name, coursecode, course-length) The primary key for the repeating group is unit-code. This is because for each student-no the unit-code uniquely identifies the unit-name and lecturer attributes. For example, for student 001, unit U45 is always 'Databases II' and the lecturer is always 'Brown'. The new relation is: (unit-code, unit-name, lecturer)
First Normal Form
There are no more repeating groups. The first normal form DD:
Second Normal Form
"Remove partial dependencies - attributes rely on the whole key "
unit-code -> unit-name, lecturer unit-code -> unit-name The primary key determines all attributes. Each unit has a name. Each unit is taught by the same lecturer.
unit-code -> lecturer
Second Normal Form
Therefore, a partial dependency exists between unit-code and unit-name and lecturer. Removing the partial dependencies from student (but not changing the key of student) produces the relations: student(student-no, student-name, course-code, course-length) unit(unit-code, unit-name, lecturer)
Second Normal Form
The second normal form DD:
Third Normal Form
"Remove transitive dependencies - attributes rely on nothing but the key" The student relation has the following functional dependencies:
student-no -> student-name, The primary key determines all course-code, course-length attributes.
course-code -> course-length
The course length is determined by the course.
Third Normal Form
Therefore, the following transitive dependency exists: student-no -> course-code -> course-length Removing this transitive dependency from student produces the following relations: student(student-no, studentname, course-code) course(course-code, courselength)
Third Normal Form
The unit relation contains no non-key attributes and so contains no transitive dependencies. The unit relation contains the following dependencies:
unit-code -> unit-name, The primary key determines all lecturer attributes.
Third Normal Form
Therefore, there are no transitive dependencies in unit. The set of third normal form relations are: student(student-no, studentname, course-code) course(course-code, courselength) unit(unit-code, unit-name, lecturer)
Third Normal Form
The set of third normal form DD:
So, to begin the normalization process, we start by moving from zero normal form to 1st normal form.
The definition of 1st normal form there are no repeating groups all the key attributes are defined
all attributes are dependent on the primary key
A table is in 2nd normal form if
it's in 1st normal form
it includes no partial dependencies (where an attribute is dependent on only a part of a primary key).
The definition of 3rd normal form
It's in 2nd normal form
It contains no transitive dependencies (where a non-key attribute is dependent on another nonkey attribute).
There are other normal forms Boyce-Codd normal form, and 4th normal form, but these are very rarely used for business applications. In most cases, tables in 3rd normal form are already in these normal forms anyway.