1.Normalization.
1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Normalization is the process of organizing the data in the database.
Normalization is used to minimize the redundancy from a relation or set of relations.
It is also used to eliminate undesirable characteristics like Insertion, Update, and
Deletion Anomalies.
There are various level of normalization. These are some of them:
1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce-Codd Normal Form (BCNF)
5. Fourth Normal Form (4NF)
6. Fifth Normal Form (5NF)
First Normal Form (1NF):
If a relation contains a composite or multi-valued attribute, it violates the first
normal form, or the relation is in first normal form if it does not contain any
composite or multi-valued attribute.
A relation is in first normal form if every attribute in that relation is singled valued
attribute.
A table is in 1 NF if:
1. There are only Single Valued Attributes.
2. Attribute Domain does not change.
3. There is a unique name for every Attribute/Column.
4. The order in which data is stored does not matter.
Consider the examples given below:
Relation STUDENT in table 1 is not in 1NF because of multi-valued attribute
STUD_PHONE. Its decomposition into 1NF has been shown in table 2.
Second Normal Form (2NF):
Second Normal Form (2NF) is based on the concept of full functional dependency.
To be in second normal form, a relation must be in first normal form and relation
must not contain any partial dependency.
A relation is in 2NF if it has No Partial Dependency, i.e., no non-prime attribute
(attributes which are not part of any candidate key) is dependent on any proper
subset of any candidate key of the table.
Example: Let's assume, a school can store the data of teachers and the subjects they
teach. In a school, a teacher can teach more than one subject.
In the given table, non-prime attribute TEACHER_AGE is dependent on TEACHER_ID
which is a proper subset of a candidate key. That's why it violates the rule for 2NF.
To convert the given table into 2NF, we decompose it into two tables:
Third Normal Form (2NF):
A relation is in third normal form, if there is no transitive dependency for non-prime
attributes as well as it is in second normal form.
A relation is in 3NF if at least one of the following condition holds in every non-
trivial function dependency X –> Y:
1. X is a super key.(candidate key)
2. Y is a prime attribute (each element of Y is part of some candidate key).
Example:
In relation STUDENT given in Table 4,
FD set:
{STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE, STUD_STATE ->
STUD_COUNTRY, STUD_NO -> STUD_AGE}
Candidate Key:
{STUD_NO}
For this relation in table 4, STUD_NO -> STUD_STATE and STUD_STATE ->
STUD_COUNTRY are true.
So, STUD_COUNTRY is transitively dependent on STUD_NO.
It violates the third normal form. To convert it in third normal form, we will
decompose the relation STUDENT (STUD_NO, STUD_NAME, STUD_PHONE,
STUD_STATE, STUD_COUNTRY_STUD_AGE) as:
1. STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE,
STUD_AGE)
2. STATE_COUNTRY (STATE, COUNTRY)
4.Boyce Codd normal form (BCNF)
Boyce Codd Normal Form is an advanced form of the third natural form and hence is
quite stricter than it.
If every functional dependency is in the form X → Y, the table is in BCNF. Here, X is
the super key to the table.
For a table to be in BCNF, it should be in 3NF. For every FD, LHS is the super key.
For example, let us consider a company which has employees in more than one
department.
In this table, functional dependencies are:
EMP_ID →EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate key: {EMP-ID, EMP-DEPT}
This table is not in BCNF because EMP_DEPT or EMP_ID are not alone keys.To convert to
BCNF, we break it down into three tables.
Here the functional dependencies are:
EMP_ID →EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
This is in BCNF because the left side of the two functional dependencies is a key.
5.Fourth normal form (4NF)
A relation will be in 4NF if it is in Boyce Codd normal form and has no multi-valued
dependency.
For a dependency A → B, if for a single value of A, multiple values of B exists, then
the relation will be a multi-valued dependency.
Example:
The given STUDENT table is in 3NF, but the COURSE and HOBBY are two independent
entities. Hence, there is no relationship between COURSE and HOBBY.
In the STUDENT relation, a student with STU_ID, 21 contains two
courses, Computer and Math and two hobbies, Dancing and Singing. So, there is a
multi-valued dependency on STU_ID, which leads to unnecessary repetition of data.
So, to make the above table into 4NF, we can decompose it into two tables:
NORMAL FORMS CONDITION
1NF No multivalued attribute
Only single valued
2NF 1NF
No partial dependency
Only full dependency
3NF 2NF
No transitivity dependency
No non-prime should determine
non-prime
BCNF 3NF
LHS must be candidate key or
super key
4NF BCNF
No Multivalued dependency
5NF 4NF
Lossless Decomposition
2.
Inference Rule (IR)(ARMSTRONG):
The Functional dependency has 6 types of inference rule:
1. Reflexive Rule (IR1)
In the reflexive rule, if Y is a subset of X, then X determines Y.
If X ⊇ Y then X→Y
2. Augmentation Rule (IR2)
The augmentation is also called as a partial dependency. In augmentation, if X
determines Y, then XZ determines YZ for any Z.
If X → Y then XZ → YZ
3. Transitive Rule (IR3)
In the transitive rule, if X determines Y and Y determine Z, then X must also
determine Z.
If X → Y and Y→ Z then X → Z
4. Union Rule (IR4)
Union rule says, if X determines Y and X determines Z, then X must also determine Y
and Z.
If X → Y and X → Z then X → YZ
5. Decomposition Rule (IR5)
Decomposition rule is also known as project rule. It is the reverse of union rule.
This Rule says, if X determines Y and Z, then X determines Y and X determines Z
separately.
If X → YZ then X → Y and X → Z
6. Pseudo transitive Rule (IR6)
In Pseudo transitive Rule, if X determines Y and YZ determines W, then XZ
determines W.
If X → Y and YZ → W then XZ → W
3.Finding a Minimal cover F for a set of functional dependencies E