Database Normalization
(FUNCTIONAL DEPENDENCY & DML ANOMALIES)
      What is Normalization?
   Normalization is the process of organizing the data in the database.
   Normalization is used to minimize redundancy in a relation or set of
    relations. It is also used to eliminate undesirable characteristics such as insertion,
    update and deletion anomalies.
   Normalization divides a larger table into smaller tables and links them using
    relationships.
   Normal forms are used to reduce redundancy in database tables.
    Functional Dependency
   A functional dependency (FD) is a constraint between two sets of attributes in a relation.
    It says that if two tuples have the same values for attributes A1, A2, ...,
    An, then those two tuples must also have the same values for attributes B1, B2, ..., Bn.
   A functional dependency is represented by an arrow sign (→), that is, X → Y, where X
    functionally determines Y; equivalently, Y is functionally dependent on X.
   The left-hand side attributes determine the values of the attributes on the right-hand side.
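
The definition above can be checked mechanically on sample data. The following is a minimal sketch, not a definitive implementation: the helper name fd_holds and the three sample rows are assumptions made for illustration. Note that passing the check only shows that the given rows do not violate the FD; it does not prove that the FD holds for the relation in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first row with this lhs value fixes the expected rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Illustrative sample rows (assumed for this sketch).
employees = [
    {"Emp_id": "E1", "Ename": "Aditya", "Emp_City": "Pune"},
    {"Emp_id": "E3", "Ename": "Paritosh", "Emp_City": "Mumbai"},
    {"Emp_id": "E10", "Ename": "Mehak", "Emp_City": "Mumbai"},
]

print(fd_holds(employees, ["Emp_id"], ["Ename"]))    # True
print(fd_holds(employees, ["Emp_City"], ["Ename"]))  # False
```

Emp_City → Ename fails because two rows share the city Mumbai but have different names.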
     Armstrong's Axioms
     (Functional Dependency Rules)
   Reflexive rule − If alpha is a set of attributes and beta is a subset of
    alpha, then alpha → beta holds.
   Augmentation rule − If a → b holds and y is a set of attributes, then ay
    → by also holds. That is, adding the same attributes to both sides of a
    dependency does not change the basic dependency.
   Transitivity rule − Same as the transitive rule in algebra: if a → b holds
    and b → c holds, then a → c also holds. Here a → b is read as "a
    functionally determines b".
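
One common way to apply these rules in practice is to compute the closure of an attribute set: every attribute that can be derived from it using the given FDs. The sketch below is a hedged illustration; the function name closure and the representation of FDs as (lhs, rhs) pairs of sets are assumptions, not something prescribed by these notes.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)  # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If lhs is already derivable, rhs is derivable too (transitivity).
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# The transitivity example from the rules: a -> b and b -> c.
fds = [({"a"}, {"b"}), ({"b"}, {"c"})]
print(sorted(closure({"a"}, fds)))  # ['a', 'b', 'c']
```

Because c appears in the closure of a, the dependency a → c follows from the two given FDs.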
      Types of Functional Dependency
   Trivial − If a functional dependency (FD) X → Y holds, where Y is a subset of
    X, then it is called a trivial FD. Trivial FDs always hold.
   Non-trivial − If an FD X → Y holds, where Y is not a subset of X, then it is
    called a non-trivial FD.
   Completely non-trivial − If an FD X → Y holds, where X ∩ Y = ∅, it is
    said to be a completely non-trivial FD.
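
The three cases can be told apart with simple set operations. This is a small sketch under the definitions above; the function name classify_fd and the attribute names A, B, C are hypothetical.

```python
def classify_fd(lhs, rhs):
    """Classify the FD lhs -> rhs by how the two attribute sets overlap."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                 # Y is a subset of X
    if lhs & rhs:
        return "non-trivial"             # Y is not a subset of X, but they overlap
    return "completely non-trivial"      # X and Y share no attributes


print(classify_fd({"A", "B"}, {"A"}))       # trivial
print(classify_fd({"A", "B"}, {"B", "C"}))  # non-trivial
print(classify_fd({"A"}, {"B"}))            # completely non-trivial
```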
     Need for Normalization
   Normalization is the process of minimizing redundancy in a
    relation or set of relations. Redundancy in a relation may cause
    insertion, deletion and update anomalies.
   Normal forms are used to eliminate or reduce this redundancy in
    database tables.
      Anomalies
   Update anomalies − If data items are scattered and not properly linked to each
    other, this can lead to inconsistencies. For example, when we try to
    update one data item whose copies are scattered over several places, a few
    instances get updated properly while a few others are left with old values. Such
    instances leave the database in an inconsistent state.
   Deletion anomalies − We try to delete a record, but parts of it are left
    undeleted because, unknown to us, the data is also saved somewhere else.
   Insert anomalies − We try to insert data into a record that does not exist at all.
Emp_id   Ename      Emp_City   Emp_dept   Anomaly illustrated
E1       Aditya     Pune       D1         Update anomaly
E1       Aditya     Pune       D2         Update anomaly
E3       Paritosh   Mumbai     D4         Delete anomaly
E9       Priya      Delhi      D6
E9       Priya      Delhi      D11        Insert anomaly
E10      Mehak      Mumbai     D2
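
The update anomaly in the table can be reproduced with a few lines of code. The sketch below is illustrative only: the department column is assumed to be named Emp_dept, the helper cities_per_employee is hypothetical, and the new city "Nagpur" is an invented value used to simulate a partial update.

```python
# First three rows of the table; E1 appears twice, once per department.
emp = [
    {"Emp_id": "E1", "Ename": "Aditya", "Emp_City": "Pune", "Emp_dept": "D1"},
    {"Emp_id": "E1", "Ename": "Aditya", "Emp_City": "Pune", "Emp_dept": "D2"},
    {"Emp_id": "E3", "Ename": "Paritosh", "Emp_City": "Mumbai", "Emp_dept": "D4"},
]


def cities_per_employee(rows):
    """Collect the set of cities recorded for each employee id."""
    cities = {}
    for row in rows:
        cities.setdefault(row["Emp_id"], set()).add(row["Emp_City"])
    return cities


# Partial update: only one of the two E1 rows is changed ("Nagpur" is hypothetical).
emp[0]["Emp_City"] = "Nagpur"

# An employee with more than one recorded city is now inconsistent.
inconsistent = {
    emp_id: sorted(cities)
    for emp_id, cities in cities_per_employee(emp).items()
    if len(cities) > 1
}
print(inconsistent)  # {'E1': ['Nagpur', 'Pune']}
```

If the employee's city were stored once, in a separate table keyed by Emp_id, a single update could not leave two conflicting values behind.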
Normal Form   Description
1NF           A relation is in 1NF if every attribute contains only atomic values.
2NF           A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally
              dependent on the primary key.
3NF           A relation is in 3NF if it is in 2NF and no transitive dependency exists.
              (Transitive functional dependency of a non-prime attribute on any super key should
              be removed.)
BCNF          A relation is in BCNF if it is in 3NF and, for every non-trivial functional dependency
              X → Y, X is a super key of the table.
4NF           A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued
              dependency.
5NF           A relation is in 5NF if it is in 4NF, contains no join dependency, and joining
              is lossless.
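
The BCNF condition in the table can be tested by checking, for each non-trivial FD, whether its left-hand side determines every attribute of the relation. The following is a rough sketch under stated assumptions: the helper names closure and bcnf_violations are hypothetical, and the single FD Emp_id → Ename, Emp_City is assumed for the employee table in these notes (it is not stated in the source).

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose left-hand side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FDs never violate BCNF
        if closure(lhs, fds) != set(all_attrs):
            violations.append((lhs, rhs))
    return violations


# Assumed schema and FD for the employee table.
attrs = {"Emp_id", "Ename", "Emp_City", "Emp_dept"}
fds = [({"Emp_id"}, {"Ename", "Emp_City"})]

violations = bcnf_violations(attrs, fds)
print(len(violations))  # 1
```

Emp_id alone does not determine Emp_dept, so it is not a super key and the assumed FD violates BCNF. Splitting the table into (Emp_id, Ename, Emp_City) and (Emp_id, Emp_dept) would be one way to remove the violation.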
1st Normal Form
An attribute (column) of a table cannot hold multiple values. It should hold only
atomic (single) values.
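
As an illustration, a row whose column holds a list of values can be brought into 1NF by emitting one row per value. This is a minimal sketch: the function name to_1nf is hypothetical, and storing the departments as a list in a column named Emp_dept is an assumed starting point chosen to match the employee data in these notes.

```python
def to_1nf(rows, multi_attr):
    """Split each row into one row per value of the multi-valued attribute."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = dict(row)          # copy the other columns unchanged
            new_row[multi_attr] = value  # store a single atomic value
            flat.append(new_row)
    return flat


# Not in 1NF: Emp_dept holds several values in one column.
raw = [
    {"Emp_id": "E1", "Ename": "Aditya", "Emp_dept": ["D1", "D2"]},
    {"Emp_id": "E3", "Ename": "Paritosh", "Emp_dept": ["D4"]},
]

flat = to_1nf(raw, "Emp_dept")
for row in flat:
    print(row["Emp_id"], row["Ename"], row["Emp_dept"])
# E1 Aditya D1
# E1 Aditya D2
# E3 Paritosh D4
```

The result satisfies 1NF but repeats the employee name, which is the kind of redundancy the higher normal forms address.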