KEMBAR78
Dbms Unit-III Dbms Normal Forms Notes | PDF | Information Science | Data Management
0% found this document useful (0 votes)
31 views13 pages

Dbms Unit-III Dbms Normal Forms Notes

The document discusses schema refinement and normalization in database design, emphasizing its purpose to minimize data redundancy and avoid anomalies like insertion, update, and deletion. It explains functional dependencies and various normal forms (1NF, 2NF, 3NF, BCNF, 4NF) that help ensure efficient database design. Key concepts include surrogate keys, lossless joins, and the importance of understanding data semantics for effective normalization.

Uploaded by

nihanthkante
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
31 views13 pages

Dbms Unit-III Dbms Normal Forms Notes

The document discusses schema refinement and normalization in database design, emphasizing its purpose to minimize data redundancy and avoid anomalies like insertion, update, and deletion. It explains functional dependencies and various normal forms (1NF, 2NF, 3NF, BCNF, 4NF) that help ensure efficient database design. Key concepts include surrogate keys, lossless joins, and the importance of understanding data semantics for effective normalization.

Uploaded by

nihanthkante
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 13

UNIT-III

Schema Refinement Normalization


I. Purpose of Normalization or Schema Refinement
II. Concept of functional dependency
III. Normal forms based on functional dependency (1NF, 2NF and 3 NF, Boyce-Codd normal form (BCNF), 4NF)
IV. Concept of Surrogate Key,
V. Lossless join and Dependency preserving decomposition

I. Purpose of Normalization or Schema Refinement

Introduction:
 Normalization is an essential part of database design.
 Normalization is also known as “Schema Refinement” in database design.
 The Schema Refinement divides larger tables to smaller tables and links them using relationships.
 Normalization is a “process of organizing the data in database to avoid data redundancy, insertion
anomaly, updateanomaly & deletion anomaly”.
(Or)
Normalization means “split the tables into small tables which will contain less number of attributes in such a way that
table design must not contain any problem of inserting, deleting, updating anomalies and guarantees no redundancy”.
 A good understating of the semantics of data helps the designer to build efficient design using the concept of
normalization.
 Problems Without Normalization:
If tables not properly normalized and have data redundancy then it will not only eat up extra memory space but will
also make it difficult to handle and update the database, without facing data loss. Insertion, Updating and
Deletion Anomalies are very frequent if database is not normalized. To understand these anomalies let us take
an example of a Student table.
Rollno Name Branch HOD OfficeTel
401 Akon CSE Mr. X 53337
402 Bkon CSE Mr. X 53337
403 Ckon CSE Mr. X 53337
404 Dkon CSE Mr. X 53337
o From the table above, we have data of 4 Computer Sci. students. As we can see, data for the field’s Branch, HOD
and OfficeTel is repeated for the students who are in the same branch in the college, this is Data Redundancy.
o Finally there are three types of Update anomalies or problems without Normalization:
1. Insertion Anomaly: Suppose for a new admission, until and unless a student opts for a branch, data of the student
cannot be inserted, or else we will have to set the branch information as NULL. This is known as Insertion anomaly.
2. Deletion Anomaly: In our Student table, two different information systems are kept together, Student information
and Branch information. Hence, at the end of the academic year, if student records are deleted, we will also lose
the branch information. This is known as Deletion anomaly.
3. Update Anomaly: What if Mr. X leaves the college? Or is no longer the HOD of computer science department? In
that case all the student records will have to be updated, and if by mistake we miss any record, it will lead to data
inconsistency. This is known as Update anomaly.

Purpose of Normalization:
 Minimize the redundancy in data.
 Remove insert, update, and delete anomalies during the database activities.
 Reduce the need to organize the data when it is modified or enhanced.
 Normalization reduces a complex user view to a set of small and sub groups of fields or relations. This process
helpsto design a logical data model known as conceptual data model.

1
II. Concept of Functional Dependency
Functional Dependency (FD):
 Functional Dependencies are fundamental to the process of Normalization i.e., Functional Dependency plays
key role indifferentiating good database design from bad database designs.
 A functional dependency is a “type of constraint that is a generalization of the notation of the key”.
 Functional Dependency describes the relationship between attributes (columns) in a table.
 Functional dependency is represented by an arrow sign (→).
 In other words, a dependency FD: “X → Y” means that the values of Y are determined by the values of X. Two
tuples sharing the same values of X will necessarily have the same values of Y. An attribute on left hand side is
known as “Determinant”. Here X is a Determinant.

Example1: Student Database


Name RollNo CGPA
A R1 7.6
B R2 5
C R3 9.2
A R4 9.1
B R5 8.7
Question1: What is the CGPA of A?
Answer: We cannot answer, because in database have 2 records with A’s. So we cannot answer for CGPA of A.
Note: But we can find the CGPA’s with respect to RollNo i.e., See the Answer for Question2.
Question2: What is the CGPA of R4?
Answer: R4→9.1
Note: Rollno → CGPA and RollNo → Name. i.e., here we are establishing the relationships. This type is known as
Functional dependency.
 Example 2: [Identifying the FD’s]
A B C D
A1 B1 C1 D1
A1 B2 C1 D2
A2 B2 C2 D2
A2 B2 C2 D3
A3 B3 C2 D4
Case1: A →B
Here A1 belongs to B1 & B2. So A1 does not have unique value in B. So it is not in FD.
Case1: A →C
Here A1→C1 and A2, A3→C2. So A has unique values in B. So it is in FD.
Note: try to find all the possibilities. i.e., A→D, B→C, B→D, and C→D
 There are Different types of Functional Dependency. They are:
1. Fully Functional Dependency: A functional dependency is said to be full dependency “if and only if the determinant
of the functional dependency if either candidate key or super key, and the dependent can be either prime or non-
prime attribute”.
(OR)
Let’s take the functional dependency X → Y (i.e., X determines y). Here Y is said to be fully determinant, if it
cannot determine any subset of X.
Example: Consider the following determinant ABC → D i.e., ABC determines D but D is not determined by any
subset of A/ BC/C/B/AB i.e., BC→D, C→D, A→D Functional dependencies are not exists. So D is Fully Functional
Dependent.

2
2. Partial Functional Dependency: If a non-prime attribute of the relation is getting derived by only a part of the
candidate key, then such dependency is known as Partial Dependency.
(OR)
In a relation having more than one key field, a subset of non key fields may depend on all key fields but
another subset or a particular non-key field may depend on only one of the key fields. Such dependency is defined
as Partial Dependency.
Example: Consider the following determinants AC→P, A→D, D→P. From these determinants P is not fully FD on
AC. Because, If we find A+ (means A’s Closure) A→D, D→P i.e., A→P. But we don’t have any requirement of C. C
attribute is removed completely. So P is Partially Dependent on AC
3. Transitive Functional Dependency: If a non-prime attribute of a relation is getting derived by either another non-
prime attribute or the combination of the part of the candidate key along with non-prime attribute, then such
dependency is defined as Transitive dependency. i.e., in a relation, there may be dependency among non-key
fields. Such dependency is called Transitive Functional Dependency.
Example: X→Y, and Y→Z then we can determine X→Z holds.
4. Trivial Functional Dependency: It is basically related to Reflexive rule. i.e., if X is a set of attributes, and Y is subset
of X then X→Y holds.
Example: ABC→BC is a Trivial Dependency.
5. Multi-Valued Dependency: Consider 3 fields X, Y, and Z in a relation. If for each value of X, there is a well-defined
set of values Y and Well-defined set of values of Z and set of values of Y is independent of the set values of Z. This
dependency is Multi-valued Dependency. i.e., X →Y / Z.
 Closure of a set of Functional Dependencies: The set of FD’s that is logically implied by F is called the closure of F and
written as F+. And it is defined as “If F is a set FD’s on a relation R, the F +, the closure of F by using the inferences
axioms that are not contained in F+.
Example: R (A, B, C, D) and set of Functional Dependencies are A→B, B→D, C→B then what is the Closure of A?
Solution: A+ is
A+→ {A, B, D} i.e., A→B, B→D is exists and C is not FD on A. So it is eliminated.
 Decomposition Rule: If X→YZ holds, then X→Y holds and X→Z holds.
 Union Rule: If X→Y holds and X→Z holds then X→YZ holds.
 Database Keys: They are used to establish and identify the relation between tables. They can also ensure that each
record within a table can be uniquely identified by combination of one or more fields with in a table.
1. Super Key: An attribute or the combination of attributes is defined as the super key if they derive all attributes of
the relation. Super key is defined as the set of attributes with in a relation that uniquely identify each record in a
table.
2. Candidate Key: An attribute or the combination of the attributes is defined as the candidate key. They derive all the
attribute of the relation. They are minimum level of subset.
3. Prime or Key Attribute: The attributes of the relation which exists as path of any possible candidate key is called
Prime or key attribute.
4. Primary key: It is a candidate key that is most appropriate to become the main key of the table. It is the key
uniquely identifies the records in a table.
5. Composite Key: A key that consists o0f two or more attributes that uniquely identifies an entity occurrence is called
composite key.
6. Secondary Key or Alternative key: The candidate keys which are not selected for primary key are known as
secondary key.
7. Foreign Key: Tuples in one relation, say r1 (R1) often need to refer tuples in another relation, say r2 (R2).

III. Normal forms based on functional dependency (1NF, 2NF and 3 NF, Boyce-Codd normal form (BCNF), 4NF)

 The inventor of the relational model “Edgar Codd” proposed the theory of normalization with the introduction of First
Normal Form, and he continued to extend theory with Second and Third Normal Form. Later he joined with Raymond F.
Boyce to develop the theory of Boyce-Codd Normal Form.
 When Normalization process gets applied on any given relation it performs the following activities:
 It finds the current normal form of the relation.
 Normalization process decomposes the relation from its current Normal Form to higher Normal Form.
3
 Normalization means “split the tables into small tables which will contain less number of attributes in such a way that table
design must not contain any problem of inserting, deleting, updating anomalies and guarantees no redundancy”.
 The evolution of Normalization theories / Steps of Normalization / Different Normal Forms is illustrated below-
1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce-Codd Normal Form (BCNF)
5. Fourth Normal Form (4NF)
6. Fifth Normal Form (5NF).

Note: In practice, 1NF, 2NF, and 3NF are enough for database.
 First Normal Form (1NF): A relation is said to in the 1NF if it is already in un-normalized form and It satisfies the following
conditions or rules or qualifications are:
1. Each attribute name must be unique.
2. Each attribute value must be single or atomic i.e., Single Valued Attributes.
3. Each row / record must be unique.
4. There is no repeating group’s.
 An atomic value is a value that cannot be divided.
 A repeating group means that a table contains two or more columns that are closely related.
 Example 1: How do we bring an un-normalized table into first normal form? Consider the following example:

Solution:
This table is not in first normal form because the [Color] column can contain multiple values. For example, the first
row includes values "red" and "green." To bring this table to first normal form, we split the table into two tables and now
we have the resulting tables:

Now first normal form is satisfied, as the columns on each table all hold just one value.
 Example 2: First consider the following relation with sample data which is not in 1NF.
Roll_NO Name Subject
101 Akon OS, CN
103 Ckon JAVA
102 Bkon C, C++
o Our table already satisfies 2 rules out of the 4 rules, as all our column names are unique, we have stored data in the
order we wanted to and we have not inter-mixed different type of data in columns.
o But out of the 3 different students in our table, 2 have opted for more than 1 subject. And we have stored the
subject names in a single column. But as per the 1st Normal form each column must contain atomic value.
o Solution: It's very simple, because all we have to do is break the values into atomic values to convert the
relationinto 1NF.

4
Roll_NO Name Subject
101 Akon OS
101 Akon CN
103 Ckon JAVA
102 Bkon C
102 Bkon C++
Note: Now the above table is in 1NF.

 Second Normal Form (2NF): A relation is said to be in 2NF, if it is already in 1st NF and it has no Partial Dependency i.e., no
non-prime attribute is dependent on the only a part of the candidate key.
(OR)
A relation is in second normal form if it satisfies the following conditions:
 It is in first normal form
 All non-key attributes are fully functional dependent on the primary key.
 Note: Partial Functional Dependency: If a non-prime attribute of the relation is getting derived by only a part of the
candidate key, then such dependency is known as Partial Dependency
 In a table, if attribute B is functionally dependent on A, but is not functionally dependent on a proper subset of A, then B
is considered fully functional dependent on A. Hence, in a 2NF table, all non-key attributes cannot be dependent on a
subset of the primary key. Note that if the primary key is not a composite key, all non-key attributes are always fully
functional dependent on the primary key. A table that is in 1st normal form and contains only a single key as the primary
key is automatically in 2nd normal form.
 Example 1:Consider the following example:

o This table has a composite primary key [Customer ID, Store ID]. The non-key attribute is [Purchase Location]. In
this case, [Purchase Location] only depends on [Store ID], which is only part of the primary key. Therefore, this
table does not satisfy second normal form.
o To bring this table to second normal form, we break the table into two tables, and now we have the following:

o What we have done is to remove the partial functional dependency that we initially had. Now, in the table
[TABLE_STORE], the column [Purchase Location] is fully dependent on the primary key of that table, which is
[Store ID].
 Example 2: Consider the following relation and check whether the given relation is in 2NF or not. If it is not in 2NF convert
into 2NF. Let's consider tables for Subject, which will have subject_id and subject_name fields and subject_id will be the
primary key and Score table
subject_id subject_ score_id student_id subject_id marks teacher
name
1 Java 1 10 1 70 Java Teacher
2 C++ 2 10 2 75 C++ Teacher
3 Php 3 11 1 80 Java Teacher
Student Table Score Table
Solution: The given relation is not in 2NF. Because, if you look at the Score table, we have a column
5
Names teacher which is only dependent on the subject, for Java it's Java Teacher and for C++ it's C++ Teacher &
So on.
 Now as we just discussed that the primary key for this table is a composition of two columns which is
student_id & subject_id but the teacher's name only depends on subject, hence the subject_id, and has
nothing to do with student_id. This is Partial Dependency, where an attribute in a table depends on only a part
of the primary key and not on the whole key.
 The simplest solution is to remove columns teacher from Score table and add it to the Subject table. Hence, the
Subject and Score table will become:
subject_id subject_name teacher
1 Java Java Teacher
2 C++ C++ Teacher
3 Php Php Teacher
&
score_id student_id subject_id marks
1 10 1 70
2 10 2 75
3 11 1 80

 Third Normal Form (3NF): A database is in third normal form if it satisfies the following conditions:
 It is in 2NF.
 There is no transitive functional dependency
 By transitive functional dependency, we mean we have the following relationships in the table: A is functionally
dependent on B, and B is functionally dependent on C. In this case, C is transitively dependent on A via B.
 Example 1: Consider the following example:

o In the table able, [Book ID] determines [Genre ID], and [Genre ID] determines [Genre Type]. Therefore, [Book ID]
determines [Genre Type] via [Genre ID] and we have transitive functional dependency, and this structure does
not satisfy third normal form.
o To bring this table to third normal form, we split the table into two as follows:

o Now all non-key attributes are fully functional dependent only on the primary key. In [TABLE_BOOK], both [Genre ID]
and [Price] are only dependent on [Book ID]. In [TABLE_GENRE], [Genre Type] is only dependent on [Genre ID].
 Example 2: Consider the following relation and check whether it is in 3NF or not .If it is not in 3NF convert into 3NF.
score_id student_id subject_id marks exam_name total_marks
Solution: It is not in 3NF. Because,
o Primary key for our Score table is a composite key, which means it's made up of two attributes or columns
 → student_id + subject_id.
o Our new column exam_name depends on both student and subject. For example, a mechanical engineering
student will have Workshop exam but a computer science student won't. And for some subjects you have
Prctical exams and for some you don't. So we can say that exam_name is dependent on both student_id and
subject_id.
o And what about our second new column total_marks? Does it depend on our Score table's primary key?
6
o Well, the column total_marks depends on exam_name as with exam type the total score changes. For
example, practicals are of less mark while theory exams are of more marks.
o But, exam_name is just another column in the score table. It is not a primary key or even a part of the
primary key, and total_marks depends on it. This is Transitive Dependency. When a non-prime attribute
depends on other non-prime attributes rather than depending upon the prime attributes or primary key.
o Again the solution is very simple. Take out the columns exam_name and total_marks from Score table and
put them in an Exam table and use the exam_id wherever required.
Score Table: In 3rd Normal Form
score_id student_id subject_id marks exam_id
The new Exam table:
exam_id exam_name total_marks
1 Workshop 200
2 Mains 70
3 Practicals 30
 The advantage of removing transitive dependency is:
1. Amount of data duplication is reduced,
2. Data integrity achieved.

 Boyce-Codd normal form (BCNF): A relation is said to be in BCNF, if and only if every determinant should be a candidate
key.
 BCNF is the advance version of 3NF. It is stricter than 3NF.
 A table is in 3NF if for every functional dependency X → Y, X is the super key of the table.
 For BCNF, the table should be in 3NF and for every FD, LHS is super key.
 Example 1: Let's assume there is a company where employees work in more than one department. EMPLOYEE table:
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO
264 India Designing D394 283
264 India Testing D394 300
364 UK Stores D283 232
364 UK Developing D283 549
o In the above table Functional dependencies are as follows:
 EMP_ID → EMP_COUNTRY and EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
 Candidate key: {EMP-ID, EMP-DEPT}
o The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys. To convert the given table
intoBCNF, we decompose it into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
264 India
EMP_DEPT table:
EMP_DEPT DEPT_TYPE EMP_DEPT_NO
Designing D394 283
283 D394 300
Stores D283 232
549 D283 549
EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
7
o Functional dependencies: EMP_ID → EMP_COUNTRY and EMP_DEPT → {DEPT_TYPE,
EMP_DEPT_NO}
o Candidate keys:
 For first table: EMP_ID For second table: EMP_DEPT For third table: {EMP_ID, EMP_DEPT}
o Now this is in BCNF because left side part of both the functional dependencies is a key.
 Example 2: Below we have a college enrolment table with columns student_id, subject and professor
student_id subject professor
101 Java P.Java
101 C++ P.Cpp
102 Java P.Java2
103 C# P.Chash
104 Java P.Java
Solution: It is not in BCNF.
o One more important point to note here is, one professor teaches only one subject, but one subject may have
two different professors.
o Hence, there is a dependency between subject and professor here, where subject depends on the professor
name. This table satisfies the 1st Normal form because all the values are atomic, column names are unique
and all the values stored in a particular column are of same domain. This table also satisfies the 2nd Normal
Form as there is no Partial Dependency. And, there is no Transitive Dependency; hence the table also
satisfies the 3rd Normal Form.
o But this table is not in Boyce-Codd Normal Form. In the table above, student_id, subject form primary key,
which means subject column, is a prime attribute. But, there is one more dependency, professor →
subject. And while subject is a prime attribute, professor is a non-prime attribute, which is not allowed by
BCNF.
o To make this relation (table) satisfy BCNF, we will decompose this table into two tables, student
tableand professor table. Below we have the structure for both the tables.
Student Table
student_id p_id
101 1
101 2
And, Professor Table
p_id professor subject
1 P.Java Java
2 P.Cpp C++
 Example 3: A more Generic Explanation: In the picture below, we have tried to explain BCNF in terms of relations.
Solution:

8
 Fourth Normal Form (4NF): A relation said to be in 4NF if it is in Boyce Codd normal form and should have no multi-valued
dependency.
 For a dependency A→ B, if for a single value of A, multiple value of B exists then the relation will be multi-valued
dependency.
 Note: Multi Valued Dependency: A table is said to have multi-valued dependency, if the following conditions are
true,
1. For a dependency A → B, if for a single value of A, multiple value of B exists, then the table may have multi-
valued dependency.
2. Also, a table should have at-least 3 columns for it to have a multi-valued dependency.
3. And, for a relation R (A, B, C), if there is a multi-valued dependency between, A and B, then B and C should be
independent of each other.
 If all these conditions are true for any relation (table), it is said to have multi-valued dependency.
 Example1: STUDENT
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey

o The given STUDENT table is in 3NF but the COURSE and HOBBY are two independent entity. Hence, there is
no relationship between COURSE and HOBBY. In the STUDENT relation, student with STU_ID, 21 contains
two courses, Computer and Math and two hobbies, Dancing and Singing. So there is a Multi-valued
dependency onSTU_ID, which leads to un-necessary repetition of data.
o So to make the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey

IV. Concept of surrogate key


 Concept of Surrogate Key:
 Alternate of Primary Key that allows duplication of data’s/records.
 Surrogate key is a unique identification key, it is like an artificial key to production key, because the production key
may be alphanumeric or composite key but the surrogate key is always single numeric key.
 A surrogate key has the following characteristics:
i. The value is never reused and is unique within the whole system.

9
ii. It is system generated and an integer.
iii. The value cannot be manipulated by the user or application.
iv. The value is not an amalgam of different values from multiple domains.
 A Surrogate Keys can be generated in a variety of ways, and most databases offers ways to generate surrogate keys.
Example: Oracle uses SEQUENCE,
MYSQL uses Auto_Increment, and
SQL Server uses IDENTITY.

V. Lossless join and Dependency preserving decomposition

 Decomposition: Decomposition of a relation is done when a relation in relational model is not in appropriate normal form.
Relation R is decomposed into two or more relations if decomposition is lossless join as well as dependency preserving.
 If we decompose a relation R into relations R1 and R2, There are 2 types of decomposition:
1. Decomposition is lossy, if R1 ⋈ R2 ⊃ R.
 The decomposition of relation R into R1 and R2 is lossy when the join of R1 and R2 does not yield the same relation as
in R." One of the disadvantages of decomposition into two or more relational schemes (or tables) is that some
information is lost during retrieval of original relation or table.
 Consider that we have table STUDENT with three attribute roll_no , sname and department.
Roll_no Sname Dept

111 parimal COMPUTER

222 parimal ELECTRICAL


 This relation is decomposed into two relation no_name and name_dept:
Roll_no Sname

111 parimal

222 parimal

Sname Dept

Parimal COMPUTER

Parimal ELECTRICAL
 In lossy decomposition, spurious tuples are generated when a natural join is applied to the relations in the
decomposition.
stu_joined :
Roll_no Sname Dept

111 parimal COMPUTER

111 parimal ELECTRICAL

222 parimal COMPUTER

222 parimal ELECTRICAL


 The above decomposition is a bad decomposition or Lossy decomposition.
10
2. Lossless Join Decomposition: Decomposition is lossless if R1 ⋈ R2 = R
 This is also referred as non-additive decomposition.
 The lossless-join decomposition is always defined with respect to a specific set F of dependencies
 To check for lossless join decomposition using FD set, following conditions must hold:
1. Union of Attributes of R1 and R2 must be equal to attribute of R. Each attribute of R must be either in R1 or in
R2.
Att(R1) U Att(R2) = Att(R).
2. Intersection of Attributes of R1 and R2 must not be NULL.
Att(R1) ∩ Att(R2) ≠ Φ.
3. Common attribute must be a key for at least one relation (R1 or R2)
Att(R1) ∩ Att(R2) -> Att(R1) or Att(R1) ∩ Att(R2) -> Att(R2).

 Example 1:, A relation R (A, B, C, D) with FD set{A->BC} is decomposed into R1(ABC) and R2(AD) which is a lossless join
decomposition as:
1. First condition holds true as Att(R1) U Att(R2) = (ABC) U (AD) = (ABCD) = Att(R).
2. Second condition holds true as Att(R1) ∩ Att(R2) = (ABC) ∩ (AD) ≠ Φ
3. Third condition holds true as Att(R1) ∩ Att(R2) = A is a key of R1(ABC) because A->BC is given.

 Example 2: Consider that we have table STUDENT with three attribute roll_no, sname and department.
STUDENT:
Roll_no Sname Dept

111 parimal COMPUTER

222 parimal ELECTRICAL


This relation is decomposed into two relation Stu_name and Stu_dept:
Roll_no Sname

111 parimal

222 parimal

Roll_no Dept

111 COMPUTER

222 ELECTRICAL

Now, when these two relations are joined on the comman column 'roll_no' ,the resultant relation will look like
stu_joined.
Roll_no Sname Dept

111 parimal COMPUTER

222 parimal ELECTRICAL


In lossless decomposition, no any spurious tuples are generated when a natural joined is applied to the relations in
the decomposition.

11
 Dependency Preserving Decomposition:
 If we decompose a relation R into relations R1 and R2, All dependencies of R either must be a part of R1 or R2 or must
be derivable from combination of FD’s of R1 and R2.
 The dependency preservation decomposition is another property of decomposed relational database schema D in
which each functional dependency X -> Y specified in F either appeared directly in one of the relation schemas Ri in
the decomposed D or could be inferred from the dependencies that appear in some Ri.
 Decomposition D = {R1, R2, R3,,.., ,Rm} of R is said to be dependency-preserving with respect to F if the union of the
projections of F on each Ri , in D is equivalent to F. In other words, R ⊂ join of R1, R1 over X. The dependencies are
preserved because each dependency in F represents a constraint on the database. If decomposition is not
dependency-preserving, some dependency is lost in the decomposition.
 Example 1: A relation R (A, B, C, D) with FD set {A->BC} is decomposed into R1(ABC) and R2(AD) which is dependency
preserving because FD A->BC is a part of R1(ABC).
 Example 2: Let a relation R(A,B,C,D) and set a FDs F = { A -> B , A -> C , C -> D} are given.
A relation R is decomposed into –
R1 = (A, B, C) with FDs F1 = {A -> B, A -> C}, and
R2 = (C, D) with FDs F2 = {C -> D}.
F' = F 𝖴 F = {A -> B, A -> C, C -> D}
1 2

So, F' = F.
And so, F'+ = F+.
Thus, the decomposition is dependency preserving decomposition.

 Example: GATE Question: Consider a schema R (A, B, C, D) and functional dependencies A->B and C->D. Then the
decomposition of R into R1 (AB) and R2(CD) is
A. dependency preserving and lossless joins
B. lossless join but not dependency preserving
C. dependency preserving but not lossless join
D. not dependency preserving and not lossless join
Answer:
 For lossless join decomposition, these three conditions must hold true:
1. Att(R1) U Att(R2) = ABCD = Att(R)
2. Att(R1) ∩ Att(R2) = Φ, which violates the condition of lossless join decomposition. Hence the
decomposition is not lossless.
 For dependency preserving decomposition, A->B can be ensured in R1 (AB) and C->D can be ensured in R2 (CD).
 Hence it is dependency preserving decomposition. So, the correct option is C.

 Inference Rules on Functional Dependencies: The term Armstrong axioms refers to the sound and complete set of
inference rules or axioms, introduced by William W. Armstrong, that is used to test logical implication of functional
dependencies. If F is a set of functional dependencies then the closure of F, denoted as F+, is the set of all functional
dependencies logically implied by F. Armstrong’s Axioms are a set of rules, that when applied repeatedly, generates a
closure of functional dependencies.

12
 Advantages of Normalization:
1. Greater overall database organization will be gained.
2. The amount of unnecessary redundant data reduced.
3. Data integrity is easily maintained within the database.
4. The database & application design processes are much for flexible.
5. Security is easier to maintain or manage.
 Disadvantages of Normalization:
1. The disadvantage of normalization is that it produces a lot of tables with a relatively small number of columns.
These columns then have to be joined using their primary/foreign key relationship.
2. This has two disadvantages.
oPerformance: all the joins required to merge data slow processing & place additional stress on your
hardware.
oComplex queries: developers have to code complex queries in order to merge data from different tables.

 UNIT -III : Questions


1. a) What is Functional Dependency? Explain types and properties of FD’s.
b) What is a normal form? Explain about various normal forms with examples.

2. a) What is normalization? Differentiate between first normal form and second normal form.
b) Explain briefly about 3NF, 4NF with suitable examples?

3. a) Explain about Boyce Codd normal form with an example.


b). Why normalization is needed? Explain the process of normalization.

4. a) How to compute closure of set of functional dependency? Explain with a suitable example schema.
b). what is multi valued dependency? State and explain fourth normal form based on this concept.

5. a) List and explain the inference rules of functional dependencies.


b) Explain insertion, deletion, and modification anomalies.
c) What is the importance of dependency preservation during decomposition? How to achieve it?

13

You might also like