KEMBAR78
DBMS Module 3 | PDF | Databases | Relational Model
0% found this document useful (0 votes)
31 views24 pages

DBMS Module 3

The document discusses database anomalies that can occur due to poor data management. It describes insertion, deletion, and update anomalies and how normalization addresses these issues by reducing data redundancy. The document also covers functional dependencies, normal forms including 1NF and 2NF, and provides examples to illustrate database concepts.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
31 views24 pages

DBMS Module 3

The document discusses database anomalies that can occur due to poor data management. It describes insertion, deletion, and update anomalies and how normalization addresses these issues by reducing data redundancy. The document also covers functional dependencies, normal forms including 1NF and 2NF, and provides examples to illustrate database concepts.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 24

Anomalies in Relational Model

Anomalies in the relational model refer to inconsistencies or


errors that can arise when working with relational databases,
specifically in the context of data insertion, deletion, and
modification. There are different types of anomalies that can
occur in referencing and referenced relations which can be
discussed as:
These anomalies can be categorized into three types:
 Insertion Anomalies
 Deletion Anomalies
 Update Anomalies.

Database anomalies are the faults in the database caused due


to poor management of storing everything in the flat database.
It can be removed with the process of Normalization, which
generally splits the database which results in reducing the
anomalies in the database.

STUD_STAT STUD-
STUD_NO STUD_NAME STUD_PHONE E COUNTRY STUD_AGE

97162717 Haryan
1 RAM India 20
21 a

98982912
2 RAM Punjab India 19
81

78982919 Rajasth
3 SUJIT India 18
81 an

4 SURESH Punjab India 21


Table 1
STUDENT_COURSE
STUD_NO COURSE_NO COURSE_NAME

1 C1 DBMS

Computer
2 C2
Networks

Computer
1 C2
Networks
Table 2

Insertion Anomaly: If a tuple is inserted in referencing relation


and referencing attribute value is not present in referenced
attribute, it will not allow insertion in referencing relation.

Example: If we try to insert a record in STUDENT_COURSE with


STUD_NO =7, it will not allow it.

Deletion and Updation Anomaly: If a tuple is deleted or


updated from referenced relation and the referenced attribute
value is used by referencing attribute in referencing relation, it
will not allow deleting the tuple from referenced relation.

Example: If we want to update a record from


STUDENT_COURSE with STUD_NO =1, We have to update it in
both rows of the table. If we try to delete a record from
STUDENT with STUD_NO =1, it will not allow it.

Normalization
Normalization is the process of minimizing redundancy from
a relation or set of relations. Redundancy in relation may
cause insertion, deletion, and update anomalies. So, it helps to
minimize the redundancy in relations. Normal forms are used
to eliminate or reduce redundancy in database tables.

Normalization is the process of organizing the data and the


attributes of a database. It is performed to reduce the data
redundancy in a database and to ensure that data is stored
logically. Data redundancy in DBMS means having the same
data but at multiple places. It is necessary to remove data
redundancy because it causes anomalies in a database which
makes it very hard for a database administrator to maintain it.

Advantages

 Reduced data redundancy: Normalization helps to eliminate


duplicate data in tables, reducing the amount of storage
space needed and improving database efficiency.
 Improved data consistency: Normalization ensures that
data is stored in a consistent and organized manner,
reducing the risk of data inconsistencies and errors.
 Simplified database design: Normalization provides
guidelines for organizing tables and data relationships,
making it easier to design and maintain a database.
 Improved query performance: Normalized tables are
typically easier to search and retrieve data from, resulting in
faster query performance.
 Easier database maintenance: Normalization reduces the
complexity of a database by breaking it down into smaller,
more manageable tables, making it easier to add, modify,
and delete data.
Functional Dependency
Functional Dependency is the relationship between
attributes(characteristics) of a table related to each other. The
functional dependency of A on B is represented by A → B,
where A and B are the attributes of the relation.

Example:

roll_no name dept_namedept_building

42 abc CO A4

43 pqr IT A3

44 xyz CO A4

45 xyz IT A3

46 mno EC B2

47 jkl ME B2

From the above table we can conclude some valid functional


dependencies:
 roll_no → { name, dept_name, dept_building },→ Here,
roll_no can determine values of fields name, dept_name and
dept_building, hence a valid Functional dependency
 roll_no → dept_name , Since, roll_no can determine whole
set of {name, dept_name, dept_building}, it can determine its
subset dept_name also.
 dept_name → dept_building , Dept_name can identify the
dept_building accurately, since departments with different
dept_name will also have a different dept_building
 More valid functional dependencies: roll_no → name,
{roll_no, name} ⇢ {dept_name, dept_building}, etc.

Types of Functional dependency

1. Trivial functional dependency


o A → B has trivial functional dependency if B is a subset of A.
o The following dependencies are also trivial like: A → A, B → B

Example:

1. Consider a table with two columns Employee_Id and Employee_Name.


2. {Employee_id, Employee_Name} → Employee_Id is a trivial functional
dependency as
3. Employee_Id is a subset of {Employee_Id, Employee_Name}.
4. Also, Employee_Id → Employee_Id and Employee_Name → Employee
_Name are trivial dependencies too.

2. Non-trivial functional dependency


o A → B has a non-trivial functional dependency if B is not a subset
of A.
o When A intersection B is NULL, then A → B is called as complete
non-trivial.

Example:

1. ID → Name,
2. Name → DOB
3. Multivalued Functional Dependency in DBMS

 In Multivalued functional dependency, attributes in


the dependent set are not dependent on each other.
 For example, X → { Y, Z }, if there exists is no
functional dependency between Y and Z, then it is
called as Multivalued functional dependency.
 For example, consider the Employee table below.

Employee_Id Name Age

1 Zayn 24

2 Phobe 34

3 Hikki 26

4 David 29

4 Phobe 24

 Here, { Employee_Id } → { Name, Age } is a


Multivalued functional dependency, since the
dependent attributes Name, Age are
not functionally dependent(i.e. Name → Age or Age
→ Name doesn’t exist !).
4. Transitive Functional Dependency in DBMS

 Consider two functional dependencies A → B and B


→ C then according to the transitivity axiom A →
C must also exist. This is called a transitive
functional dependency.
 In other words, dependent is indirectly dependent
on determinant in Transitive functional dependency.
 For example, consider the Employee table below.

Employee_Id Name Department Street Number

1 Zayn CD 11

2 Phobe AB 24

3 Hikki CD 11

4 David PQ 71

5 Phobe LM 21

 Here, { Employee_Id → Department


} and { Department → Street Number } holds
true. Hence, according to the axiom of
transitivity, { Employee_Id → Street Number } is a
valid functional dependency.

Types of Normal Forms:


Normalization works through a series of stages called Normal
forms. The normal forms apply to individual relations. The
relation is said to be in particular normal form if it satisfies
constraints.

First Normal Form (1NF)

A relation is in 1NF if every attribute is a single-valued attribute


or it does not contain any multi-valued or composite attribute,
i.e., every attribute is an atomic attribute. If there is a composite
or multi-valued attribute, it violates the 1NF. To solve this, we
can create a new row for each of the values of the multi-valued
attribute to convert the table into the 1NF.

Let’s take an example of a relational table <EmployeeDetail>


that contains the details of the employees of the company.

<EmployeeDetail>

Employee Code Employee Name Employee Phone Number

101 John 98765623,998234123

101 John 89023467

102 Ryan 76213908

103 Stephanie 98132452

Here, the Employee Phone Number is a multi-valued attribute.


So, this relation is not in 1NF.

To convert this table into 1NF, we make new rows with each
Employee Phone Number as a new row as shown below:

<EmployeeDetail>

Employee Code Employee Name Employee Phone Number

101 John 998234123

101 John 98765623

101 John 89023467

102 Ryan 76213908

103 Stephanie 98132452

Second Normal Form (2NF)

The normalization of 1NF relations to 2NF involves the


elimination of partial dependencies. A partial dependency in
DBMS exists when any non-prime attributes, i.e., an attribute
not a part of the candidate key, is not fully functionally
dependent on one of the candidate keys.

For a relational table to be in second normal form, it must


satisfy the following rules:
1. The table must be in first normal form.
2. It must not contain any partial dependency, i.e., all non-prime
attributes are fully functionally dependent on the primary key.

If a partial dependency exists, we can divide the table to


remove the partially dependent attributes and move them to
some other table where they fit in well.

Let us take an example of the following


<EmployeeProjectDetail> table to understand what is partial
dependency and how to normalize the table to the second
normal form:

<EmployeeProjectDetail>

Employee Code Project ID Employee Name Project Name

101 P03 John Project103

101 P01 John Project101

102 P04 Ryan Project104

103 P02 Stephanie Project102

In the above table, the prime attributes of the table are


Employee Code and Project ID. We have partial dependencies in
this table because Employee Name can be determined by
Employee Code and Project Name can be determined by
Project ID. Thus, the above relational table violates the rule of
2NF.
The prime attributes in DBMS are those which are part of one or
more candidate keys.

To remove partial dependencies from this table and normalize


it into second normal form, we can decompose the
<EmployeeProjectDetail> table into the following three tables:

<EmployeeDetail>

Employee Code Employee Name

101 John

101 John

102 Ryan

103 Stephanie

<EmployeeProject>

Employee Code Project ID

101 P03

101 P01

102 P04

103 P02

<ProjectDetail>

Project ID Project Name

P03 Project103

P01 Project101
Project ID Project Name

P04 Project104

P02 Project102

Thus, we’ve converted the <EmployeeProjectDetail> table into


2NF by decomposing it into <EmployeeDetail>, <ProjectDetail>
and <EmployeeProject> tables. As you can see, the above
tables satisfy the following two rules of 2NF as they are in 1NF
and every non-prime attribute is fully dependent on the primary
key.

Third Normal Form (3NF)


The normalization of 2NF relations to 3NF involves the
elimination of transitive dependencies in DBMS.

For a relational table to be in third normal form, it must satisfy


the following rules:

1. The table must be in the second normal form.


2. No non-prime attribute is transitively dependent on the
primary key.
3. For each functional dependency X -> Z at least one of the
following conditions hold:

 X is a super key of the table.


 Z is a prime attribute of the table.

If a transitive dependency exists, we can divide the table to


remove the transitively dependent attributes and place them to
a new table along with a copy of the determinant.

Let us take an example of the following <EmployeeDetail> table


to understand what is transitive dependency and how to
normalize the table to the third normal form:
<EmployeeDetail>

Employee Employee Employee Employee


Code Name Zipcode City
101 John 110033 Model Town
101 John 110044 Badarpur
102 Ryan 110028 Naraina
103 Stephanie 110064 Hari Nagar
The above table is not in 3NF because it has Employee Code ->
Employee City transitive dependency because:

 Employee Code -> Employee Zipcode


 Employee Zipcode -> Employee City

Also, Employee Zipcode is not a super key and Employee City is


not a prime attribute.

To remove transitive dependency from this table and normalize


it into the third normal form, we can decompose the
<EmployeeDetail> table into the following two tables:

<EmployeeDetail>

Employee Code Employee Name Employee Zipcode


101 John 110033
101 John 110044
102 Ryan 110028
103 Stephanie 110064
<EmployeeLocation>

Employee Zipcode Employee City


110033 Model Town
110044 Badarpur
110028 Naraina
110064 Hari Nagar
Thus, we’ve converted the <EmployeeDetail> table into 3NF by
decomposing it into <EmployeeDetail> and
<EmployeeLocation> tables as they are in 2NF and they don’t
have any transitive dependency.

Boyce-Codd Normal Form (BCNF)

Boyce-Codd Normal Form(BCNF) is an advanced version of


3NF as it contains additional constraints compared to 3NF.

For a relational table to be in Boyce-Codd normal form, it must


satisfy the following rules:
1. The table must be in the third normal form.
2. For every non-trivial functional dependency X -> Y, X is the
superkey of the table. That means X cannot be a non-prime
attribute if Y is a prime attribute.

A superkey is a set of one or more attributes that can uniquely


identify a row in a database table.

Let us take an example of the following


<EmployeeProjectLead> table to understand how to normalize
the table to the BCNF:

<EmployeeProjectLead>

Employee Code Project ID Project Leader

101 P03 Grey

101 P01 Christian

102 P04 Hudson

103 P02 Petro

The above table satisfies all the normal forms till 3NF, but it
violates the rules of BCNF because the candidate key of the
above table is {Employee Code, Project ID}. For the non-trivial
functional dependency, Project Leader -> Project ID, Project ID
is a prime attribute but Project Leader is a non-prime attribute.
This is not allowed in BCNF.

To convert the given table into BCNF, we decompose it into


three tables:

<EmployeeProject>

Employee Code Project ID

101 P03

101 P01

102 P04

103 P02

<ProjectLead>

Project Leader Project ID

Grey P03

Christian P01

Hudson P04

Petro P02

Thus, we’ve converted the <EmployeeProjectLead> table into


BCNF by decomposing it into <EmployeeProject> and
<ProjectLead> tables.

Decomposition In DBMS

Decomposition refers to the division of tables into multiple


tables to produce consistency in the data. When we divide a
table into multiple tables or divide a relation into multiple
relations, then this process is termed Decomposition in DBMS.
We perform decomposition in DBMS when we want to process
a particular data set. It is performed in a database
management system when we need to ensure consistency and
remove anomalies and duplicate data present in the database.
When we perform decomposition in DBMS, we must try to
ensure that no information or data is lost.

Types of Decomposition
There are two types of Decomposition:
 Lossless Decomposition
 Lossy Decomposition

Lossless Decomposition
The process in which where we can regain the original relation
R with the help of joins from the multiple relations formed
after decomposition. This process is termed as lossless
decomposition. It is used to remove the redundant data from
the database while retaining the useful information. The
lossless decomposition tries to ensure following things:
 While regaining the original relation, no information should
be lost.
 If we perform join operation on the sub-divided relations, we
must get the original relation.
Example:
There is a relation called R(A, B, C)
A B C

5 1 2
5 6 7
A B C

4 5 8
8 2 9
Now we decompose this relation into two sub relations R1 and
R2
R1(A, B)
A B

5 1
5 6

4 5
8 2
R2(B, C)
B C

1 2
6 7

5 8
2 9
After performing the Join operation we get the same original
relation
A B C

5 1 2
5 6 7

4 5 8
8 2 9
Lossy Decomposition
As the name suggests, lossy decomposition means when we
perform join operation on the sub-relations it doesn’t result to
the same relation which was decomposed. After the join
operation, we always found some extraneous tuples. These
extra tuples genrates difficulty for the user to identify the
original tuples.
Example:
We have a relation R(A, B, C)
A B C

1 2 1

2 5 3

3 3 3
Now , we decompose it into sub-relations R1 and R2
R1(A, B)
A B

1 2

2 5

3 3
R2(B, C)
B C

2 1

5 3
B C

3 3
Now After performing join operation
A B C

1 2 1

2 5 3

2 3 3

3 5 3

3 3 3
Properties of Decomposition
 Lossless: All the decomposition that we perform in
Database management system should be lossless. All the
information should not be lost while performing the join on
the sub-relation to get back the original relation. It helps to
remove the redundant data from the database.
 Dependency Preservation: Dependency Preservation is an
important technique in database management system. It
ensures that the functional dependencies between the
entities is maintained while performing decomposition. It
helps to improve the database efficiency, maintain
consistency and integrity.
 Lack of Data Redundancy: Data Redundancy is generally
termed as duplicate data or repeated data. This property
states that the decomposition performed should not suffer
redundant data. It will help us to get rid of unwanted data
and focus only on the useful data or information.
Multivalued Dependency (MVD) in DBMS

Multivalued dependency (MVD) occurs when two attributes (B


and C) in a table independently depend on a third attribute (A).
This means A's value determines multiple values for both B and
C. It's a complex relationship that can impact database
normalization and integrity.

To illustrate: A person (A) can have multiple projects (B) and


hobbies (C). In this case, Project and Hobby are multivalued
attributes. MVD is written as A -->> B, meaning A determines
multiple values of B.

Let's have an example to understand multi-valued dependency:


The below table shows the details of an office department
exchange event having the
columns, EMPLOYEE_ID, DEPARTMENT, and HOBBY.

EMPLOYEE_ID DEPARTMENT HOBBY


E901 HR Badminton
E901 Sales Reading
E902 Marketing Cricket
E903 Finance Football
As you can see in the above table, Employee E901 is interested
in two departments HR and Sales, and, has two hobbies
Badminton and Reading. This will result in multiple records
for E901 as,

EMPLOYEE_ID DEPARTMENT HOBBY


E901 HR Badminton
E901 Sales Reading
EMPLOYEE_ID DEPARTMENT HOBBY
E901 HR Reading
E901 Sales Badminton
In the above table, you can see that for the Employee E901
multiple records exist in the DEPARTMENT and the HOBBY
attribute. Hence the multi-valued dependencies are,

EMPLOYEE_ID −>−> DEPARTMENT and


EMPLOYEE_ID −>−> HOBBY
Also, the DEPARTMENT and HOBBY attributes are independent
of each other thus leading to a multi-valued dependency in the
above table.

Fourth Normal Form (4NF)


Rules for 4NF

A table is said to be in 4NF if the following conditions are met,


 The table is in Boyce-Codd Normal Form (BCNF).
 The table does not have multi-valued dependency.

Let's see how we can satisfy the conditions of 4NF in DBMS by


looking at the example we used in Multi-Valued Dependency

EMPLOYEE_ID DEPARTMENT HOBBY


E901 HR Badminton
E901 Sales Reading
E902 Marketing Cricket
E903 Finance Football
The multi-valued dependencies in the above table are,

EMPLOYEE_ID −>−> DEPARTMENT and


EMPLOYEE_ID −>−> HOBBY
To satisfy the fourth normal form, we can decompose the
relation into two tables,

EMPLOYEE_ID DEPARTMENT
E901 HR
E901 Sales
E902 Marketing
E903 Finance
and,

EMPLOYEE_ID HOBBY
E901 Badminton
E901 Reading
E902 Cricket
E903 Football
Now the above relation is in 4NF.

Join Dependency
Join Dependency is similar to Multi-Valued Dependency as Join
Dependency is also a constraint.

A Join Dependency on a relation schema R, specifies a


constraint on states, r of R that every legal state r of R should
have a lossless join decomposition into 𝑅 1R1, 𝑅 2R2,..., 𝑅
𝑛 Rn. In a database management system, join dependency is a
generalization of the idea of multivalued dependency.

Let R be a relation schema and the decompositions of R are 𝑅


1,𝑅 2,𝑅 3,...,𝑅 𝑛 R1,R2,R3,...,Rn. R is said to be in Join
Dependency if and only if every instance of R, r is equal to the
join of its projections on 𝑅 1,𝑅 2,𝑅 3,...,𝑅 𝑛 R1,R2,R3,...,Rn.
Example of Join Dependency

Suppose we have the following table R:

E_Name Company Product

Rohan Comp1 Jeans

Harpreet Comp2 Jacket

Anant Comp3 TShirt

 We can break, or decompose the above table into three tables, this
would mean that the table is not in 5NF!
 The three decomposed tables would be:

1. R1: The table with columns E_Name and Company.

E_Name Company

Rohan Comp1

Harpreet Comp2

Anant Comp3

2. R2: The table with columns E_Name and Product.

E_Name Product

Rohan Jeans

Harpreet Jacket

Anant TShirt

3. R3: The table with columns Company and Product.

Company Product

Comp1 Jeans
Company Product

Comp2 Jacket

Comp3 TShirt

Let's try to figure out whether or not R has a join dependency.

Step 1- First, the natural join of R1 and R2:

E_Name Company Product

Rohan Comp1 Jeans

Harpreet Comp2 Jacket

Anant Comp3 TShirt

Step 2- Next, let's perform the natural join of the above table
with R3:

E_Name Company Product

Rohan Comp1 Jeans

Harpreet Comp2 Jacket

Anant Comp3 TShirt

In the above example, we do get the same table R after


performing the natural joins at both steps, luckily.

Therefore, our join dependency comes out to be: {(E_Name,


Company ), (E_Name, Product), (Company, Product)}

Because the above-mentioned relations are joined dependent,


they are not 5NF. That is, a join relation of the three relations
above is equal to our initial relation table R.
Join Dependencies and Fifth Normal Form (5NF)
 If a relation is in 4NF and does not contain any join dependencies,
it is in 5NF.
 To avoid redundancy, 5NF is satisfied when all tables are divided
into as many tables as possible.

Conclusion: if a relation has join dependency, it won't be in 5NF.

You might also like