KEMBAR78
DBMS Unit 3 Notes by MultiAtomsPlus | PDF | Data | Databases
0% found this document useful (0 votes)
562 views26 pages

DBMS Unit 3 Notes by MultiAtomsPlus

Uploaded by

deepansh432109
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
562 views26 pages

DBMS Unit 3 Notes by MultiAtomsPlus

Uploaded by

deepansh432109
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 26

Aktu Database Management System BCS-501

Unit-3
Syllabus

Topics Covered

Introduction to Database Design & Normalization


What is a Functional Dependency? - (Aktu 23-24)
Armstrong’s Axioms - (Aktu 21-22)
Types of Functional Dependencies
Canonical Cover in FDs -(Aktu 23-24)
Normalization and Normal Forms
Step-by-Step Explanation of Normal Forms (1st,2nd,3rd & BCNF)
Prime & Non Prime Attributes
How to Find Candidate Key
Inclusion Dependency
Lossless Join Decomposition
Multivalued Dependencies (MVDs)
Join Dependency (JD)
4NF & 5NF
Alternate Approaches to Database Design

Introduction to Database Design & Normalization

What is Database Design?


It is the process of organizing data into structured formats that ensure consistency,
efficiency, and accuracy. It involves defining:
Schema: The logical structure of the database, including tables, columns, and
relationships.
Constraints: Rules to maintain data integrity.
g y
Relationships: How data in one table is related to data in another (e.g., primary and
foreign keys).

What is Normalization?
It is a systematic process in database design to organize data into multiple tables to:

Reduce redundancy: Avoid duplicate data storage.


Ensure data integrity: Prevent anomalies like inconsistent updates or deletions.

It involves dividing large tables into smaller, well-structured tables while maintaining
relationships between them. This process adheres to a set of rules, called normal forms,
to achieve an efficient and error-free database design.

Importance of Normalization

Eliminates Redundancy
Improves Data Integrity
Facilitates Scalability

Normalized databases are easier to manage and scale as the amount of data grows.

Example

Unnormalized Table:

Problems:

Redundancy: "Alice" and her details are repeated.


Update Anomaly: Changing "Dr. Smith's" phone number requires multiple
updates.
Delete Anomaly: Deleting "Math" for Alice might remove her instructor
details.

Normalized Table: Table 1 (Students):


Table 2 (Courses):

Table 3 (Instructors):

Table 4 (Enrollments):

What is a Functional Dependency? AKTU- 2023-24

A Functional Dependency (FD) is a rule that defines how one column in a table
determines the value of another column.

Denoted as X → Y, where:
X (Determinant): The column(s) whose values decide another column.
Y (Dependent): The column whose value depends on X.

StudentID → Name: If you know the StudentID, you can find the Name.
Course → Name: This is not true, as the same course could be taken by multiple students.

Why is Functional Dependency Important?


Organize Data: By identifying relationships between columns, we can design better
databases.
Remove Redundancy: Prevents storing unnecessary repeated data.
Avoid Errors: Reduces issues like incorrect or inconsistent data.
AKTU- 2021-22

Armstrong’s Axioms / Inference Rules for Functional Dependencies

Armstrong's axioms are basic rules to derive all possible functional dependencies
(FDs) from a given set of FDs. There are three main axioms:

1. Reflexivity
Rule: If a set of attributes (Y) is a subset of another set of attributes (X), then X → Y
holds true.
Meaning: Any attribute or group of attributes determines itself or its subset.

{Roll_No, Name} → Name: Since Name is a part (subset) of {Roll_No, Name}, this FD is
valid.
Roll_No → Roll_No: Any column always determines itself.

2. Augmentation

Rule: If X → Y is a valid FD, then X + Z → Y + Z is also valid.


Meaning: Adding the same attribute(s) (Z) to both sides of a valid FD doesn’t change the
dependency.

Roll_No → Name: If you know Roll_No, you can determine Name.


By augmentation, Roll_No + Dept → Name + Dept is also valid.
3. Transitivity

Rule: If X → Y and Y → Z are valid FDs, then X → Z is also valid.


Meaning: If one attribute depends on a second, and the second depends on a third, then the
first indirectly determines the third.

Roll_No → Dept: Knowing Roll_No gives the Dept.


Dept → Building: Knowing Dept gives the Building.
By transitivity, Roll_No → Building is valid.

Other Properties Derived from Axioms:

Union: If X → Y and X → Z, then X → YZ.


Decomposition: If X → YZ, then X → Y and X → Z.

Types of Functional Dependencies

1. Trivial Functional Dependency


A dependency is trivial if the dependent column is already part of the determinant.

{StudentID, Name} → Name: This is trivial because Name is already part of the
determinant {StudentID, Name}.
StudentID → StudentID: This is also trivial.

2. Non-Trivial Functional Dependency


A dependency is non-trivial if the dependent column is not part of the determinant.
StudentID → Name: Knowing StudentID gives us the Name, which is not part of StudentID.

3. Multivalued Functional Dependency


It occurs when one column determines multiple columns that are independent of each
other.

StudentID → {Hobby, Skill}, but Hobby and Skill do not depend on each other.

4. Transitive Functional Dependency


A dependency is transitive if one column depends on another through a third column.

StudentID → Class and Class → Teacher, so StudentID → Teacher is transitive.

5. Partial Functional Dependency


It occurs when a non-key attribute depends only on part of a composite key, not the
whole key.
If {CourseID, Semester} is the composite key, but only CourseID → Credits, it's a
partial dependency.

6. Fully Functional Dependency


A dependency is fully functional when a column depends on the entire composite key, not
just part of it.

{OrderID, ProductID} → Quantity depends on the full composite key, so it's fully
functional.

Canonical Cover in Functional Dependencies AKTU- 2023-24

It is the simplified version of a set of Functional Dependencies (FDs). It removes


unnecessary dependencies and makes the FD set smaller without changing its meaning.
This helps in tasks like normalization, decomposition, and checking dependency
preservation.

Steps to Calculate Canonical Cover


Let’s break it down step by step with a simple example:
Given: Functional Dependencies (FDs)
We have a relation 𝑅(𝐴, 𝐵, 𝐶, 𝐷) with the following FDs:
1. 𝐴 → 𝐵𝐶
2. 𝐵 → 𝐶

3. 𝐴 → 𝐵

4. 𝐴𝐵 → 𝐶
Step 1: Split RHS of FDs

If any FD has multiple attributes on the right-hand side, split it into multiple FDs
with one attribute on the right side.

𝐴 → 𝐵𝐶 becomes:
𝐴→𝐵
𝐴→𝐶

Now the updated FDs are:

1. 𝐴 → 𝐵
2. 𝐴 → 𝐶
3. 𝐵 → 𝐶
4. 𝐴 → 𝐵
1. 5. 𝐴𝐵 → 𝐶

Step 2: Remove Redundant FDs

Check if any FD is unnecessary (i.e., it can be derived from the others).

FD 𝐴 → 𝐵 is repeated, so remove the duplicate.

Updated FDs:

1. 𝐴 → 𝐵
2. 𝐴 → 𝐶
3. 𝐵 → 𝐶
4. 𝐴𝐵 → 𝐶

Step 3: Remove Redundant Attributes from LHS

Check if any attribute in the left-hand side of an FD is unnecessary.

Let’s analyze 𝐴𝐵 → 𝐶:

Test if 𝐴 → 𝐶 can already imply 𝐴𝐵 → 𝐶.


Yes, because 𝐴 → 𝐶 is already present.

So, 𝐴𝐵 → 𝐶 is redundant. Remove it.


Updated FDs:

1. 𝐴 → 𝐵
2. 𝐴 → 𝐶
3. 𝐵 → 𝐶

Step 4: Final Canonical Cover


The final set of simplified FDs is:
𝐴→𝐵
𝐴→𝐶
𝐵→𝐶

Normalization and Normal Forms

Normalization is the process of organizing data in a database to reduce redundancy


and improve data integrity. It involves dividing a database into smaller tables and
defining relationships between them to ensure consistency and eliminate anomalies
like update, delete, and insert anomalies.

Importance of Normalization:
Minimizes Redundancy: Reduces duplicate data.
Improves Data Integrity: Ensures that data is accurate and consistent.
Simplifies Maintenance: Makes it easier to update data.
Optimizes Queries: Improves database performance by reducing unnecessary
joins.
Step-by-Step Explanation of Normal Forms

1.First Normal Form (1NF):


A relation is in 1NF if:

Each column contains atomic (indivisible) values.


Each record is unique.

Example: Improper 1NF Table

Solution (Convert to 1NF):

Prime and Non-Prime Attributes

Prime Attributes
A prime attribute is an attribute that is part of at least one candidate key of a relation.
Candidate Key: A minimal set of attributes that can uniquely identify every tuple (row) in a
relation.

Non-Prime Attributes
A non-prime attribute is an attribute that is not part of any candidate key.

Example
Consider a relation 𝑅(𝐴, 𝐵, 𝐶, 𝐷) with the following Functional Dependencies (FDs):

1. 𝐴𝐵 →𝐶
2. 𝐶 → 𝐷

Step 1: Find the Candidate Key(s):

𝐴𝐵 is a candidate key because it can uniquely identify all attributes in the relation.

S 2 Id if P i d N P i A ib
Step 2: Identify Prime and Non-Prime Attributes:

Prime Attributes: 𝐴, 𝐵 (because they are part of the candidate key 𝐴𝐵).
Non-Prime Attributes: 𝐶, 𝐷 (because they are not part of any candidate key).

How to Find Candidate Key


Given Relation:
𝑅(𝐴, 𝐵, 𝐶, 𝐷)

Functional Dependencies:
1. 𝐴𝐵 → 𝐶
2. 𝐶 → 𝐷

Step 1: List All Attributes


𝑅 = {𝐴, 𝐵, 𝐶, 𝐷}

Step 2: Identify Closures


𝐴𝐵 + = {𝐴, 𝐵, 𝐶, 𝐷}: 𝐴𝐵 determines all attributes.
𝐴+ = {𝐴}: 𝐴 does not determine all attributes.
𝐵 + = {𝐵}: 𝐵 does not determine all attributes.
𝐶+ = {𝐶, 𝐷}: 𝐶 determines 𝐷, but not 𝐴 or 𝐵.

Step 3: Check Superkeys


𝐴𝐵 is a superkey because 𝐴𝐵 + contains all attributes.
Step 4: Identify Candidate Key
𝐴𝐵 is the candidate key because it is minimal and can determine all attributes.

Tips
1. Always start by calculating closures.
2. Look for minimal sets of attributes that can uniquely identify all attributes.
3. Eliminate redundancy to identify candidate keys.

2.Second Normal Form (2NF):


A table is in Second Normal Form (2NF) if:

It is in First Normal Form (1NF)


It has no partial dependencies, meaning no non-prime attribute depends on a part of a
composite key.

Steps to Achieve 2NF

1. Start with a 1NF table.


Ensure no multi-valued or repeating attributes exist.
m p g

2. Check for Partial Dependencies.


Identify if any non-prime attribute depends on only a portion of a composite key.

3. Remove Partial Dependencies.


Create new tables to separate these dependencies.

4. Reorganize the Table.


Ensure that every non-prime attribute depends on the whole candidate key.

Example
Table: Student_Subject

Candidate Key: {𝑅𝑜𝑙𝑙_𝑁𝑜, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡_𝐶𝑜𝑑𝑒}


Non-Prime Attributes: 𝑆𝑢𝑏𝑗𝑒𝑐𝑡_𝑁𝑎𝑚𝑒, 𝑆𝑡𝑢𝑑𝑒𝑛𝑡_𝑁𝑎𝑚𝑒

Step 1: Check for Partial Dependencies

Subject_Name depends only on Subject_Code (a part of the composite key).


𝑆𝑢𝑏𝑗𝑒𝑐𝑡_𝐶𝑜𝑑𝑒 → 𝑆𝑢𝑏𝑗𝑒𝑐𝑡_𝑁𝑎𝑚𝑒 → Partial Dependency
Student_Name depends only on Roll_No (a part of the composite key).
𝑅𝑜𝑙𝑙_𝑁𝑜 → 𝑆𝑡𝑢𝑑𝑒𝑛𝑡_𝑁𝑎𝑚𝑒 → Partial Dependency

Step 2: Remove Partial Dependencies

Decompose into two tables:

1. Student Table

3. Student_Subject Table (Relationship Table)


2. Subject Table
HomeWork for 2nd NF

3.Third Normal Form (3NF):


A table is in Third Normal Form (3NF) if:

It is in Second Normal Form (2NF).


There are no transitive dependencies, meaning no non-prime attribute depends on another
non-prime attribute.

Steps to Achieve 3NF

1.Start with a 2NF table.


Ensure there are no partial dependencies.

2.Check for Transitive Dependencies.


Identify if any non-prime attribute depends on another non-prime attribute.

3.Remove Transitive Dependencies.


Create new tables to separate these dependencies.

4.Reorganize the Table.


Ensure that every non-prime attribute is directly dependent on the key, not on another non-prime
attribute.

Example
Table: Employee

Candidate Key: 𝐸𝑚𝑝_𝐼𝐷


Non-Prime Attributes: 𝐷𝑒𝑝𝑡𝐼 𝐷, 𝐷𝑒𝑝𝑡𝑁 𝑎𝑚𝑒, 𝐷𝑒𝑝𝑡𝐿 𝑜𝑐𝑎𝑡𝑖𝑜𝑛
Step 1: Check for Transitive Dependencies

𝐸𝑚𝑝_𝐼𝐷 → 𝐷𝑒𝑝𝑡_𝐼𝐷
𝐷𝑒𝑝𝑡_𝐼𝐷 → 𝐷𝑒𝑝𝑡_𝑁𝑎𝑚𝑒, 𝐷𝑒𝑝𝑡_𝐿𝑜𝑐𝑎𝑡𝑖𝑜𝑛

Problem:

𝐷𝑒𝑝𝑡_𝑁𝑎𝑚𝑒 and 𝐷𝑒𝑝𝑡_𝐿𝑜𝑐𝑎𝑡𝑖𝑜𝑛 are dependent on 𝐷𝑒𝑝𝑡_𝐼𝐷, not directly on Emp_ID

Step 2: Remove Partial Dependencies

Decompose into two tables:

1. Employee Table

2. Department Table

4.Boyce-Codd Normal Form (BCNF):


A table is in BCNF if:

It is in Third Normal Form (3NF).


For every functional dependency (X → Y), X (determinant) must be a superkey.

Key Terms

Superkey: An attribute or a set of attributes that can uniquely identify a row.


Determinant: The left-hand side (X) of a functional dependency 𝑋 → 𝑌.
Steps to Achieve BCNF

1. Check if the Table is in 3NF.


Start from a 3NF table.

2. VerifyFunctional Dependencies.
For each dependency 𝑋 → 𝑌, check if 𝑋 is a superkey.

3. Decompose if Needed.
If 𝑋 is not a superkey, decompose the table into smaller tables until every determinant is a
superkey.

Example
Table: Student

Functional Dependencies:

1. 𝑅𝑜𝑙𝑙_𝑁𝑜 → 𝑆𝑢𝑏𝑗𝑒𝑐𝑡
2. 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑒𝑎𝑐ℎ𝑒𝑟

Candidate Key: 𝑅𝑜𝑙𝑙_𝑁𝑜


Problem:
𝑆𝑢𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑒𝑎𝑐ℎ𝑒𝑟: 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 is not a superkey.
This violates BCNF.

Step 1: Decompose the Table


Split into two tables:

1. Student_Subject Table
2. Subject_Teacher Table

Step 2: Verify BCNF

1. In Student_Subject, 𝑅𝑜𝑙𝑙_𝑁𝑜 → 𝑆𝑢𝑏𝑗𝑒𝑐𝑡, and 𝑅𝑜𝑙𝑙_𝑁𝑜 is a superkey.


2. In Subject_Teacher, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 → 𝑇𝑒𝑎𝑐ℎ𝑒𝑟, and 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 is a superkey.

AKTU- 2023-24

Q: Why do we normalize databases?

Normalization reduces redundancy, improves data integrity, avoids anomalies, and makes
databases easier to maintain.

AKTU- 2022-23

Q: List all prime and non-prime attributes

Relation R(A, B, C, D, E)
FDs: 𝐴𝐵 → 𝐶, 𝐵 → 𝐸, 𝐶 → 𝐷
Candidate Key: 𝐴𝐵.
Answer:
Prime Attributes: A, B.
Non-Prime Attributes: C, D, E.

Q: Find Key for R and Decompose into 2NF and 3NF


AKTU- 2022-23
Relation R: {A, B, C, D, E, F, G, H, I, J}
Functional Dependencies:
AB-> C
A-> DE
B->F
F->GH
D->IJ
Step 1: Find the Candidate Key
Given Functional Dependencies:

1. 𝐴𝐵 → 𝐶
2. 𝐴 → 𝐷𝐸

3. 𝐵 → 𝐹

4. 𝐹 → 𝐺𝐻

5. 𝐷 → 𝐼𝐽

Attributes: 𝐴, 𝐵, 𝐶, 𝐷, 𝐸, 𝐹, 𝐺, 𝐻, 𝐼, 𝐽

Step-by-Step Process:

𝐴𝐵 determines 𝐶.
𝐴 determines 𝐷, 𝐸 and 𝐵 determines 𝐹.
𝐹 determines 𝐺, 𝐻 and 𝐷 determines 𝐼, 𝐽.
By combining, 𝐴𝐵 can determine all other attributes (𝐴𝐵+ = {𝐴, 𝐵, 𝐶, 𝐷, 𝐸, 𝐹, 𝐺, 𝐻, 𝐼, 𝐽}.

Candidate Key: 𝐴𝐵 is the only candidate key.

Step 2: Decompose into 2NF


To achieve 2NF, remove partial dependencies (when a non-prime attribute depends only on a
part of the candidate key).

Partial Dependencies:

1. 𝐴 → 𝐷𝐸 (Depends on part of the key 𝐴).


2. 𝐵 → 𝐹 (Depends on part of the key 𝐵).

Decomposition into 2NF Relations:

1. Relation 𝑅1(𝐴, 𝐷, 𝐸): Includes 𝐴 → 𝐷𝐸.

Attributes: 𝐴, 𝐷, 𝐸.
Key: 𝐴.

2. Relation 𝑅2(𝐵, 𝐹): Includes 𝐵 → 𝐹.

Attributes: 𝐵, 𝐹.
Key: 𝐵.

3. Relation 𝑅3(𝐹, 𝐺, 𝐻): Includes 𝐹 → 𝐺𝐻.

Attributes: 𝐹, 𝐺, 𝐻.
Key: 𝐹
Key: 𝐹.

4. Relation 𝑅4(𝐷, 𝐼, 𝐽): Includes 𝐷 → 𝐼𝐽.

Attributes: 𝐷, 𝐼, 𝐽.
Key: 𝐷.

5. Relation 𝑅5(𝐴𝐵, 𝐶): Includes 𝐴𝐵 → 𝐶.

Attributes: 𝐴, 𝐵, 𝐶.
Key: 𝐴𝐵.

Step 3: Decompose into 3NF


3NF eliminates transitive dependencies, where a non-prime attribute depends on
another non-prime attribute.

Steps for 3NF Decomposition:

Check each relation from 2NF:


1. 𝑅1 (𝐴, 𝐷, 𝐸): No transitive dependency, already in 3NF.
2. 𝑅2 (𝐵, 𝐹): No transitive dependency, already in 3NF.
3. 𝑅3 (𝐹, 𝐺, 𝐻): No transitive dependency, already in 3NF.
4. 𝑅4 (𝐷, 𝐼, 𝐽): No transitive dependency, already in 3NF.
5. 𝑅5 (𝐴, 𝐵, 𝐶): No transitive dependency, already in 3NF.

Final Decomposed Relations


The 3NF decomposition results in the following relations:
1. 𝑅1(𝐴, 𝐷, 𝐸) with 𝐴 → 𝐷𝐸.
2. 𝑅2(𝐵, 𝐹) with 𝐵 → 𝐹.
3. 𝑅3(𝐹, 𝐺, 𝐻) with 𝐹 → 𝐺𝐻.
4. 𝑅4(𝐷, 𝐼, 𝐽) with 𝐷 → 𝐼𝐽.
5. 𝑅5(𝐴, 𝐵, 𝐶) with 𝐴𝐵 → 𝐶.

Subscribe Multi Atoms & Multi Atoms Plus


Subscribe Multi Atoms & Multi Atoms Plus

What is Inclusion Dependency?

An Inclusion Dependency is a constraint that specifies that the values in one set of
columns in a table (relation) must appear in another set of columns in another table
(or the same table). It expresses a subset relationship between attributes.

Student Table

Course Table

The course_id column in Students must match a value in the course_id column in Courses.
This is an Inclusion Dependency.

It ensures that every course a student takes (in the Students table) actually exists in the list of
courses (in the Courses table).
Constraint: 𝜋𝑐𝑜𝑢𝑟𝑠𝑒𝑖 𝑑 (𝑆𝑡𝑢𝑑𝑒𝑛𝑡) ⊆ 𝜋𝑐𝑜𝑢𝑟𝑠𝑒𝑖 𝑑 (𝐶𝑜𝑢𝑟𝑠𝑒)

Why Is It Important?

1. Keeps Data Consistent: It prevents errors, like assigning a student to a course that
doesn’t exist.
2. Maintains Relationships: It keeps the "connections" between tables intact.
3. Avoids Data Loss: If a course is removed, any student enrolled in that course can also
be removed safely, keeping the database clean.

Lossless Join Decomposition

When we split a large table (relation 𝑅) into smaller tables (like 𝑅1 and 𝑅2), lossless join
decomposition ensures that no data is lost when we combine (join) these smaller tables back
to reconstruct the original table.

Why is it Important?
If decomposition is not lossless, some data may be missing or duplicated when we try to
reconstruct the original table.
A lossless decomposition guarantees that the original data can always be obtained by a
natural join operation on the decomposed tables.

Conditions for Lossless Join Decomposition

To check if the decomposition of 𝑅 into 𝑅1 and 𝑅2 is lossless, the following conditions


must hold:

1.Union of Attributes:
The combined attributes of 𝑅1 and 𝑅2 must equal all the attributes of 𝑅.
Mathematically:
Attributes(𝑅1) ∪ Attributes(𝑅2) = Attributes(𝑅)

If this condition fails, the decomposition is not lossless.


2.Intersection of Attributes:
There must be at least one common attribute between 𝑅1 and 𝑅2.
Mathematically:
Attributes(𝑅1) ∩ Attributes(𝑅2) ≠ Φ

If 𝑅1 and 𝑅2 have no common attributes, they can’t be joined to reconstruct 𝑅.

3.Key Property of Common Attributes:


The common attribute(s) must form a key in at least one of the smaller tables (𝑅1 or 𝑅2).
Mathematically:
Attributes(𝑅1) ∩ Attributes(𝑅2) → Attributes(𝑅1) or Attributes(𝑅1) ∩ Attributes(𝑅2) → Attributes(𝑅2)

This ensures that the common attribute(s) can uniquely identify tuples in at least one of the
decomposed tables.

Aktu-2022-23

Q. Given the following set of FDs on schema R (V,W,X,Y,Z)


{Z→V, W→Y, XY→Z, V→WX}

State whether the following decomposition are


loss-less-join decompositions or not.
(i) R1=(V,W,X) , R2=(V,Y,Z)
(ii) R1=(V,W,X), R2=(X,Y,Z)

Decomposition (i): 𝑅1 = (𝑉, 𝑊, 𝑋), 𝑅2 = (𝑉, 𝑌, 𝑍)

1. Condition 1:
Union of attributes:
𝑅1 ∪ 𝑅2 = (𝑉, 𝑊, 𝑋) ∪ (𝑉, 𝑌, 𝑍) = (𝑉, 𝑊, 𝑋, 𝑌, 𝑍)

✅ Matches 𝑅.

2. Condition 2:
Intersection of attributes:
𝑅1 ∩ 𝑅2 = (𝑉, 𝑊, 𝑋) ∩ (𝑉, 𝑌, 𝑍) = (𝑉)

✅ Not empty.
3. Condition 3:
Common attribute 𝑉 is a key:

In 𝑅1: 𝑉 → 𝑊𝑋 ✅
✅ Lossless join!
Decomposition (ii): 𝑅1 = (𝑉, 𝑊, 𝑋), 𝑅2 = (𝑋, 𝑌, 𝑍)

1. Condition 1:
Union of attributes:

✅ Matches 𝑅.

2. Condition 2:
Intersection of attributes:

✅ Not empty.
3. Condition 3:
Common attribute 𝑋 is not a key in 𝑅1 or 𝑅2:
𝑋 alone doesn’t determine all attributes in 𝑅1 or 𝑅2.

❌ Not lossless join!

Final Answer

1. Decomposition (i): Lossless join ✅


2. Decomposition (ii): Not lossless join ❌

Multivalued Dependencies (MVDs)

A Multivalued Dependency (MVD) exists in a relation when one attribute determines


multiple independent values of another attribute. Unlike functional dependencies (FDs),
where one attribute determines exactly one value of another, MVD allows multiple values.
Notation:
If 𝐴 →→ 𝐵, this means 𝐴 determines all possible combinations of values for 𝐵,
independent of other attributes.
Example:
Consider a relation Student(StuID, Course, Hobby):
A student can have multiple courses.
A student can have multiple hobbies.
These two attributes 𝐶𝑜𝑢𝑟𝑠𝑒 and 𝐻𝑜𝑏𝑏𝑦 are independent of each other.
MVD: 𝑆𝑡𝑢𝐼𝐷 → → 𝐶𝑜𝑢𝑟𝑠𝑒 and 𝑆𝑡𝑢𝐼𝐷 → → 𝐻𝑜𝑏𝑏𝑦
Trivial vs Non-Trivial MVD:

Trivial MVD: If 𝑋 and 𝑌 overlap completely or their union equals the whole table.
Non-Trivial MVD: If 𝑋 and 𝑌 are separate and unrelated.
Example:
Trivial: 𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝑇𝑒𝑎𝑐ℎ𝑒𝑟, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡.
Non-Trivial: 𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 (when Subject doesn’t overlap with Teacher).

Join Dependency (JD):

What is JD?
JD specifies that a relation 𝑅R can be decomposed into smaller relations 𝑅1 , 𝑅2 , ..., 𝑅𝑛 ​such
that the original relation can be perfectly reconstructed (lossless join).
How is it denoted?
𝐽𝐷(𝑅1 , 𝑅2 , ..., 𝑅𝑛 ): 𝑅1 , 𝑅2 , ..., 𝑅𝑛 ​are subsets of 𝑅.
When is JD Trivial?
If any one of the relations 𝑅1 , 𝑅2 , ..., 𝑅𝑛 ​is equal to the entire relation 𝑅, the JD is trivial.

Example of JD:

Consider 𝑅(𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡, 𝐶𝑜𝑚𝑚𝑖𝑡𝑡𝑒𝑒).


Decompose 𝑅 into:
𝑅1 (𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡) and 𝑅2 (𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝐶𝑜𝑚𝑚𝑖𝑡𝑡𝑒𝑒).
Using a lossless join, 𝑅 can be reconstructed:
𝑅 = Π𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 (𝑅1 ) ⋈ Π𝐹𝑎𝑐𝑢𝑙𝑡𝑦, 𝐶𝑜𝑚𝑚𝑖𝑡𝑡𝑒𝑒 (𝑅2 ).

4NF (Fourth Normal Form)

A table is in 4NF if it is in BCNF (Boyce-Codd Normal Form) and does not have any non-trivial
Multivalued Dependencies (MVDs).
In simple terms, 4NF eliminates redundancy caused by independent multivalued facts.

Example:
Problem:
The Subject and Committee are independent of each other.
𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝑆𝑢𝑏𝑗𝑒𝑐𝑡 and 𝑇𝑒𝑎𝑐ℎ𝑒𝑟 ↠ 𝐶𝑜𝑚𝑚𝑖𝑡𝑡𝑒𝑒 are independent multivalued dependencies.

Decomposition into 4NF:


Teacher and Subject:

Teacher and Committee:

5NF (Fifth Normal Form)

A table is in 5NF if it is in 4NF and cannot be decomposed further without losing data.
5NF deals with Join Dependencies (JD).

Example:

Problem:
This table has a join dependency: The relationship between Student, Course, and Teacher can
be split into three smaller relationships.
Decomposition into 5NF:

Student and Course:

Course and Teacher:

Student and Teacher:

Alternate Approaches to Database Design

Database design typically follows the normalization approach, but there are alternative
methods that may be more appropriate depending on the application's requirements. These
approaches focus on optimizing database performance, reducing redundancy, and ensuring
data integrity, sometimes diverging from strict normalization principles.
1. Denormalization
What it is: Combines multiple tables into one to make data retrieval faster.
When to use: For read-heavy applications where performance is more important than
reducing redundancy.
Example: Instead of separate Customer and Orders tables, combine them into one table
with all details.
Advantage: Faster reads.
Disadvantage: Data redundancy increases.

2. Schema-less Design (NoSQL)


What it is: No fixed structure for the database; uses flexible formats like JSON.
When to use: For unstructured or frequently changing data, like in social media apps.
Example: Store a customer and their orders in a single document.
Advantage: Flexible and scales easily.
Disadvantage: Queries can become complex.

3. Agile Database Design


What it is: Start with a simple design and add features as needed over time.
When to use: In rapidly changing projects like startups.
Example: Begin with just a Users table and add Orders or Addresses tables later.
Advantage: Adapts to new requirements quickly.
Disadvantage: May need rework later.

4. Graph-Based Design
What it is: Stores data as nodes (items) and edges (relationships).
When to use: For highly interconnected data like social networks.
Example: A User node connected to Friends or Posts nodes.
Advantage: Great for relationship-heavy queries.
Disadvantage: Not ideal for standard tabular data.

You might also like