KEMBAR78
DBMS: Week 10 - Database Design and Normalization | PPTX
International Islamic University H-10, Islamabad, Pakistan
Database Managements Systems
Week 10
Database Design and
Normalization
Engr. Rashid Farid Chishti
http://youtube.com/rfchishti
http://sites.google.com/site/chisht
i
 Understand the importance of database design in relational databases.
 Learn about functional dependencies and their role in normalization.
 Understand the concepts of First Normal Form (1NF) and Second Normal Form
(2NF).
 Apply normalization techniques to reduce data redundancy and improve
database design.
Learning Objectives
 Definition: Database design is the process of defining the structure,
relationships, and integrity constraints for a database.
 Importance:
 Ensures efficient data storage and retrieval.
 Reduces redundancy and avoids data anomalies.
 Improves data integrity and consistency.
Introduction to Database Design
 Definition: Normalization is the process of organizing data in a database to
minimize redundancy and dependency by dividing large tables into smaller,
related tables.
 Goal: To design the database in such a way that it eliminates undesirable
characteristics like insertion, update, and deletion anomalies.
Normalization Levels:
 Several forms, each building upon the previous one:
 First Normal Form (1NF)
 Second Normal Form (2NF)
 Third Normal Form (3NF)
 Boyce-Codd Normal Form (BCNF)
What is Normalization?
 Definition: A relation (table) is in 1NF if:
 All attributes (columns) contain atomic (indivisible) values.
 Each column contains values of a single type.
 Each column has a unique name.
 The order of rows and columns does not matter.
 Objective:
 Eliminate repeating groups and multi-valued attributes
First Normal Form (1NF)
Table: Student (Now in 1NF)
ID Name Courses
1 Alice Math
1 Alice Science
2 Bob English
3 David Math
3 David English
Table: Student (Not in 1NF)
ID Name Courses
1 Alice Math, Science
2 Bob English
3 David Math, English
Table: Employee (Not in 1NF)
ID Name Phone
1 Alice 0301-1234567
0302-9876543
2 Bob 0303-5678901
3 David 0304-2468135
0305-1357924
Table: Employee (Now in 1NF)
ID Name Phone
1 Alice 0301-1234567
1 Alice 0302-9876543
2 Bob 0303-5678901
3 David 0304-2468135
3 David 0305-1357924
 Problems:
 The "Authors" column
has multiple values
(John, Alice).
 The “Category" column has
multiple values (AI, Machine Learning).
Example 2: First Normal Form (1NF)
BOOK_ID Book_Title Authors Book_Category
1 Database Systems John, Alice Technology
2 Python Basics Bob, Carol Programming
3 AI Foundations David, Eve AI, Machine Learning
Table: Book (Not in 1NF)
ID Book_Title Authors Book_Category
1 Database Systems John Technology
1 Database Systems Alice Technology
2 Python Basics Bob Programming
2 Python Basics Carol Programming
3 AI Foundations David AI
3 AI Foundations David Machine Learning
3 AI Foundations Eve AI
3 AI Foundations Eve Machine Learning
Table: Book (Now in 1NF)
 Definition:
A relation is in Second Normal Form (2NF) if:
 It is in 1NF.
 It has no partial dependencies (i.e., no non-key attribute
is dependent on only part of a composite primary key).
 Here non-key attributes Department and Instructor depends
only on Course, not on the full primary key (Student_ID, Course).
(Course → Instructor, Department)
Second Normal Form (2NF)
Table: Student_Course (Not in 2NF)
Student_ID Course Instructor Department
1 Math Dr. A Science
1 Science Dr. B Science
Table: Student_Course
Student_ID Course
1 Math
1 Science
Table: Course_Details
Course Instructor Department
Math Dr. A Science
Science Dr. B Science
 Here, Emp_ID is the primary key.
 Here, Deptment Location depends on Department, not on Emp_ID.
Example 2: Second Normal Form (2NF)
Table: Employee (Not in 2NF)
Emp_ID Name Department Dept_Location
1 John IT Lahore
2 Alice HR Karachi
3 Bob IT Lahore
Table: Department
Department Location
IT Lahore
HR Karachi
Table: Employee
ID Name Department
1 John IT
2 Alice HR
3 Bob IT
 Here, Book ID is the primary key.
 The non-prime (non key) attribute Author does not depend Book ID.
 Also Book Category also does not depend in Book ID.
 Only Book Title depends on Book ID.
Example 3: Second Normal Form (2NF)
ID Book_Title Authors Book_Category
1 Database Systems John Technology
1 Database Systems Alice Technology
2 Python Basics Bob Programming
2 Python Basics Carol Programming
3 AI Foundations David AI
3 AI Foundations David Machine Learning
3 AI Foundations Eve AI
3 AI Foundations Eve Machine Learning
Table: Book (in 1NF)
ID Book_Title
1 Database Systems
2 Python Basics
3 AI Foundations
3 AI Foundations
Table: Book (in 2NF)
Book_ID Authors
1 John
1 Alice
2 Bob
2 Carol
3 David
3 Eve
Table: Author
BOOK_ID Category
1 Technology
2 Programming
3 AI
3 Machine Learning
Table: Category
 Partial Dependencies:
 Course_Name depends on Course_ID
 Instructor_Name depends on Instructor_ID
 Dept depends on Instructor_ID
 Solution:
 Split into separate tables.
Example 4: Second Normal Form (2NF)
Table: Enrolment (Not in 2NF)
Student_ID Student_Name Course_ID Course_Name Instructor_ID Instructor_Name Dept
S1 Alice C101 DBMS I1 Prof. Smith CS
S2 Bob C102 AI I2 Prof. Lee AI
S3 Charlie C101 DBMS I1 Prof. Smith CS
S4 Dave C103 Networks I3 Prof. Green Networks
 Now it is in 2NF
Example 4: Second Normal Form (2NF)
Table: Student
ID Name
S1 Alice
S2 Bob
S3 Charlie
S4 Dave
Table: Course
ID Name
C101 DBMS
C102 AI
C101 DBMS
C103 Networks
Student_ID Course_ID Instructor_ID
S1 C101 I1
S2 C102 I2
S3 C101 I1
S4 C103 I3
ID Name Dept
I1 Prof. Smith CS
I2 Prof. Lee AI
I3 Prof. Green Networks
Table: Instructor
Table: Student_Course_Instructor
 Definition: A functional dependency (FD) is a relationship between two
attributes (or sets of attributes) in a table.
 Notation: A → B means that attribute A uniquely determines attribute B.
 Example:
 If student_id → student_name, then for each student_id, there is a unique
student_name.
 Importance: Understanding functional dependencies is crucial for
normalization, as they help identify relationships between data attributes.
Functional Dependencies
 Definition of 3NF: A table is in Third Normal Form (3NF) if:
 It is in Second Normal Form (2NF) (i.e., no partial dependencies).
 No transitive dependencies exist (i.e.,).
 Transitive Dependency: Occurs when a no non-key attribute depends on
another non-key attribute.
 Example:
 A → B (Where A is a primary key)
 B → C (Where B is Non key attribute)
 This means A indirectly determines C, which is a transitive dependency.
 Thus A → C is a transitive dependency.
 Objective: Remove transitive dependencies to ensure that non-key
attributes depend only on the key-attribute (primary key).
Third Normal Form (3NF)
 Primary Key (PK): Student_ID
 Department_Name depends on Department_ID, not directly on Student_ID.
 Transitive Dependency:
 Student_ID → Department_ID (Direct Dependency)
 Department_ID → Department_Name (Indirect Dependency)
 Therefore, Student_ID → Department_Name (Transitive Dependency)
 To remove transitive dependency, we create a separate Department Table
Example 1: Third Normal Form (3NF)
Student_ID (PK) Student_Name Department_ID Department_Name
101 Alice D01 Computer Science
102 Bob D02 Electrical Engineering
103 Charlie D01 Computer Science
 Now, there is no transitive dependency because Department_Name depends only on
Department_ID.
Example 1: Third Normal Form (3NF)
ID (PK) Name Deptment_ID (FK)
101 Alice D01
102 Bob D02
103 Charlie D01
ID (PK) Name
D01 Computer Science
D02 Electrical Engineering
Table: Student Table: Department
 Definition: A relation is in Boyce-Codd Normal Form (BCNF) if:
 It is already in 3NF.
 For every functional dependency (X → Y) in the relation, the X on left-hand
side (determinant) is a superkey (A set of attributes that uniquely identifies
each record in a table).
 Problem with 3NF: A relation can still have a non-superkey determinant.
 Therefore, to check for BCNF, we simply identify all the determinants and
verify that they are superkeys.
 If they are not, we break up the relation by projection until we have a set of
relations all in BCNF.
Boyce-Codd Normal Form (BCNF)
 Consider the following relation.
 Functional Dependencies:
 (Student_ID, Course_ID) → Instructor
 Since an instructor teaches a specific course
 Course_ID → Instructor
 Each course is taught by only one instructor.
 Here, Course_ID is not a superkey, but it determines Instructor, violating BCNF.
 Converting to BCNF: To achieve BCNF, we will decompose the relation into:
 Course Table (Ensuring that each course has only one instructor)
 Student-Course Table
 Mapping students to courses
Example 1: Boyce-Codd Normal Form (BCNF)
Student_ID Course_ID Instructor
1 DBMS Dr. A
2 OS Dr. B
3 DBMS Dr. A
4 OS Dr. B
Course Instructor
DBMS Dr. A
OS Dr. B
Student_ID Course_ID
1 DBMS
2 OS
3 DBMS
4 OS
Table: Course_Instructor
Table: Student_Course
 Functional Dependencies
 (Supplier_ID, Part_ID) → Supplier_City, Part_Category
 Each supplier supplies a specific part and has a fixed city.
 Supplier_ID → Supplier_City
 A supplier is based in only one city.
 Part_ID → Part_Category
 A part belongs to only one category.
 Issue
 Supplier_ID → Supplier_City
 violates BCNF because Supplier_ID is not a superkey.
 Part_ID → Part_Category
 violates BCNF because Part_ID is not a superkey.
Example 2: Boyce-Codd Normal Form (BCNF)
Supplier_ID Part_ID Supplier_City Part_Category
1 101 Karachi Electronics
2 102 Lahore Machanical
1 103 Karachi Electrical
Table: Supplier_Parts
 To achieve BCNF, we will decompose the relation into 3 tables:
 Now, all tables are in BCNF because all functional dependencies have superkeys on the left.
Example 2: Boyce-Codd Normal Form (BCNF)
Part_ID Part_Category
101 Electronics
102 Machanical
103 Electrical
Table: Parts
Supplier_ID Supplier_City
1 Karachi
2 Lahore
Table: Supplier
Supplier_ID Part_ID
1 101
2 102
1 103
Table: Supplier_Parts
 3NF: A relation is in 3NF if it has no transitive dependencies, but there could
still be functional dependencies where the determinant is not a superkey.
 BCNF: A relation is in BCNF if every determinant is a superkey.
 Key Difference: BCNF is a stricter form of 3NF. Every relation in BCNF is in 3NF,
but not every relation in 3NF is in BCNF.
Difference Between 3NF and BCNF
Recap of Normalization
Normalization Main Rule Problem Solved
1NF
No duplicate columns, atomic
values
Removes multi-valued attributes
2NF
1NF + No partial dependencies Eliminates dependency on part of a
composite key
3NF
2NF + No transitive dependencies Eliminates non-key dependencies
BCNF
3NF + Every determinant is a
superkey
Stronger than 3NF, removes
remaining anomalies
 Benefits of Normalization:
 Reduces Data Redundancy: Eliminates duplicate data by dividing data into
smaller tables.
 Improves Data Integrity: Ensures consistency by minimizing the chances of
anomalies.
 Easier Updates: Easier to update data without causing inconsistencies.
 Drawbacks:
 More complex queries (due to more joins).
 Potential performance impact in some cases (especially with excessive
normalization).
Why Normalize ?
 Denormalization is the process of intentionally introducing redundancy into a
database to improve read performance by reducing the need for complex
joins. It involves combining tables or storing redundant data to optimize query
execution.
 Why Denormalization?
 Faster Read Queries: Reduces the number of joins needed in SELECT
queries.
 Improves Performance: Useful for OLAP (Online Analytical Processing)
systems and reporting.
 Reduces Computational Overhead: Joins and aggregations can be expensive
in large datasets.
Denormalization
 Common Denormalization Techniques:
 Precomputed Aggregations: Storing calculated values (e.g., total sales per
month) instead of computing them on the fly.
 Duplicating Data: Storing frequently accessed columns in multiple tables to
reduce joins.
 Adding Derived Columns: Keeping derived data like full names (FirstName +
LastName) instead of computing it at runtime.
 Using Composite Tables: Creating summary tables for frequently queried
reports.
Denormalization
 Drawbacks of Denormalization:
 Increased Storage: Data redundancy leads to more storage consumption.
 Data Inconsistency: Updates and deletions must be carefully managed to
avoid inconsistencies.
 More Complex Write Operations: Insert, update, and delete operations may
require changes in multiple places.
 When to Use Denormalization?
 When database reads are significantly more frequent than writes.
 In data warehousing and reporting systems where query performance is
critical.
 When optimizing NoSQL databases, where joins are costly.
Denormalization
 Advantages:
 Improved query performance (fewer joins).
 Faster retrieval of complex data relationships.
 Disadvantages:
 Increased storage requirements due to redundancy.
 More complex update operations, leading to potential inconsistencies.
 When to Use:
 Denormalization is typically used in reporting systems or when querying
performance becomes critical.
Pros and Cons of Denormalization
 Normalized Schema (3rd Normal Form - 3NF)
 Customers (CustomerID, Name, Email, Address)
 Orders (OrderID, CustomerID, OrderDate)
 OrderDetails (OrderDetailID, OrderID, ProductID, Quantity, Price)
 Products (ProductID, ProductName, Category, Price)
 👉 Every time we need to display order details, we must join Orders, OrderDetails,
Products, and Customers, which can slow down performance.
 To optimize query performance, we can denormalize by adding redundant data to avoid
excessive joins. We introduce a single table called OrdersSummary.
 Faster Read Query:
Denormalization Example
OrderID CustomerID CustomerName ProductID ProductName OrderDate Quantity Price TotalPrice
101 1 John Doe 501 Laptop 2024-03-10 1 1000 1000
101 1 John Doe 502 Mouse 2024-03-10 2 20 40
SELECT * FROM OrdersSummary WHERE CustomerID
= 1;

DBMS: Week 10 - Database Design and Normalization

  • 1.
    International Islamic UniversityH-10, Islamabad, Pakistan Database Managements Systems Week 10 Database Design and Normalization Engr. Rashid Farid Chishti http://youtube.com/rfchishti http://sites.google.com/site/chisht i
  • 2.
     Understand theimportance of database design in relational databases.  Learn about functional dependencies and their role in normalization.  Understand the concepts of First Normal Form (1NF) and Second Normal Form (2NF).  Apply normalization techniques to reduce data redundancy and improve database design. Learning Objectives
  • 3.
     Definition: Databasedesign is the process of defining the structure, relationships, and integrity constraints for a database.  Importance:  Ensures efficient data storage and retrieval.  Reduces redundancy and avoids data anomalies.  Improves data integrity and consistency. Introduction to Database Design
  • 4.
     Definition: Normalizationis the process of organizing data in a database to minimize redundancy and dependency by dividing large tables into smaller, related tables.  Goal: To design the database in such a way that it eliminates undesirable characteristics like insertion, update, and deletion anomalies. Normalization Levels:  Several forms, each building upon the previous one:  First Normal Form (1NF)  Second Normal Form (2NF)  Third Normal Form (3NF)  Boyce-Codd Normal Form (BCNF) What is Normalization?
  • 5.
     Definition: Arelation (table) is in 1NF if:  All attributes (columns) contain atomic (indivisible) values.  Each column contains values of a single type.  Each column has a unique name.  The order of rows and columns does not matter.  Objective:  Eliminate repeating groups and multi-valued attributes First Normal Form (1NF) Table: Student (Now in 1NF) ID Name Courses 1 Alice Math 1 Alice Science 2 Bob English 3 David Math 3 David English Table: Student (Not in 1NF) ID Name Courses 1 Alice Math, Science 2 Bob English 3 David Math, English Table: Employee (Not in 1NF) ID Name Phone 1 Alice 0301-1234567 0302-9876543 2 Bob 0303-5678901 3 David 0304-2468135 0305-1357924 Table: Employee (Now in 1NF) ID Name Phone 1 Alice 0301-1234567 1 Alice 0302-9876543 2 Bob 0303-5678901 3 David 0304-2468135 3 David 0305-1357924
  • 6.
     Problems:  The"Authors" column has multiple values (John, Alice).  The “Category" column has multiple values (AI, Machine Learning). Example 2: First Normal Form (1NF) BOOK_ID Book_Title Authors Book_Category 1 Database Systems John, Alice Technology 2 Python Basics Bob, Carol Programming 3 AI Foundations David, Eve AI, Machine Learning Table: Book (Not in 1NF) ID Book_Title Authors Book_Category 1 Database Systems John Technology 1 Database Systems Alice Technology 2 Python Basics Bob Programming 2 Python Basics Carol Programming 3 AI Foundations David AI 3 AI Foundations David Machine Learning 3 AI Foundations Eve AI 3 AI Foundations Eve Machine Learning Table: Book (Now in 1NF)
  • 7.
     Definition: A relationis in Second Normal Form (2NF) if:  It is in 1NF.  It has no partial dependencies (i.e., no non-key attribute is dependent on only part of a composite primary key).  Here non-key attributes Department and Instructor depends only on Course, not on the full primary key (Student_ID, Course). (Course → Instructor, Department) Second Normal Form (2NF) Table: Student_Course (Not in 2NF) Student_ID Course Instructor Department 1 Math Dr. A Science 1 Science Dr. B Science Table: Student_Course Student_ID Course 1 Math 1 Science Table: Course_Details Course Instructor Department Math Dr. A Science Science Dr. B Science
  • 8.
     Here, Emp_IDis the primary key.  Here, Deptment Location depends on Department, not on Emp_ID. Example 2: Second Normal Form (2NF) Table: Employee (Not in 2NF) Emp_ID Name Department Dept_Location 1 John IT Lahore 2 Alice HR Karachi 3 Bob IT Lahore Table: Department Department Location IT Lahore HR Karachi Table: Employee ID Name Department 1 John IT 2 Alice HR 3 Bob IT
  • 9.
     Here, BookID is the primary key.  The non-prime (non key) attribute Author does not depend Book ID.  Also Book Category also does not depend in Book ID.  Only Book Title depends on Book ID. Example 3: Second Normal Form (2NF) ID Book_Title Authors Book_Category 1 Database Systems John Technology 1 Database Systems Alice Technology 2 Python Basics Bob Programming 2 Python Basics Carol Programming 3 AI Foundations David AI 3 AI Foundations David Machine Learning 3 AI Foundations Eve AI 3 AI Foundations Eve Machine Learning Table: Book (in 1NF) ID Book_Title 1 Database Systems 2 Python Basics 3 AI Foundations 3 AI Foundations Table: Book (in 2NF) Book_ID Authors 1 John 1 Alice 2 Bob 2 Carol 3 David 3 Eve Table: Author BOOK_ID Category 1 Technology 2 Programming 3 AI 3 Machine Learning Table: Category
  • 10.
     Partial Dependencies: Course_Name depends on Course_ID  Instructor_Name depends on Instructor_ID  Dept depends on Instructor_ID  Solution:  Split into separate tables. Example 4: Second Normal Form (2NF) Table: Enrolment (Not in 2NF) Student_ID Student_Name Course_ID Course_Name Instructor_ID Instructor_Name Dept S1 Alice C101 DBMS I1 Prof. Smith CS S2 Bob C102 AI I2 Prof. Lee AI S3 Charlie C101 DBMS I1 Prof. Smith CS S4 Dave C103 Networks I3 Prof. Green Networks
  • 11.
     Now itis in 2NF Example 4: Second Normal Form (2NF) Table: Student ID Name S1 Alice S2 Bob S3 Charlie S4 Dave Table: Course ID Name C101 DBMS C102 AI C101 DBMS C103 Networks Student_ID Course_ID Instructor_ID S1 C101 I1 S2 C102 I2 S3 C101 I1 S4 C103 I3 ID Name Dept I1 Prof. Smith CS I2 Prof. Lee AI I3 Prof. Green Networks Table: Instructor Table: Student_Course_Instructor
  • 12.
     Definition: Afunctional dependency (FD) is a relationship between two attributes (or sets of attributes) in a table.  Notation: A → B means that attribute A uniquely determines attribute B.  Example:  If student_id → student_name, then for each student_id, there is a unique student_name.  Importance: Understanding functional dependencies is crucial for normalization, as they help identify relationships between data attributes. Functional Dependencies
  • 13.
     Definition of3NF: A table is in Third Normal Form (3NF) if:  It is in Second Normal Form (2NF) (i.e., no partial dependencies).  No transitive dependencies exist (i.e.,).  Transitive Dependency: Occurs when a no non-key attribute depends on another non-key attribute.  Example:  A → B (Where A is a primary key)  B → C (Where B is Non key attribute)  This means A indirectly determines C, which is a transitive dependency.  Thus A → C is a transitive dependency.  Objective: Remove transitive dependencies to ensure that non-key attributes depend only on the key-attribute (primary key). Third Normal Form (3NF)
  • 14.
     Primary Key(PK): Student_ID  Department_Name depends on Department_ID, not directly on Student_ID.  Transitive Dependency:  Student_ID → Department_ID (Direct Dependency)  Department_ID → Department_Name (Indirect Dependency)  Therefore, Student_ID → Department_Name (Transitive Dependency)  To remove transitive dependency, we create a separate Department Table Example 1: Third Normal Form (3NF) Student_ID (PK) Student_Name Department_ID Department_Name 101 Alice D01 Computer Science 102 Bob D02 Electrical Engineering 103 Charlie D01 Computer Science
  • 15.
     Now, thereis no transitive dependency because Department_Name depends only on Department_ID. Example 1: Third Normal Form (3NF) ID (PK) Name Deptment_ID (FK) 101 Alice D01 102 Bob D02 103 Charlie D01 ID (PK) Name D01 Computer Science D02 Electrical Engineering Table: Student Table: Department
  • 16.
     Definition: Arelation is in Boyce-Codd Normal Form (BCNF) if:  It is already in 3NF.  For every functional dependency (X → Y) in the relation, the X on left-hand side (determinant) is a superkey (A set of attributes that uniquely identifies each record in a table).  Problem with 3NF: A relation can still have a non-superkey determinant.  Therefore, to check for BCNF, we simply identify all the determinants and verify that they are superkeys.  If they are not, we break up the relation by projection until we have a set of relations all in BCNF. Boyce-Codd Normal Form (BCNF)
  • 17.
     Consider thefollowing relation.  Functional Dependencies:  (Student_ID, Course_ID) → Instructor  Since an instructor teaches a specific course  Course_ID → Instructor  Each course is taught by only one instructor.  Here, Course_ID is not a superkey, but it determines Instructor, violating BCNF.  Converting to BCNF: To achieve BCNF, we will decompose the relation into:  Course Table (Ensuring that each course has only one instructor)  Student-Course Table  Mapping students to courses Example 1: Boyce-Codd Normal Form (BCNF) Student_ID Course_ID Instructor 1 DBMS Dr. A 2 OS Dr. B 3 DBMS Dr. A 4 OS Dr. B Course Instructor DBMS Dr. A OS Dr. B Student_ID Course_ID 1 DBMS 2 OS 3 DBMS 4 OS Table: Course_Instructor Table: Student_Course
  • 18.
     Functional Dependencies (Supplier_ID, Part_ID) → Supplier_City, Part_Category  Each supplier supplies a specific part and has a fixed city.  Supplier_ID → Supplier_City  A supplier is based in only one city.  Part_ID → Part_Category  A part belongs to only one category.  Issue  Supplier_ID → Supplier_City  violates BCNF because Supplier_ID is not a superkey.  Part_ID → Part_Category  violates BCNF because Part_ID is not a superkey. Example 2: Boyce-Codd Normal Form (BCNF) Supplier_ID Part_ID Supplier_City Part_Category 1 101 Karachi Electronics 2 102 Lahore Machanical 1 103 Karachi Electrical Table: Supplier_Parts
  • 19.
     To achieveBCNF, we will decompose the relation into 3 tables:  Now, all tables are in BCNF because all functional dependencies have superkeys on the left. Example 2: Boyce-Codd Normal Form (BCNF) Part_ID Part_Category 101 Electronics 102 Machanical 103 Electrical Table: Parts Supplier_ID Supplier_City 1 Karachi 2 Lahore Table: Supplier Supplier_ID Part_ID 1 101 2 102 1 103 Table: Supplier_Parts
  • 20.
     3NF: Arelation is in 3NF if it has no transitive dependencies, but there could still be functional dependencies where the determinant is not a superkey.  BCNF: A relation is in BCNF if every determinant is a superkey.  Key Difference: BCNF is a stricter form of 3NF. Every relation in BCNF is in 3NF, but not every relation in 3NF is in BCNF. Difference Between 3NF and BCNF
  • 21.
    Recap of Normalization NormalizationMain Rule Problem Solved 1NF No duplicate columns, atomic values Removes multi-valued attributes 2NF 1NF + No partial dependencies Eliminates dependency on part of a composite key 3NF 2NF + No transitive dependencies Eliminates non-key dependencies BCNF 3NF + Every determinant is a superkey Stronger than 3NF, removes remaining anomalies
  • 22.
     Benefits ofNormalization:  Reduces Data Redundancy: Eliminates duplicate data by dividing data into smaller tables.  Improves Data Integrity: Ensures consistency by minimizing the chances of anomalies.  Easier Updates: Easier to update data without causing inconsistencies.  Drawbacks:  More complex queries (due to more joins).  Potential performance impact in some cases (especially with excessive normalization). Why Normalize ?
  • 23.
     Denormalization isthe process of intentionally introducing redundancy into a database to improve read performance by reducing the need for complex joins. It involves combining tables or storing redundant data to optimize query execution.  Why Denormalization?  Faster Read Queries: Reduces the number of joins needed in SELECT queries.  Improves Performance: Useful for OLAP (Online Analytical Processing) systems and reporting.  Reduces Computational Overhead: Joins and aggregations can be expensive in large datasets. Denormalization
  • 24.
     Common DenormalizationTechniques:  Precomputed Aggregations: Storing calculated values (e.g., total sales per month) instead of computing them on the fly.  Duplicating Data: Storing frequently accessed columns in multiple tables to reduce joins.  Adding Derived Columns: Keeping derived data like full names (FirstName + LastName) instead of computing it at runtime.  Using Composite Tables: Creating summary tables for frequently queried reports. Denormalization
  • 25.
     Drawbacks ofDenormalization:  Increased Storage: Data redundancy leads to more storage consumption.  Data Inconsistency: Updates and deletions must be carefully managed to avoid inconsistencies.  More Complex Write Operations: Insert, update, and delete operations may require changes in multiple places.  When to Use Denormalization?  When database reads are significantly more frequent than writes.  In data warehousing and reporting systems where query performance is critical.  When optimizing NoSQL databases, where joins are costly. Denormalization
  • 26.
     Advantages:  Improvedquery performance (fewer joins).  Faster retrieval of complex data relationships.  Disadvantages:  Increased storage requirements due to redundancy.  More complex update operations, leading to potential inconsistencies.  When to Use:  Denormalization is typically used in reporting systems or when querying performance becomes critical. Pros and Cons of Denormalization
  • 27.
     Normalized Schema(3rd Normal Form - 3NF)  Customers (CustomerID, Name, Email, Address)  Orders (OrderID, CustomerID, OrderDate)  OrderDetails (OrderDetailID, OrderID, ProductID, Quantity, Price)  Products (ProductID, ProductName, Category, Price)  👉 Every time we need to display order details, we must join Orders, OrderDetails, Products, and Customers, which can slow down performance.  To optimize query performance, we can denormalize by adding redundant data to avoid excessive joins. We introduce a single table called OrdersSummary.  Faster Read Query: Denormalization Example OrderID CustomerID CustomerName ProductID ProductName OrderDate Quantity Price TotalPrice 101 1 John Doe 501 Laptop 2024-03-10 1 1000 1000 101 1 John Doe 502 Mouse 2024-03-10 2 20 40 SELECT * FROM OrdersSummary WHERE CustomerID = 1;