DBMS Module 3
Syllabus
Introduction to database Systems, advantages of database system over traditional
file system, Basic concepts & Definitions, Database users, Database Language,
Database System Architecture, Schemas, Sub Schemas, & Instances, database
constraints, 3-level database architecture, Data Abstraction, Data Independence,
Mappings, Structure, Components & functions of DBMS, Data models.
Entity relationship model, Components of ER model, Mapping E-R model to
Relational schema, Network and Object Oriented Data models. Relational Algebra,
Tuple & Domain Relational Calculus, Relational Query Languages: SQL and QBE.
Query processing and optimization: Evaluation of Relational Algebra Expressions,
Query optimization, Query cost estimation.
Database Design:-Database development life cycle (DDLC), Automated design tools,
Functional dependency and Decomposition, Join strategies, Dependency
Preservation & lossless Design, Normalization, Normal forms:1NF, 2NF,3NF, and
BCNF, Multi-valued Dependencies, 4NF & 5NF.
Transaction processing and concurrency control: Transaction concepts, properties of
transaction, concurrency control, locking and Timestamp methods for concurrency
control schemes. Database Recovery System, Types of Data Base failure & Types of
Database Recovery, Recovery techniques. Fundamental concepts of advanced
databases.
Storage Strategies: Detailed Storage Architecture, RAID
DDLC
Definition: DDLC stands for Database Development Life Cycle.
Purpose: It's a systematic approach for designing, developing, implementing,
and maintaining a database system.
DBMS Module 3 1
Phases:
1. Planning
2. Requirements Analysis
3. Conceptual Design
4. Logical Design
5. Physical Design
6. Implementation
7. Testing
8. Deployment
9. Maintenance
Objective: Create an efficient, well-organized database aligned with
organizational needs.
Key Activities: Planning, stakeholder interviews, entity-relationship modeling,
normalization, physical design, implementation, testing, deployment, and
ongoing maintenance.
1. Planning:
Objective: Establish the foundation for the database project.
Activities:
Define Scope: Clearly outline the boundaries of the database system.
Feasibility Assessment: Evaluate the practicality and viability of the
database project.
Identify Users: Determine the key stakeholders and end-users of the
system.
Project Planning: Develop a project plan outlining tasks, timelines, and
resources.
2. Requirements Analysis:
Objective: Gather detailed information on what the database needs to
accomplish.
DBMS Module 3 2
Activities:
Stakeholder Interviews: Engage with users and stakeholders to understand
their needs.
System Analysis: Analyze existing systems and data sources.
Entity-Relationship Modeling: Define entities, attributes, and relationships.
Business Rule Documentation: Capture rules that govern the structure
and behavior of the data.
3. Conceptual Design:
Objective: Create a high-level abstract representation of the database.
Activities:
Entity-Relationship Diagram (ERD): Develop a visual representation of
entities and their relationships.
Normalization: Organize entities and attributes to minimize redundancy.
Key Identification: Determine primary keys and relationships between
entities.
4. Logical Design:
Objective: Transform the conceptual model into a logical structure for
implementation.
Activities:
Table Mapping: Map entities and relationships to tables in a relational
database.
Attribute Definition: Define attributes and their corresponding data types.
Normalization: Apply normalization techniques to enhance data integrity.
5. Physical Design:
Objective: Specify how the logical design will be implemented on a chosen
Database Management System (DBMS).
Activities:
Storage Structure Selection: Choose appropriate storage structures and
file organization.
DBMS Module 3 3
Indexing: Define indexes to optimize data retrieval.
Performance Optimization: Consider factors like partitioning and clustering
for performance gains.
6. Implementation and Deployment:
Objective: Create the actual database and make it operational.
Activities:
Data Definition Language (DDL): Use DDL to create tables, define
relationships, and establish constraints.
Data Loading: Load initial data into the database.
Testing: Conduct unit and integration testing to ensure data accuracy and
system functionality.
Optimization: Fine-tune the database for performance.
Deployment: Roll out the database to the production environment.
Automated design tools, also known as database design or modeling tools,
streamline the creation and management of databases. Key features include:
1. Entity-Relationship Diagrams (ERD): Generate visual representations of
entities, attributes, and relationships.
2. Data Modeling: Support for creating logical data models, facilitating the
translation from conceptual to logical structures.
3. Normalization Assistance: Aid in applying normalization techniques to enhance
data integrity.
4. Code Generation: Automatically generate Data Definition Language (DDL)
scripts for database creation.
5. Collaboration and Version Control: Enable team collaboration and provide
version control for database designs.
6. Reverse Engineering: Import existing databases to generate visual models
from the existing schema.
DBMS Module 3 4
7. Database Documentation: Generate documentation for the database schema,
enhancing communication and knowledge sharing.
8. Schema Comparison and Synchronization: Compare database schemas and
synchronize changes across development stages.
9. Data Type Suggestions: Provide suggestions for appropriate data types based
on best practices.
10. Data Visualization: Offer graphical representations of data models and
relationships for better understanding.
11. Integration with Development Environments: Seamlessly integrate with
popular development environments and IDEs.
12. Security Modeling: Model and visualize security aspects such as user roles and
access permissions.
These tools contribute to increased efficiency, accuracy, and collaboration in
database design and management.
Functional Dependency:
Definition: In the context of databases, a functional dependency is a relationship
between two sets of attributes in a relation. It describes how the values of one
set of attributes uniquely determine the values of another set.
Notation: If A and B are attributes in a relation, and every value of A uniquely
determines the value of B, it is denoted as A → B.
Example: In a relation with attributes {EmployeeID, FirstName, LastName}, if
EmployeeID → FirstName, it means that each EmployeeID uniquely determines
the corresponding FirstName.
Purpose: Functional dependencies are crucial for database normalization,
ensuring data integrity and reducing redundancy by organizing data in a way that
avoids unnecessary duplication.
Decomposition:
Definition: Decomposition in database design refers to the process of breaking
down a relation with a certain set of attributes into multiple smaller relations. This
is often done to achieve a higher normal form and improve data integrity.
Steps:
DBMS Module 3 5
1. Identify Functional Dependencies: Determine the functional dependencies
present in the relation.
2. Normalize the Relation: Use normalization techniques to eliminate partial
and transitive dependencies.
3. Decompose the Relation: Create new relations based on the identified
functional dependencies.
Example: Consider a relation R(ABCD) with the functional dependencies A → B
and C → D. The relation can be decomposed into two relations: R1(ACD) and
R2(AB).
Purpose: Decomposition helps in organizing data more efficiently, reducing
redundancy, and avoiding update anomalies by adhering to higher normal forms.
Connection:
Functional Dependency and Decomposition: The identification of functional
dependencies is a prerequisite for decomposition. Understanding how attributes
depend on each other allows for the creation of smaller, well-organized relations
that adhere to normalization principles.
Example Connection: If a relation R(ABC) has a functional dependency A → B,
it may be decomposed into R1(A) and R2(BC). This decomposition ensures that
information is stored without redundancy and follows the principles of
normalization.
Join Strategies in SQL:
1. Equi Join:
Definition: An equi join is a specific type of inner join where the condition for
matching rows is based on the equality of values in specified columns.
Syntax:
SELECT * FROM table1 INNER JOIN table2 ON table1.column = table2.column;
Example:
SELECT * FROM employees INNER JOIN departments ON employees.department_id = depart
ments.department_id;
DBMS Module 3 6
2. Theta Join:
Definition: A theta join is a generalization of the equi join, where the condition
for matching rows involves any comparison operator (other than equality).
Syntax:
SELECT * FROM table1 JOIN table2 ON table1.column > table2.column;
Example:
SELECT * FROM orders JOIN customers ON orders.customer_id = customers.customer_id
AND orders.order_date > '2023-01-01';
3. Cross Join:
Definition: A cross join, or Cartesian join, returns the Cartesian product of two
tables, resulting in all possible combinations of rows from both tables.
Syntax:
SELECT * FROM table1 CROSS JOIN table2;
Example:
SELECT * FROM employees CROSS JOIN departments;
4. Natural Join:
Definition: A natural join returns rows from two tables with matching values in
columns with the same name. It automatically identifies and joins columns with
identical names.
Syntax:
SELECT * FROM table1 NATURAL JOIN table2;
Example:
DBMS Module 3 7
SELECT * FROM employees NATURAL JOIN departments;
5. Inner Join:
Definition: An inner join returns only the rows that have matching values in the
specified columns of both tables.
Syntax:
SELECT * FROM table1 INNER JOIN table2 ON table1.column = table2.column;
Example:
SELECT * FROM employees INNER JOIN departments ON employees.department_id = depart
ments.department_id;
6. Outer Join (Left, Right, Full):
Definition: Outer joins return all rows from one table and the matching rows
from the other table. Types include:
Left Outer Join (or Left Join): Returns all rows from the left table and
matching rows from the right table.
Right Outer Join (or Right Join): Returns all rows from the right table and
matching rows from the left table.
Full Outer Join (or Full Join): Returns all rows when there is a match in
either the left or right table.
Syntax:
-- Left Outer Join
SELECT * FROM table1 LEFT OUTER JOIN table2 ON table1.column = table2.column;
-- Right Outer Join
SELECT * FROM table1 RIGHT OUTER JOIN table2 ON table1.column = table2.column;
-- Full Outer Join
SELECT * FROM table1 FULL OUTER JOIN table2 ON table1.column = table2.column;
Example:
DBMS Module 3 8
-- Left Outer Join
SELECT * FROM employees LEFT OUTER JOIN departments ON employees.department_id = d
epartments.department_id;
-- Right Outer Join
SELECT * FROM employees RIGHT OUTER JOIN departments ON employees.department_id =
departments.department_id;
-- Full Outer Join
SELECT * FROM employees FULL OUTER JOIN departments ON employees.department_id = d
epartments.department_id;
Additional Notes:
Performance Considerations:
Cross joins can result in large result sets and are often used with caution.
Natural joins rely on column names and may lead to unexpected results if
schema changes occur.
Null Values:
Outer joins can introduce null values in columns where there is no match.
Choosing the Right Join:
Select the appropriate join based on the desired outcome and the
relationship between tables.
Dependency Preservation:
*1. Functional Dependency (FD):
Definition: Functional dependency is a relationship between two sets of
attributes in a relation, where the values of one set uniquely determine the
values of another set.
Example: If A → B, it means knowing the value of A uniquely determines the
value of B.
*2. Dependency Preservation:
Definition: In the context of database design, dependency preservation refers to
the property of maintaining all functional dependencies from the original relation
after decomposition.
DBMS Module 3 9
Importance: Ensures that the information implied by functional dependencies is
not lost during the decomposition process.
Example: If a relation R(A, B, C) has a functional dependency A → B, and R is
decomposed into R1(A, B) and R2(A, C), dependency preservation requires that
the FD A → B is still maintained in the decomposed relations.
Lossless Design:
*1. Lossless Join Decomposition:
Definition: A design is considered lossless if it allows the reconstruction of the
original relation through a join operation without losing any information.
Criteria: A decomposition is lossless if, for any instance of the original relation,
joining the decomposed relations produces the same result as joining the original
relation.
Example: If relation R is decomposed into R1 and R2, the join of R1 and R2
should result in the original relation R.
*2. Lossless Decomposition Algorithms:
BCNF Decomposition: If a relation is decomposed into Boyce-Codd Normal
Form (BCNF), the decomposition is guaranteed to be lossless.
Third Normal Form (3NF) Decomposition: A 3NF decomposition may or may
not be lossless. Additional steps or checks are required to ensure losslessness.
*3. Chasing Algorithm:
Definition: Chasing is an algorithmic approach to check whether a
decomposition is lossless.
Procedure: It involves "chasing" attributes through the decomposition to see if
they can be reconstructed.
Example: If relation R(A, B, C) is decomposed into R1(A, B) and R2(B, C), the
chasing algorithm checks if attributes in R can be reconstructed from R1 and R2.
Connection between Dependency Preservation and Lossless
Design:
Dependency Preservation and Lossless Design: Ensuring dependency
preservation during decomposition contributes to lossless design.
DBMS Module 3 10
Example: If functional dependencies are preserved during decomposition, it is
more likely that the decomposition is lossless because dependencies provide
constraints that help maintain the integrity of the data.
Importance: The combined goal is to create decomposed relations that not only
adhere to normalization principles but also retain the original information and
dependencies.
Achieving Both: Techniques like BCNF decomposition, which ensures lossless
join, often also contribute to dependency preservation.
Normalization in Database Design:
1. Definition:
Normalization is the systematic process of organizing a relational database to
reduce data redundancy and dependency by organizing data into related tables.
2. Objectives:
Minimize data redundancy.
Reduce data dependency.
Improve data integrity.
Normal Forms:
1. First Normal Form (1NF):
Definition: Ensures that each column in a table contains atomic (indivisible)
values.
Example: If a table has a column with a list of courses, it should be normalized
by creating a separate table for courses.
2. Second Normal Form (2NF):
Definition: Requires 1NF and ensures that non-prime attributes are fully
functionally dependent on the primary key.
Example: If a composite primary key consists of (A, B), and C is dependent on A
but not on B, C should be moved to a table with (A, C).
3. Third Normal Form (3NF):
DBMS Module 3 11
Definition: Requires 2NF and ensures that no transitive dependencies exist, i.e.,
non-prime attributes are not dependent on other non-prime attributes.
Example: If A → B and B → C, then A → C should be eliminated.
4. Boyce-Codd Normal Form (BCNF):
Definition: Requires 3NF and ensures that every non-prime attribute is
functionally dependent on the primary key.
Example: If A → B, A → C, but B and C are not dependent on each other, a
separate table should be created for (A, B) and (A, C).
Multi-valued Dependencies:
1. Definition:
A multi-valued dependency exists when a value in one column uniquely
determines values in another column, independent of other attributes.
2. Fourth Normal Form (4NF):
Definition: Addresses multi-valued dependencies by ensuring that there are no
non-trivial multi-valued dependencies on a candidate key.
Example: If A →→ B (A determines a set of values for B), then A should be
separated from B.
Fifth Normal Form (5NF):
1. Definition:
Fifth Normal Form (5NF) addresses cases where a table contains join
dependencies between overlapping candidate keys.
2. Key Points:
It deals with situations where a table has multiple overlapping candidate keys.
The goal is to eliminate join dependencies between these keys.
Considerations:
1. Trade-offs:
Normalization involves trade-offs between redundancy reduction and query
performance.
2. Denormalization:
DBMS Module 3 12
In some cases, denormalization is applied to optimize query performance by
reintroducing redundancy carefully.
3. Application:
The level of normalization depends on the specific requirements and use cases
of the application.
DBMS Module 3 13