The Three-Level Architecture is a framework used in database systems to separate the
user's view, the physical storage of data, and the overall logical structure. It is defined by the
ANSI/SPARC (American National Standards Institute/Standards Planning and
Requirements Committee) and includes the following three levels:
1. Internal Level (Physical Level)
Description: Deals with the physical storage of data on hardware.
Focus: How the data is actually stored — e.g., indexes, file structures, and access
paths.
Users: Database administrators (DBAs) and system designers.
Example: Data stored in a binary file with indexing for fast retrieval.
2. Conceptual Level (Logical Level)
Description: Describes what data is stored and the relationships among the data.
Focus: The logical structure of the entire database.
Users: DBAs and developers.
Example: A database model showing tables for Students, Courses, and Enrollments.
3. External Level (View Level)
Description: The user's view of the database; can vary for different users.
Focus: How users interact with data; allows for multiple views depending on user
roles.
Users: End users and application programs.
Example: A teacher sees student grades, while students only see their own
information.
Purpose of the Three-Level Architecture
Data Abstraction: Hides the complexity of the physical storage from users.
Data Independence:
o Logical Data Independence: Changes in the conceptual level don't affect the
external views.
o Physical Data Independence: Changes in the internal level don't affect the
conceptual level.
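As a minimal sketch of the external level and of logical data independence, the snippet below uses Python's built-in sqlite3 module with an in-memory database. The table name students, the view name student_directory, and the added email column are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical table that views are built on.
conn.execute("CREATE TABLE students (roll_no INTEGER PRIMARY KEY, name TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(101, "Alice", "A"), (102, "Bob", "B")],
)

# External level: a view that exposes only what one group of users needs.
conn.execute("CREATE VIEW student_directory AS SELECT roll_no, name FROM students")
before = conn.execute("SELECT * FROM student_directory ORDER BY roll_no").fetchall()

# A change to the conceptual schema (a new column) leaves the view untouched.
conn.execute("ALTER TABLE students ADD COLUMN email TEXT")
after = conn.execute("SELECT * FROM student_directory ORDER BY roll_no").fetchall()

print(before == after)
```

Users of the view keep seeing the same two columns even though the underlying table changed, which is the idea behind logical data independence.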
Q2. General Architecture of a Database System
The General Architecture of a database system shows how different components interact to
manage and use the database efficiently. It includes several layers and modules that work
together to process data, manage storage, and ensure security and consistency.
✅ Main Components of Database Architecture
1. Users
Types:
o End Users: Access data through applications (e.g., students checking results).
o Application Programmers: Write programs that interact with the database.
o Database Administrators (DBAs): Manage the overall database system.
2. Application Programs / Queries
Interface between users and the DBMS.
Users send queries (e.g., SQL) through applications.
3. DBMS (Database Management System)
This is the core of the system and includes several subcomponents:
🔹 a. Query Processor
Interprets and executes user queries.
Includes:
o Parser: Checks syntax.
o Query Optimizer: Finds the most efficient way to execute a query.
o Query Executor: Runs the query.
🔹 b. Storage Manager
Manages how data is stored and retrieved.
Includes:
o Buffer Manager
o File Manager
o Transaction Manager (ensures ACID properties)
o Authorization Manager (manages user permissions)
🔹 c. Metadata Catalog (Data Dictionary)
Stores information about the structure of the database (schemas, tables, data types).
4. Database
Actual data stored on storage devices.
Contains user data and metadata (data about data).
📊 Diagram: General Architecture
[ Users ]
[ Application Programs / SQL Queries ]
[ DBMS ]
├── Query Processor
├── Storage Manager
├── Transaction Manager
├── Authorization Manager
└── Metadata Catalog
[ Database (Physical Storage) ]
🎯 Benefits of General Architecture
Efficient data management.
Supports multiple users.
Provides security and consistency.
Enables data abstraction and independence.
Q3. Relational Model
The Relational Model is the most widely used data model in database systems. It was
introduced by E.F. Codd in 1970 and represents data in the form of tables (relations).
✅ Key Concepts of Relational Model
1. Relation (Table)
A relation is a table with rows and columns.
Each table represents an entity.
Example:
Roll No Name Age
101 Alice 20
102 Bob 21
2. Tuple (Row)
A tuple is a single row in a table.
Represents a single record.
3. Attribute (Column)
An attribute is a column in a table.
Represents a property of the entity.
Each attribute has a data type.
4. Domain
A domain is the set of valid values an attribute can take.
Example: Age domain might be integers from 18 to 60.
5. Schema
A schema is the structure of the database – the definition of tables and their
relationships.
6. Degree and Cardinality
Degree = Number of attributes (columns).
Cardinality = Number of tuples (rows).
🔑 Keys in Relational Model
Primary Key: Uniquely identifies each row (e.g., Roll No).
Foreign Key: References the primary key of another table.
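Candidate Key: A minimal set of attributes that uniquely identifies a row; one candidate key is chosen as the primary key.
Super Key: Any set of attributes that uniquely identifies a row; a candidate key is a super key with no unnecessary attributes.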
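The terms above can be tried out with a short sketch, assuming Python's built-in sqlite3 module. The table student and the extra row for 'Carol' are illustrative assumptions; the first two rows come from the example table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Alice", 20), (102, "Bob", 21)],
)

cursor = conn.execute("SELECT * FROM student")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)

# The primary key rejects a second row with an existing roll_no.
try:
    conn.execute("INSERT INTO student VALUES (101, 'Carol', 22)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(degree, cardinality, duplicate_rejected)
```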
📌 Properties
Data is stored in tabular format.
Relationships between data are maintained using keys.
Easy to understand and use.
Supports powerful query languages like SQL.
Q4. ER Model (Entity-Relationship Model)
The ER Model (Entity-Relationship Model) is a high-level data model used to visually
describe the data and its relationships. It was introduced by Peter Chen in 1976 and is
mainly used in the database design phase.
✅ Main Components of ER Model
1. Entity
An object or thing in the real world with an independent existence.
Example: Student, Course, Teacher.
Entity Set: A collection of similar entities.
Types:
o Strong Entity: Exists independently.
o Weak Entity: Depends on another entity.
2. Attributes
Describe properties of an entity.
Example: For a Student entity — RollNo, Name, Age.
Types:
o Simple: Cannot be divided (e.g., Age).
o Composite: Can be divided (e.g., Name → First Name, Last Name).
o Derived: Can be calculated (e.g., Age from DOB).
o Multivalued: Can have multiple values (e.g., phone numbers).
3. Relationship
Shows how two or more entities are related.
Example: A Student enrolls in a Course.
Types of Relationships:
o One-to-One (1:1)
o One-to-Many (1:N)
o Many-to-Many (M:N)
4. Keys
Primary Key: An attribute that uniquely identifies each entity.
Example: RollNo for Student.
📊 ER Diagram Symbols
Concept Symbol
Entity Rectangle
Attribute Ellipse
Relationship Diamond
Line Connects attributes to entities and entities to relationships
Primary Key Underlined
Multivalued Double ellipse
Weak Entity Double rectangle
Derived Attr. Dashed ellipse
📌 Purpose of ER Model
Helps in conceptual design of the database.
Makes it easy to understand data requirements and structure.
Used to create relational schemas in the next phase.
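One common way to turn an M:N relationship such as "Student enrolls in Course" into relational tables is to give the relationship its own table holding the keys of both entities. The sketch below assumes Python's built-in sqlite3 module; the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id TEXT PRIMARY KEY, title TEXT);
-- The M:N relationship "enrolls" becomes its own table.
CREATE TABLE enrolls (
    roll_no   INTEGER REFERENCES student(roll_no),
    course_id TEXT    REFERENCES course(course_id),
    PRIMARY KEY (roll_no, course_id)
);
INSERT INTO student VALUES (101, 'Alice'), (102, 'Bob');
INSERT INTO course  VALUES ('C1', 'Math'), ('C2', 'English');
INSERT INTO enrolls VALUES (101, 'C1'), (101, 'C2'), (102, 'C1');
""")

# One student can appear with many courses, and one course with many students.
courses_per_student = conn.execute(
    "SELECT roll_no, COUNT(*) FROM enrolls GROUP BY roll_no ORDER BY roll_no"
).fetchall()
print(courses_per_student)
```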
Q5. Normalization – 1NF, 2NF, 3NF
Normalization is the process of organizing data in a database to reduce data redundancy
and improve data integrity. It involves applying a series of rules called normal forms.
✅ 1NF (First Normal Form)
✔ Definition:
A table is in 1NF if:
All attributes (columns) contain only atomic (indivisible) values.
There are no repeating groups or arrays.
📌 Example (Before 1NF – Not Atomic):
StudentID Name Courses
101 Alice Math, English
102 Bob Science, History
👉 Issue: "Courses" contains multiple values.
✅ After 1NF:
StudentID Name Course
101 Alice Math
101 Alice English
102 Bob Science
102 Bob History
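The conversion to 1NF shown above can be sketched in a few lines of plain Python. The function name to_1nf is an assumption chosen for this example; the data is the table from the example.

```python
# Un-normalized rows: the Courses field holds a comma-separated list.
unnormalized = [
    (101, "Alice", "Math, English"),
    (102, "Bob", "Science, History"),
]

def to_1nf(rows):
    """Emit one row per (student, course) so that every value is atomic."""
    result = []
    for student_id, name, courses in rows:
        for course in courses.split(","):
            result.append((student_id, name, course.strip()))
    return result

first_normal_form = to_1nf(unnormalized)
for row in first_normal_form:
    print(row)
```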
✅ 2NF (Second Normal Form)
✔ Definition:
A table is in 2NF if:
It is already in 1NF, and
Every non-prime attribute is fully functionally dependent on the entire primary key
(no partial dependency).
Applies mainly to composite keys.
📌 Example (Before 2NF):
StudentID Course StudentName Marks
101 Math Alice 85
101 English Alice 90
👉 Issue: StudentName depends only on StudentID, not the full key (StudentID + Course).
✅ After 2NF:
Student Table:
StudentID StudentName
101 Alice
Enrollment Table:
StudentID Course Marks
101 Math 85
101 English 90
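The 2NF split can also be reproduced in SQL. The sketch below assumes Python's built-in sqlite3 module; the lower-case table and column names are illustrative. It checks that joining the two new tables gives back the original rows, so nothing was lost in the split.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student_course (student_id INTEGER, course TEXT, student_name TEXT, marks INTEGER);
INSERT INTO student_course VALUES (101, 'Math', 'Alice', 85), (101, 'English', 'Alice', 90);

-- StudentName depends on StudentID alone, so it moves to its own table.
CREATE TABLE student AS SELECT DISTINCT student_id, student_name FROM student_course;
CREATE TABLE enrollment AS SELECT student_id, course, marks FROM student_course;
""")

students = conn.execute("SELECT * FROM student").fetchall()

# Joining the two tables on student_id gives back the original rows.
rejoined = conn.execute("""
    SELECT e.student_id, e.course, s.student_name, e.marks
    FROM enrollment AS e JOIN student AS s ON s.student_id = e.student_id
    ORDER BY e.course
""").fetchall()
original = conn.execute("SELECT * FROM student_course ORDER BY course").fetchall()
print(students, rejoined == original)
```

The student's name is now stored once instead of once per course.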
✅ 3NF (Third Normal Form)
✔ Definition:
A table is in 3NF if:
It is already in 2NF, and
There is no transitive dependency (non-prime attribute depends only on the primary
key, not on another non-prime attribute).
📌 Example (Before 3NF):
StudentID StudentName DeptID DeptName
101 Alice D1 CS
102 Bob D2 Physics
👉 Issue: DeptName depends on DeptID, not directly on StudentID.
✅ After 3NF:
Student Table:
StudentID StudentName DeptID
101 Alice D1
102 Bob D2
Department Table:
DeptID DeptName
D1 CS
D2 Physics
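A functional dependency can be checked against sample rows with a small helper. This is only a sketch: the function name fd_holds and the third row (Carol) are assumptions added for illustration, and sample data can disprove a dependency but never prove that it holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, the attributes in lhs determine those in rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same left-hand values must always map to the same right-hand values.
        if seen.setdefault(key, value) != value:
            return False
    return True

students = [
    {"StudentID": 101, "StudentName": "Alice", "DeptID": "D1", "DeptName": "CS"},
    {"StudentID": 102, "StudentName": "Bob", "DeptID": "D2", "DeptName": "Physics"},
    {"StudentID": 103, "StudentName": "Carol", "DeptID": "D1", "DeptName": "CS"},
]

# StudentID -> DeptID and DeptID -> DeptName together form the transitive chain.
print(fd_holds(students, ["StudentID"], ["DeptID"]))
print(fd_holds(students, ["DeptID"], ["DeptName"]))
# DeptID does not determine StudentName: D1 appears with two different names.
print(fd_holds(students, ["DeptID"], ["StudentName"]))
```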
🧠 Summary Table
Normal Form | Main Rule | Fixes
1NF | Atomic values, no repeating groups | Multivalued fields
2NF | No partial dependency on composite key | Partial dependency
3NF | No transitive dependency | Transitive dependency
Q6. Projection Operator (π) in Relational Algebra
The Projection Operator, denoted by π (pi), is used in relational algebra to select specific
columns (attributes) from a relation (table).
✅ Definition:
The projection operator returns a new relation containing only the specified attributes,
removing duplicates automatically.
📌 Syntax:
π<attribute_list>(Relation_Name)
📊 Example:
Let’s say we have a relation Student:
RollNo Name Age Dept
101 Alice 20 CS
102 Bob 21 CS
103 Charlie 20 IT
✔ Example 1: Project Names of all students
πName(Student)
Result:
Name
Alice
Bob
Charlie
✔ Example 2: Project Age and Dept
πAge,Dept(Student)
Result:
Age Dept
20 CS
21 CS
20 IT
Note: If there were duplicate rows in the output, projection would eliminate them.
🎯 Key Points:
Projection removes columns, not rows.
It removes duplicates from the result set.
It’s useful when you only need specific attributes from a table.
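In SQL, a plain SELECT does not remove duplicates the way the projection operator does; SELECT DISTINCT is the closer match. The sketch below, assuming Python's built-in sqlite3 module and a lower-case table name chosen for the example, shows the difference on the Student relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT, age INTEGER, dept TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(101, "Alice", 20, "CS"), (102, "Bob", 21, "CS"), (103, "Charlie", 20, "IT")],
)

# Plain SELECT keeps duplicates: one output row per input row.
with_duplicates = conn.execute("SELECT dept FROM student ORDER BY dept").fetchall()

# SELECT DISTINCT matches the set semantics of the projection operator.
projected = conn.execute("SELECT DISTINCT dept FROM student ORDER BY dept").fetchall()

print(with_duplicates)
print(projected)
```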
Q6. Selection and Projection Operators in Relational Algebra
In relational algebra, Selection and Projection are two fundamental operations used to
retrieve specific data from a relation (table).
✅ 1. Selection Operator (σ)
✔ Definition:
The Selection operator (σ) is used to select rows (tuples) from a relation that satisfy a given
condition.
📌 Syntax:
σ<condition>(Relation_Name)
📊 Example:
Given a relation Student:
RollNo Name Age Dept
101 Alice 20 CS
102 Bob 21 CS
103 Charlie 20 IT
Query: Select students from CS department
σDept = 'CS'(Student)
Result:
RollNo Name Age Dept
101 Alice 20 CS
102 Bob 21 CS
✅ 2. Projection Operator (π)
✔ Definition:
The Projection operator (π) is used to select specific columns (attributes) from a relation.
📌 Syntax:
π<attribute_list>(Relation_Name)
📊 Example:
Query: Get only the names of students
πName(Student)
Result:
Name
Alice
Bob
Charlie
🔁 Combined Example (Selection + Projection)
Query: Get names of students from the CS department
πName(σDept = 'CS'(Student))
Step-by-step:
1. σDept = 'CS'(Student) → Selects CS students.
2. πName(...) → Projects only the names.
Final Output:
Name
Alice
Bob
🎯 Summary Table:
Operator | Symbol | Works On | Returns | Purpose
Selection | σ | Rows | Subset of rows | Filters by condition
Projection | π | Columns | Subset of cols | Picks specific fields
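Both operators are small enough to sketch in plain Python, treating a relation as a list of dictionaries. The function names select and project are assumptions for this example and do not come from any library.

```python
def select(relation, predicate):
    """Selection (sigma): keep the tuples for which predicate(tuple) is true."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection (pi): keep only the named attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result

student = [
    {"RollNo": 101, "Name": "Alice", "Age": 20, "Dept": "CS"},
    {"RollNo": 102, "Name": "Bob", "Age": 21, "Dept": "CS"},
    {"RollNo": 103, "Name": "Charlie", "Age": 20, "Dept": "IT"},
]

# Names of students from the CS department: project Name over select Dept = 'CS'.
cs_names = project(select(student, lambda r: r["Dept"] == "CS"), ["Name"])
print(cs_names)
```

Because each function takes a relation and returns a relation, the two can be nested in either order, just like the algebra expressions.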
Q7. Join Operator in Relational Algebra
The Join Operator (⨝) is used in relational algebra to combine related tuples from two
different relations (tables) based on a common attribute.
✅ 1. Purpose of Join
To retrieve data spread across multiple tables.
Matches rows from two relations based on a specified condition (usually a common
key).
🔄 Types of Join Operators
1. Theta Join (θ Join)
Symbol: R ⨝θ S
Combines tuples from R and S where the condition θ (e.g., =, <, >) is true.
Example:
Student ⨝Student.DeptID = Department.DeptID Department
2. Equi Join
A special case of Theta Join where the condition is based on equality (=).
3. Natural Join (⨝)
Automatically joins two relations by matching attributes with the same name and
keeping only one copy of each common attribute in the result.
Syntax:
R⨝S
Example: Let’s say we have:
Student:
RollNo Name DeptID
101 Alice D1
102 Bob D2
Department:
DeptID DeptName
D1 CS
D2 IT
Natural Join:
Student ⨝ Department
Result:
RollNo Name DeptID DeptName
101 Alice D1 CS
102 Bob D2 IT
4. Outer Join (Left, Right, Full)
Not a part of pure relational algebra, but common in SQL and extended relational models:
Left Outer Join: Keeps all rows from the left table.
Right Outer Join: Keeps all rows from the right table.
Full Outer Join: Keeps all rows from both tables.
📌 Summary of Join Types
Join Type Description
Theta Join Uses a condition (any comparison operator)
Equi Join Condition is equality
Natural Join Matches columns with same name automatically
Outer Joins Include unmatched rows from one/both sides
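The natural join and a left outer join can be compared side by side with a short sketch, assuming Python's built-in sqlite3 module. The lower-case names and the third student (Charlie, with a department D9 that does not exist) are assumptions added so that the outer join has an unmatched row to show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (roll_no INTEGER, name TEXT, dept_id TEXT);
CREATE TABLE department (dept_id TEXT, dept_name TEXT);
INSERT INTO student VALUES (101, 'Alice', 'D1'), (102, 'Bob', 'D2'), (103, 'Charlie', 'D9');
INSERT INTO department VALUES ('D1', 'CS'), ('D2', 'IT');
""")

# Natural join: matches on the shared column dept_id and lists it once.
natural = conn.execute(
    "SELECT * FROM student NATURAL JOIN department ORDER BY roll_no"
).fetchall()

# Left outer join: Charlie has no matching department, so dept_name is NULL (None).
left_outer = conn.execute("""
    SELECT s.roll_no, s.name, d.dept_name
    FROM student AS s LEFT OUTER JOIN department AS d ON s.dept_id = d.dept_id
    ORDER BY s.roll_no
""").fetchall()

print(natural)
print(left_outer)
```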
Q8. DDL and DML Commands in SQL
In SQL (Structured Query Language), commands are grouped based on their purpose. Two of
the main categories are DDL and DML.
✅ 1. DDL – Data Definition Language
Purpose:
Used to define and modify the structure of database objects like tables, schemas, indexes,
etc.
📌 Common DDL Commands:
Command | Description | Example
CREATE | Creates a new table, view, or database | CREATE TABLE Students (...);
ALTER | Modifies the structure of an existing table | ALTER TABLE Students ADD Age INT;
DROP | Deletes a table or database permanently | DROP TABLE Students;
TRUNCATE | Deletes all rows in a table (structure remains) | TRUNCATE TABLE Students;
RENAME | Renames a table or column | RENAME TABLE Students TO Learners;
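🔒 In many DBMSs (e.g., Oracle, MySQL), DDL statements are auto-committed and cannot be rolled back; some systems, such as PostgreSQL, allow DDL inside a transaction.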
✅ 2. DML – Data Manipulation Language
Purpose:
Used to manipulate data stored in tables (insert, update, delete, retrieve).
📌 Common DML Commands:
Command | Description | Example
SELECT | Retrieves data from one or more tables | SELECT * FROM Students;
INSERT | Adds new data into a table | INSERT INTO Students VALUES (101, 'Alice');
UPDATE | Modifies existing data in a table | UPDATE Students SET Name = 'Bob' WHERE ID = 101;
DELETE | Removes data from a table | DELETE FROM Students WHERE ID = 101;
🔁 DML operations can be rolled back using ROLLBACK until the transaction is committed.
🧠 Key Differences Between DDL and DML:
Feature | DDL | DML
Affects | Structure of tables | Data inside the tables
Rollback | ❌ Usually not possible (auto-commit in many DBMSs) | ✅ Possible before COMMIT
Usage | Create/modify tables | Add/update/delete data
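The split between the two groups of commands can be seen by running a few of them in order. This is a minimal sketch assuming Python's built-in sqlite3 module; SQLite's column types and its PRAGMA table_info command are specific to SQLite and differ in other systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and then change the structure.
conn.execute("CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("ALTER TABLE Students ADD COLUMN Age INTEGER")
columns = [row[1] for row in conn.execute("PRAGMA table_info(Students)")]

# DML: work with the rows inside that structure.
conn.execute("INSERT INTO Students (ID, Name) VALUES (101, 'Alice')")
conn.execute("UPDATE Students SET Name = 'Bob' WHERE ID = 101")
after_update = conn.execute("SELECT ID, Name, Age FROM Students").fetchall()
conn.execute("DELETE FROM Students WHERE ID = 101")
after_delete = conn.execute("SELECT COUNT(*) FROM Students").fetchall()[0][0]

# DDL again: remove the table itself.
conn.execute("DROP TABLE Students")
print(columns, after_update, after_delete)
```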
Q9. NOT Operator in SQL
The NOT operator in SQL is a logical operator used to negate a condition. It returns the
opposite result of the condition it's applied to.
✅ Syntax:
SELECT * FROM table_name
WHERE NOT condition;
🔄 How It Works:
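If the condition is TRUE, NOT makes it FALSE.
If the condition is FALSE, NOT makes it TRUE.
If the condition is UNKNOWN (because a NULL is involved), it stays UNKNOWN, so the row is not returned.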
📌 Examples:
🔹 Example 1: Using NOT with Equality
SELECT * FROM Students
WHERE NOT Dept = 'CS';
➡️ This returns all students who are not in the CS department (rows where Dept is NULL are not returned).
🔹 Example 2: Using NOT with IN
SELECT * FROM Students
WHERE Name NOT IN ('Alice', 'Bob');
➡️ Returns all students except Alice and Bob.
🔹 Example 3: Using NOT with BETWEEN
SELECT * FROM Students
WHERE Age NOT BETWEEN 18 AND 22;
➡️ Returns students whose age is outside the range 18 to 22 (BETWEEN is inclusive, so ages 18 and 22 are excluded too).
🔹 Example 4: Using NOT with EXISTS
SELECT * FROM Courses
WHERE NOT EXISTS (
SELECT * FROM Enrollments WHERE Courses.CourseID = Enrollments.CourseID
);
➡️ Returns courses that have no enrollments.
🧠 Quick Notes:
NOT is often used with operators like IN, BETWEEN, EXISTS, LIKE.
It helps filter out unwanted data.
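One detail worth trying out is how NOT behaves when a value is NULL. The sketch below assumes Python's built-in sqlite3 module; the three sample rows, including Charlie with no department, are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Name TEXT, Dept TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Alice", "CS"), ("Bob", "IT"), ("Charlie", None)],  # Charlie's Dept is NULL
)

# NOT Dept = 'CS' is UNKNOWN for the NULL row, so Charlie is filtered out.
not_cs = conn.execute(
    "SELECT Name FROM Students WHERE NOT Dept = 'CS' ORDER BY Name"
).fetchall()

# Handling NULL explicitly brings the row back.
not_cs_or_null = conn.execute(
    "SELECT Name FROM Students WHERE NOT Dept = 'CS' OR Dept IS NULL ORDER BY Name"
).fetchall()

print(not_cs)
print(not_cs_or_null)
```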
Q10. Responsibilities of a Database Administrator (DBA)
A Database Administrator (DBA) is responsible for the management, maintenance, and
security of a database system. The role requires ensuring that the database runs smoothly,
efficiently, and securely, while also being reliable and available for users and applications.
✅ Key Responsibilities of a DBA:
1. Database Design and Architecture
Database Design: Work with developers and stakeholders to design efficient
databases, ensuring the structure meets the business requirements.
Schema Design: Define and design tables, indexes, relationships, constraints, and
views.
Normalization/Denormalization: Ensure the database is normalized to avoid
redundancy but can denormalize where necessary for performance.
2. Installation and Configuration
Database Installation: Install and configure the DBMS (Database Management
System) on the server.
Setup: Configure database parameters, security settings, and performance tuning
options.
Upgrades: Apply patches and upgrades to keep the DBMS up to date.
3. Data Security and Access Control
User Access Management: Create and manage user accounts, roles, and privileges.
Encryption: Implement encryption for data-at-rest and data-in-transit.
Backup and Recovery: Design and implement data backup strategies (full,
incremental, etc.) and ensure disaster recovery procedures are in place.
Audit Trails: Monitor and maintain logs for tracking user activities to ensure data
integrity and compliance.
4. Performance Tuning and Optimization
Query Optimization: Optimize queries to ensure efficient performance, including
indexing and adjusting SQL queries.
Resource Management: Monitor database performance, CPU, memory, and disk
utilization to ensure optimal database performance.
Indexing: Create and manage indexes to improve query speed and performance.
5. Data Integrity and Consistency
Enforce Integrity Constraints: Ensure that data in the database is consistent,
accurate, and reliable by enforcing integrity constraints such as primary keys, foreign
keys, and check constraints.
Transaction Management: Manage transactions, ensuring they are properly
committed or rolled back to maintain ACID properties (Atomicity, Consistency,
Isolation, Durability).
6. Backup and Recovery Management
Backup Strategy: Plan, execute, and monitor regular backups to protect data from
loss.
Restore Procedures: Develop and implement restore procedures to recover data in
case of failure or corruption.
7. Database Maintenance
Data Migration: Manage data migration and transformation during system upgrades
or migrations to new platforms.
Routine Maintenance: Perform regular database maintenance tasks, such as
rebuilding indexes, removing obsolete data, and running diagnostic checks.
Disk Management: Oversee the allocation and management of storage resources.
8. Troubleshooting and Issue Resolution
Problem Diagnosis: Troubleshoot issues such as performance bottlenecks, deadlocks,
or connectivity issues.
Error Management: Respond to and resolve database-related errors, such as
database crashes, corruption, or locking issues.
9. High Availability and Disaster Recovery
Replication: Set up and manage database replication to ensure data is consistently
available in different locations.
Clustering: Implement and manage database clustering solutions for high availability.
Disaster Recovery Plans: Design and test disaster recovery strategies, including
setting up redundant systems for critical applications.
10. Compliance and Documentation
Regulatory Compliance: Ensure the database meets regulatory standards and
policies such as GDPR, HIPAA, or SOX.
Documentation: Maintain thorough documentation for database design,
configuration, and policies to help in troubleshooting and audits.
🧠 Skills Required for a DBA:
Knowledge of database technologies (e.g., MySQL, Oracle, SQL Server, PostgreSQL).
Strong understanding of SQL, database architecture, and performance tuning.
Experience with backup and recovery solutions, disaster recovery planning, and
security practices.
Familiarity with operating systems and hardware environments.
Troubleshooting skills for database issues and performance problems.
SQL Features
SQL (Structured Query Language) is a powerful language used for managing and
manipulating relational databases. It provides various features that help in data
management, querying, and security.
✅ Key Features of SQL:
1. Data Querying
Select Data: SQL allows users to query the database to retrieve data using the SELECT
statement.
Example:
SELECT * FROM Students;
Filtering Data: SQL enables filtering of data based on specific conditions using the
WHERE clause.
Example:
SELECT * FROM Students WHERE Age > 18;
2. Data Manipulation
Insert Data: SQL allows insertion of new data into tables using the INSERT INTO
statement.
Example:
INSERT INTO Students (Name, Age, Department) VALUES ('Alice', 21, 'CS');
Update Data: SQL allows modification of existing data with the UPDATE statement.
Example:
UPDATE Students SET Age = 22 WHERE Name = 'Alice';
Delete Data: SQL allows deletion of data from tables using the DELETE statement.
Example:
DELETE FROM Students WHERE Name = 'Alice';
3. Data Definition
Create Tables: SQL allows creation of new tables, views, or schemas using the
CREATE statement.
Example:
CREATE TABLE Students (ID INT, Name VARCHAR(100), Age INT);
Alter Tables: SQL enables modification of the structure of an existing table with the
ALTER statement.
Example:
ALTER TABLE Students ADD COLUMN Department VARCHAR(50);
Drop Tables: SQL allows deletion of tables from the database with the DROP
statement.
Example:
DROP TABLE Students;
4. Data Integrity
Primary Key: SQL allows enforcing a primary key to uniquely identify each record in a
table.
Example:
CREATE TABLE Students (ID INT PRIMARY KEY, Name VARCHAR(100));
Foreign Key: SQL supports foreign keys to maintain referential integrity between two
related tables.
Example:
CREATE TABLE Enrollments (
    StudentID INT,
    CourseID INT,
    FOREIGN KEY (StudentID) REFERENCES Students(ID)
);
Constraints: SQL allows defining constraints like NOT NULL, UNIQUE, CHECK, etc., to
maintain data validity.
Example:
CREATE TABLE Students (ID INT NOT NULL, Age INT CHECK (Age >= 18));
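To see these integrity rules reject bad data, the sketch below runs a few inserts against SQLite through Python's built-in sqlite3 module. The helper name is_rejected and the sample values are assumptions; note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON, which is a SQLite-specific setting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE Students (ID INTEGER PRIMARY KEY, Age INTEGER CHECK (Age >= 18))")
conn.execute(
    "CREATE TABLE Enrollments (StudentID INTEGER REFERENCES Students(ID), CourseID INTEGER)"
)
conn.execute("INSERT INTO Students VALUES (1, 20)")

def is_rejected(sql):
    """Run one statement and report whether a constraint blocked it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "duplicate primary key": is_rejected("INSERT INTO Students VALUES (1, 25)"),
    "CHECK (Age >= 18)": is_rejected("INSERT INTO Students VALUES (2, 16)"),
    "unknown StudentID": is_rejected("INSERT INTO Enrollments VALUES (99, 5)"),
    "valid enrollment": is_rejected("INSERT INTO Enrollments VALUES (1, 5)"),
}
print(results)
```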
5. Data Control
Granting Permissions: SQL allows specifying user privileges and permissions using
the GRANT statement.
Example:
GRANT SELECT, INSERT ON Students TO user1;
Revoking Permissions: SQL also enables revoking user privileges using the REVOKE
statement.
Example:
REVOKE SELECT ON Students FROM user1;
6. Aggregation and Grouping
Aggregate Functions: SQL supports built-in aggregate functions like COUNT(), SUM(),
AVG(), MAX(), MIN() to calculate summaries of data.
Example:
SELECT COUNT(*) FROM Students;
Grouping Data: SQL enables grouping data using the GROUP BY clause.
Example:
SELECT Department, COUNT(*) FROM Students GROUP BY Department;
Filtering Groups: You can filter groups with the HAVING clause, which is like WHERE
but for groups.
Example:
SELECT Department, COUNT(*) FROM Students GROUP BY Department HAVING COUNT(*) > 10;
9. Transactions
Transaction Control: SQL supports transactions to ensure that a series of operations
are executed as a single unit, using BEGIN, COMMIT, and ROLLBACK statements.
Example:
BEGIN;
UPDATE Students SET Age = 23 WHERE Name = 'Alice';
COMMIT;
If something goes wrong before COMMIT, you can roll back to revert the changes:
ROLLBACK;
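The effect of COMMIT and ROLLBACK can be observed with a short sketch, assuming Python's built-in sqlite3 module. Passing isolation_level=None puts the connection in autocommit mode so that BEGIN, COMMIT, and ROLLBACK are issued by hand; the helper age_of and the starting age are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manage transactions by hand
conn.execute("CREATE TABLE Students (Name TEXT, Age INTEGER)")
conn.execute("INSERT INTO Students VALUES ('Alice', 22)")

def age_of(name):
    """Look up the current age stored for one student."""
    return conn.execute("SELECT Age FROM Students WHERE Name = ?", (name,)).fetchall()[0][0]

# A transaction that is rolled back leaves the data unchanged.
conn.execute("BEGIN")
conn.execute("UPDATE Students SET Age = 99 WHERE Name = 'Alice'")
age_inside = age_of("Alice")
conn.execute("ROLLBACK")
age_after_rollback = age_of("Alice")

# A committed transaction makes the change permanent.
conn.execute("BEGIN")
conn.execute("UPDATE Students SET Age = 23 WHERE Name = 'Alice'")
conn.execute("COMMIT")
age_after_commit = age_of("Alice")

print(age_inside, age_after_rollback, age_after_commit)
```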
10. View and Indexing
Views: SQL allows the creation of views to represent complex queries as virtual
tables.
Example:
CREATE VIEW CS_Students AS SELECT * FROM Students WHERE Department = 'CS';
Indexes: SQL supports the creation of indexes to speed up the retrieval of rows from
large tables.
Example:
CREATE INDEX idx_name ON Students(Name);
🧠 Key Benefits of SQL Features:
Efficiency: Allows efficient querying, manipulation, and modification of data.
Security: Provides security features through access control and permissions.
Flexibility: Supports a wide range of operations, from simple queries to complex
joins and aggregations.
Standardization: SQL is a standardized language, which makes it widely used across
different relational database systems.
Q Data Dictionary in Databases
A Data Dictionary is a systematic collection of information about the data in a database,
which is stored within the database itself. It contains metadata, meaning data that describes
other data. The data dictionary holds critical information about the database's structure,
such as tables, columns, data types, constraints, relationships, and access rights.
✅ Functions of a Data Dictionary
The data dictionary performs several key functions in a database system:
1. Storing Metadata
Definition: The primary function of the data dictionary is to store metadata about
the database objects, such as:
o Tables: Names, structure (columns, data types), and constraints.
o Indexes: Names, column references, and types.
o Views: Definitions of views and the underlying tables.
o Relationships: Foreign keys and associations between tables.
Example: Information on the structure of a table (columns, data types) is stored in the data
dictionary.
2. Data Integrity and Constraints
Definition: The data dictionary helps maintain data integrity by storing information
about constraints such as:
o Primary keys
o Foreign keys
o Unique constraints
o Check constraints
This ensures that the relationships and integrity of data are always maintained when
interacting with the database.
3. Access Control
Definition: The data dictionary stores information about user roles and permissions,
managing who can access which data and what actions they are allowed to perform
(e.g., SELECT, INSERT, UPDATE, DELETE).
Example: The system stores details of users, their access privileges, and granted roles in the
data dictionary.
4. Schema Management
Definition: The data dictionary keeps track of the database schema, which is the
organization of tables and the relationships between them. It can also store the
schema of indexes, views, and other objects, allowing administrators to easily
manage the database structure.
5. Data Type Information
Definition: It stores data types for columns in tables, which helps ensure consistency
when data is inserted or queried. The data dictionary keeps track of:
o Integer, String, DateTime, etc.
o Constraints on those data types (e.g., length of string fields).
6. Database Objects Management
Definition: The data dictionary provides a catalog of database objects, such as
tables, indexes, views, stored procedures, and triggers. This makes it easier for
developers and administrators to manage and navigate through the database
schema.
7. Query Optimization
Definition: By analyzing the information stored in the data dictionary, query
optimizers can generate more efficient query plans. The dictionary helps the
database engine understand the structure and relationships between tables, which is
essential for optimizing complex queries.
8. System Monitoring
Definition: The data dictionary also stores information about the system’s
performance, including statistics like table sizes, row counts, and index usage. This
helps database administrators monitor and optimize the database performance.
9. Database Documentation
Definition: The data dictionary serves as an automatic and up-to-date form of
documentation for the database system. It’s used to describe the relationships and
structure of the database to developers, administrators, and other stakeholders.
📊 Example of Data Dictionary Content
Here is an example of some of the metadata stored in a typical data dictionary:
Table Name | Column Name | Data Type | Key | Default Value | Constraints
Students | StudentID | INT | Primary | - | NOT NULL
Students | Name | VARCHAR(100) | - | - | NOT NULL
Students | Age | INT | - | - | CHECK (Age >= 18)
Courses | CourseID | INT | Primary | - | NOT NULL
Courses | CourseName | VARCHAR(255) | - | - | NOT NULL
Enrollments | StudentID | INT | Foreign | - | REFERENCES Students
Enrollments | CourseID | INT | Foreign | - | REFERENCES Courses
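Most DBMSs let you query the data dictionary like any other table. As one concrete case, the sketch below reads SQLite's catalog through Python's built-in sqlite3 module; sqlite_master and PRAGMA table_info are SQLite-specific, and other systems expose similar information under different names. The index name idx_students_name is an assumption for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (
    StudentID INTEGER PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Age INT CHECK (Age >= 18)
);
CREATE INDEX idx_students_name ON Students(Name);
""")

# sqlite_master is SQLite's catalog: one row per table, index, view or trigger.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

# PRAGMA table_info lists each column as (cid, name, type, notnull, default, pk).
columns = [
    (name, col_type, bool(notnull), bool(pk))
    for _, name, col_type, notnull, _, pk in conn.execute("PRAGMA table_info(Students)")
]

print(objects)
for column in columns:
    print(column)
```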
🎯 Key Points to Remember:
The data dictionary provides critical information about database schema and
metadata.
It helps ensure data integrity, data access control, and system performance
optimization.
The dictionary plays a significant role in query optimization by providing insight into
database structure and relationships.
Q. Dependencies in Un-Normalized Database
In an un-normalized database, data is typically stored in a form that has not undergone the
process of normalization. This can lead to various types of data anomalies and
redundancies. In such a database, multiple dependencies can exist that can cause issues like
update anomalies, insertion anomalies, and deletion anomalies. To understand these
dependencies, let's break down some common types:
✅ 1. Multivalued Dependency (MVD)
Definition: A multivalued dependency occurs when one attribute determines a set
of values for another attribute (or set of attributes) independently of other
attributes.
Explanation: In an un-normalized table, multiple values for certain attributes might
be stored together in a single row. This leads to redundancy, as the same data might
be repeated for each value in a set.
Example: Suppose we have a table where a student can have multiple phone
numbers and email addresses, stored together:
StudentID | Name | Phone Numbers | Email Addresses
101 | Alice | 12345, 67890 | alice@domain.com, alice123@domain.com
Here, StudentID determines a set of phone numbers and, independently, a set of email
addresses (StudentID →→ Phone Numbers and StudentID →→ Email Addresses). This is
problematic because, once the values are made atomic, every phone number has to be paired
with every email address, so adding or removing a single phone number or email touches
several rows and repeats the student's name and ID.
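The cost of keeping two independent multivalued attributes in one table can be counted with a few lines of plain Python. This is only a sketch: the helper flat_table and the third phone number 55555 are hypothetical additions for the illustration.

```python
from itertools import product

phones = ["12345", "67890"]
emails = ["alice@domain.com", "alice123@domain.com"]

def flat_table(phone_list, email_list):
    """One flat table must pair every phone number with every email address."""
    return [(101, "Alice", p, e) for p, e in product(phone_list, email_list)]

before = flat_table(phones, emails)
after = flat_table(phones + ["55555"], emails)  # hypothetical third phone number

# Flat table: one new row per email address. Separate tables: one new row in total.
rows_added_flat = len(after) - len(before)
rows_added_split = 1  # a single (StudentID, Phone) row in a separate phone table

print(len(before), rows_added_flat, rows_added_split)
```

Keeping phone numbers and email addresses in two separate tables avoids this multiplication of rows.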
✅ 2. Partial Dependency
Definition: A partial dependency occurs when a non-prime attribute (attribute that
is not part of the candidate key) is dependent on only part of a composite primary
key.
Explanation: In an un-normalized database, if a table has a composite key (a key
made up of more than one attribute), non-prime attributes may depend on only part
of that composite key.
Example: Consider a table where a StudentCourse table has a composite primary key
(StudentID, CourseID) and stores the course's instructor and the student's grade:
StudentID CourseID Instructor Grade
101 C101 Dr. Smith A
102 C101 Dr. Smith B
Here, the Instructor depends only on CourseID, which is a part of the composite key, but not
on the full primary key (StudentID, CourseID). This is a partial dependency because
Instructor is not fully dependent on the entire key. This violates the rule of 2NF (Second
Normal Form).
✅ 3. Transitive Dependency
Definition: A transitive dependency occurs when a non-prime attribute depends on
the primary key only indirectly, through another non-prime attribute (A → B and
B → C, so A → C).
Explanation: This type of dependency leads to redundancy because the value of the
dependent attribute is repeated in every row that shares the intermediate attribute.
Example: Suppose we have a table of student information in which StudentID
determines DeptID, and DeptID determines DeptName:
StudentID StudentName DeptID DeptName
101 Alice D1 CS
102 Bob D2 Physics
Here, DeptName depends on DeptID, and DeptID depends on StudentID. The chain
StudentID → DeptID → DeptName is a transitive dependency, which violates 3NF (Third
Normal Form). It can be removed by moving DeptID and DeptName into a separate
Department table.
✅ 4. Functional Dependency
Definition: A functional dependency X → Y holds when the value of one attribute (or
set of attributes) X uniquely determines the value of another attribute Y.
Explanation: Functional dependencies exist in every table, normalized or not, and
they are the basis for normalization. In an un-normalized table, several unrelated
functional dependencies are mixed together, which is what produces partial and
transitive dependencies.
Example: A table may contain StudentID, CourseID, Instructor, and Grade, where
(StudentID, CourseID) → Grade (a grade is determined by the student and the course
together), and CourseID → Instructor (the course determines the instructor).
✅ 5. Join Dependency
Definition: A join dependency holds on a relation when the relation can be rebuilt
exactly, without losing or gaining tuples, by joining a particular set of its projections.
Explanation: In an un-normalized database, one wide table often hides such a
dependency, and the data is stored redundantly until the table is decomposed along
it. Decomposing along any other split is lossy: joining the pieces back produces
spurious tuples.
Example: If a student's information (student ID, course ID, instructor, and grade) is
stored in a single table, it can be split into Course (CourseID, Instructor) and
Enrollment (StudentID, CourseID, Grade), and joining the two on CourseID rebuilds
the original table. A split that does not keep the right common attributes cannot
maintain all relationships when the tables are recombined.
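Whether a decomposition loses information can be checked by projecting the table, joining the pieces back, and comparing with the original. The sketch below does this in plain Python; the helper names project and natural_join are assumptions written for this example, and the rows are the StudentCourse rows used earlier.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each tuple is kept as (attr, value) pairs."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Join two projected relations on the attributes they share."""
    joined = set()
    for left_row in left:
        for right_row in right:
            l_dict, r_dict = dict(left_row), dict(right_row)
            shared = l_dict.keys() & r_dict.keys()
            if all(l_dict[a] == r_dict[a] for a in shared):
                joined.add(frozenset({**l_dict, **r_dict}.items()))
    return joined

table = [
    {"StudentID": 101, "CourseID": "C101", "Instructor": "Dr. Smith", "Grade": "A"},
    {"StudentID": 102, "CourseID": "C101", "Instructor": "Dr. Smith", "Grade": "B"},
]
original = {frozenset(row.items()) for row in table}

# Split along CourseID -> Instructor: the join rebuilds the table exactly.
lossless = natural_join(
    project(table, ["StudentID", "CourseID", "Grade"]),
    project(table, ["CourseID", "Instructor"]),
) == original

# Split into (StudentID, CourseID) and (CourseID, Grade): the join mixes up grades.
lossy_join = natural_join(
    project(table, ["StudentID", "CourseID"]),
    project(table, ["CourseID", "Grade"]),
)
print(lossless, len(lossy_join), len(original))
```

The second split yields more tuples than the original table held; the extra ones are spurious combinations of students and grades.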
✅ 6. Redundancy in Un-Normalized Form
Definition: Redundancy refers to the repetition of data in a database, leading to
inefficiencies, wasted storage, and potential for anomalies.
Explanation: In un-normalized databases, data is often repeated, leading to issues
when updating, inserting, or deleting data. For instance, storing multiple addresses
for a person in a single column can result in redundant data.
Example:
StudentID Name Phone Number
101 Alice 12345, 67890
101 Alice 12345, 67890
The Phone Number is repeated, which is an example of redundancy.
📊 Summary of Dependencies in Un-Normalized Database:
Dependency Type | Description | Example
Multivalued Dependency | An attribute determines a set of values for another attribute | Multiple phone numbers and emails
Partial Dependency | A non-prime attribute depends on part of a composite primary key | Instructor depends only on CourseID
Transitive Dependency | A non-prime attribute depends on another non-prime attribute | StudentID → DeptID → DeptName
Functional Dependency | One attribute determines another attribute | CourseID → Instructor
Join Dependency | Relation can be rebuilt exactly by joining its projections | Splitting tables without losing information
Redundancy | Repetitive data causing inefficiencies | Repeated phone numbers
🔄 Impact of These Dependencies:
Update Anomalies: Redundant data leads to inconsistencies when data is updated.
Insertion Anomalies: Inserting new records might require repeating information
multiple times.
Deletion Anomalies: Deleting a record can result in unintended loss of information.
🚀 Conclusion:
In an un-normalized database, various dependencies such as partial, transitive,
multivalued, and functional dependencies cause redundancy and inefficiencies. Normalizing
the database helps reduce these dependencies, improving the database’s structure,
integrity, and efficiency.