Course Code: CS404PC
Unit -1 Part-1
DATABASE MANAGEMENT
SYSTEMS
By B. Lokesh Joel
Unit-1:
• Database System Applications: ▪ Introduction to Database Design:
• A Historical Perspective, – Database Design and ER Diagrams,
• File Systems versus a DBMS, – Entities, Attributes, and Entity Sets,
• The Data Model, – Relationships and Relationship Sets,
• Levels of Abstraction in a DBMS, – Additional Features of the ER Model,
• Data Independence, – Conceptual Design With the ER Model.
• Structure of a DBMS
Database Management System (DBMS)
Overview
Definition: Purpose:
DBMS is a system for managing ❑ Efficiently store and retrieve information.
interrelated data and a set of programs ❑ Manage large-scale data securely and prevent
to access that data. unauthorized access.
❑ Avoid anomalies when multiple users access
the database simultaneously.
Structured vs. Complex Data
▪ Structured Data:
– Example: University databases (courses, students, registrations).
– Fixed format and standard organization.
▪ Complex Data:
– Example: Social networks (posts, images, videos, links).
– Varying structure and formats.
Applications of Database Systems
▪ Enterprise Information: ▪ Airlines:
– Reservations, schedules, geographically distributed
– Sales: Customer, product, and purchase data. databases.
– Accounting: Payments, balances, and assets.
▪ Telecommunication:
– HR: Employee records, salaries, and payroll.
– Call records, data usage, bills, and network data.
▪ Manufacturing:
▪ Web-Based Services:
– Supply chain and inventory tracking.
– Social Media: User profiles, posts, likes.
▪ Banking and Finance:
▪ Online Retailers: Purchase history, recommendations.
– Banking: Accounts, loans, and transactions.
– Credit Card: Transactions and statements. ▪ Online Ads: Click history for targeted marketing.
– Finance: Market data for trading. ▪ Other Examples:
▪ Education: – Document databases: Articles, patents, research papers.
– Student records, course registrations, grades. – Navigation systems: Routes, transportation data.
Historical Perspective
▪ 1. Early Developments (1960s) ▪ 2. Relational Model Revolution (1970s)
– Integrated Data Store:
• First general-purpose DBMS by Charles – Relational Data Model:
Bachman at General Electric. • Proposed by Edgar Codd at IBM.
• Introduced the Network Data Model.
• Simplified data organization and retrieval.
• Bachman received the ACM Turing Award
(1973) for his contributions. • Sparked the development of relational DBMSs.
– IMS DBMS: • Codd won the 1981 Turing Award for this work.
• Developed by IBM, based on the Hierarchical
Data Model. – SQL (Structured Query Language):
• Still used in large systems today. • Introduced by IBM's System R Project.
– SABRE System: • Became the standard query language for relational
• Created by American Airlines and IBM for databases.
real-time airline reservations.
• Enabled multiple users to access data • Standardized by ANSI and ISO in the late 1980s.
simultaneously.
• Powers modern travel platforms like
Travelocity.
Historical Perspective (cont..)
▪ 3. Advancements in the 1980s and 1990s ▪ 4. DBMS in the Internet Age
– Dominance of Relational Databases: – Web Integration:
• Transition from file-based systems to DBMS for web applications.
• Relational DBMSs became the standard for
corporate data management. • Websites now store data in DBMS, making it accessible via browsers.
• Widespread adoption of SQL. – Internet-Friendly Features:
• Vendors enhanced DBMS capabilities for online usage.
– Enhanced Features:
• Support for new data types like images and text. ▪ 5. Modern Applications
• Development of data warehouses for enterprise – Emerging Areas:
analysis.
• Multimedia Databases: Store and retrieve images, videos, and
audio.
– ERP and MRP Systems:
• Streaming Data: Manage real-time data feeds.
• Software packages like SAP, Oracle, and
PeopleSoft added application-oriented features • Digital Libraries: Centralized storage for books and research
articles.
for:
• Inventory management. – Scientific Projects:
• Human Genome Mapping: Organizing complex genetic data.
• Human resources.
• NASA Earth Observation System: Analyzing planetary data.
• Financial analysis.
– Data Mining:
• Allowed customization for organizations,
reducing costs. • Extracting insights from large data sets for better decision-making.
File Systems vs. DBMS
▪ 1. Motivation for Using a DBMS
– Scenario:
• A company has 500 GB of data about employees, departments, products, sales, etc.
• The data needs to:
• Be accessed concurrently by multiple users.
• Support quick queries.
• Handle consistent updates.
• Restrict access to sensitive information (e.g., salaries).
File Systems vs. DBMS
▪ 2. Drawbacks of File Systems 3. Complex Queries:
– Special programs must be written for every query.
– Managing large volumes of data through
– Searching large volumes of data is slow and inefficient.
operating system files has several challenges:
4. Concurrency Issues:
1. Limited Memory Handling:
– When multiple users make changes simultaneously, data
• Most systems don’t have enough main memory inconsistency can occur.
to hold 500 GB of data.
– Handling concurrent access manually is complex.
• Data must be stored on secondary storage (e.g.,
disk) and fetched as needed, adding 5. Crash Recovery:
complexity. – If the system crashes during data modification, restoring the
data to a consistent state is difficult.
2. Addressing Limits:
6. Inadequate Security:
• Systems with 32-bit addressing can only
reference up to 4 GB of data directly. – Operating systems offer simple password-based security, which
is insufficient for enforcing policies where different users have
• Managing references to large datasets requires varying access levels.
additional programming.
Advantages of a DBMS
▪ Using a Database Management System (DBMS) to ▪ 3. Data Integrity and Security
manage data offers several significant advantages:
– Data Integrity:
▪ 1. Data Independence • The DBMS enforces constraints, e.g., ensuring salaries
– Definition: Applications are not exposed to the details do not exceed department budgets before insertion.
of how data is stored or represented.
– Data Security:
– Benefit: A DBMS provides an abstract view of data,
• Access controls restrict visibility based on user roles.
allowing changes in storage or representation without
impacting application programs. • Example: Managers see salary details, while general
employees cannot.
▪ 2. Efficient Data Access
▪ 4. Data Administration
– Definition: DBMS employs advanced techniques for
efficient storage and retrieval of data. – Centralized Management: Centralized administration minimizes
redundancy and ensures consistency.
– Example: Optimized indexing and query execution are
particularly important for data stored on external – Professional Oversight: Experienced administrators optimize data
devices like disks. representation and storage for efficiency.
Advantages of a DBMS
▪ 5. Concurrent Access and Crash ▪ 6. Reduced Application Development
Recovery Time
– Simplified Development:
– Concurrent Access:
• Common data management functions (e.g., query
• The DBMS schedules data access so users processing, transaction handling) are built into the
perceive it as single-user interaction, DBMS.
preventing conflicts.
– High-Level Interface:
– Crash Recovery: • Features like SQL facilitate quicker development.
• Protects data integrity by restoring – Improved Reliability:
consistent states after failures.
• Applications built on DBMS are robust as the DBMS
handles critical tasks like security and crash recovery.
View of Data
▪ A database system is a collection of interrelated data and
programs for accessing and modifying it.
▪ Purpose: Provide an abstract view of data, hiding storage and
maintenance details.
1. Data Models
2. Data Abstraction
3. Instances and Schemas
The Data Model
▪ Conceptual tools for describing data, relationships,
semantics, and constraints.
▪ Types of Data Models:
1. Relational Model
2. Entity-Relationship (E-R) Model
3. Semi-Structured Data Model
4. Object-Based Data Model
The Data Model
1. Relational Model:
– Represents data in
tables (relations).
– Each table has columns
(attributes) and rows
(records).
– Most widely used model.
– Example: Instructor
table with ID, Name,
Dept Name, and Salary.
The Data Model
2. Entity-Relationship (E-R)
Model:
– Uses entities (real-world
objects) and relationships
among them.
– Commonly used in database
design.
The Data Model
3. Semi-Structured Data Model:
– Allows varying attributes for
data items of the same type.
– Examples: JSON, XML.
4. Object-Based Data Model:
– Combines object-oriented
concepts with relational
databases.
– Supports encapsulation,
methods, and object identity.
Levels of Abstraction in a DBMS
▪ Data Abstraction: Data abstraction
simplifies user interaction with the database
by hiding complexity.
▪ Allows efficient data retrieval using complex
data structures without requiring users to
understand them.
▪ Levels of Data Abstraction
1. Physical Level
2. Logical Level
3. View Level
Levels of Abstraction in a DBMS
1. Physical Level
– Lowest level of abstraction.
– Describes how data is stored using low-level
data structures (e.g., blocks of bytes, indexes).
– Examples:
• Tables stored as sequences of records with
delimiters (e.g., commas or new-line
characters).
• Use of indexes for efficient data retrieval.
– Used by database administrators for
optimization.
Levels of Abstraction in a DBMS
2. Logical Level:
– Middle level of abstraction.
– Describes what data is stored and the relationships between data
entities.
– Example:
• Logical definition of an Instructor record:
– Relationships:
• Dept Name in the Instructor table must match the Department table.
– Provides physical data independence, allowing physical storage
changes without affecting applications.
– Used by programmers and database administrators for schema
design and development.
Levels of Abstraction in a DBMS
3. View Level:
– Highest level of abstraction.
– Provides customized views of the database for
different users.
– Example:
• A clerk in a registrar’s office can view only student records
but cannot access instructor salaries.
– Simplifies interaction and enhances security by
restricting access to sensitive data..
Instances and Schemas
▪ Instance:
– The current collection of data stored in a database at a
specific moment in time.
– Changes over time as data is inserted, updated, or deleted.
▪ Schema:
– The overall design of the database, including the structure
and organization of data.
– Comparable to variable declarations and type definitions in
programming.
Data Independence
▪ Data Independence: The ability of a DBMS to allow application
programs to remain unaffected by changes in:
– The logical structure of the database (Logical Data Independence).
– The physical storage of the database (Physical Data Independence).
▪ Achieved using the three levels of data abstraction:
– External Schema (View Level).
– Conceptual Schema (Logical Level).
– Physical Schema (Physical Level).
Data Independence
▪ Types of Data Independence
1. Logical Data Independence
2. Physical Data Independence
Data Independence
1. Logical Data Independence
– Insulates users from changes in the logical structure of the database.
– Changes to the conceptual schema do not affect application programs.
– Example:
• Original Relation:
Faculty(fid: string, fname: string, sal: real)
• Updated Relations:
Faculty_public(fid: string, fname: string, office: integer)
Faculty_private(fid: string, sal: real)
– Changes:
• Confidential data (salaries) is moved to a new table Faculty_private.
• A new field office is added to Faculty_public.
Data Independence
2. Physical Data Independence
– Insulates users from changes in physical storage details.
– Changes to file structure, data layout, or indexes do not affect applications.
– Examples :
• Changing how data is stored on disk.
• Modifying indexing techniques or storage formats.
• Switch data storage format from CSV files to binary files.
Benefits Data Independence
The database can adapt to changes
Flexibility: in structure or storage without
disrupting applications.
Allows sensitive information (e.g.,
Security: salaries) to be stored separately
while maintaining usability.
Applications do not need
Reduced Maintenance: modification for structural or
storage changes.
Structure of a DBMS
Structure of a DBMS
▪ A DBMS manages queries, evaluates them efficiently, and interacts with the database to
provide results.
▪ Focuses on core functionality like query processing, optimization, concurrency control,
and recovery.
▪ 2. Key Components of a DBMS
1. SQL Interface:
• Accepts SQL commands from users or application programs.
• Example: Commands from users like customers, travel agents, application programmers.
2. Query Evaluation Engine:
• Processes SQL queries in several stages:
1. Query Parsing:
• Converts a user query into a machine-readable format.
2. Query Optimization:
• Uses metadata from the System Catalog to generate an efficient execution plan.
• Execution plan: A tree of relational operators with details on access methods.
3. Plan Execution:
• Executes the optimized plan using relational operators.
Structure of a DBMS
3.Layers of a DBMS
1. Relational Operators Layer:
• Implements operations like selection, projection,
3. Buffer Manager:
and join. • Handles movement of data pages between
disk and main memory.
• Works with lower layers to retrieve data
efficiently. • Ensures that frequently accessed data is
available in memory for efficient processing.
2. File and Access Methods Layer:
• Manages files, which are collections of records or
4.Disk Space Manager:
pages. • Lowest layer responsible for managing disk
• Supports: space.
• Heap files: Unordered collections of pages. • Functions include:
• Indexes: Used for fast data retrieval. • Allocating and deallocating pages.
• Organizes information within pages. • Reading and writing data on disk.
Structure of a DBMS
4. Concurrency and Recovery
– Ensures reliable and consistent database operations in
multi-user environments.
1. Transaction Manager:
• Manages transactions and ensures proper locking
protocols are followed.
• Coordinates the scheduling of transactions.
2. Lock Manager:
• Tracks and grants locks on database objects to ensure no
conflicts arise during concurrent access.
3. Recovery Manager:
• Maintains a log of all changes for crash recovery.
• Restores the database to a consistent state after failures.
Structure of a DBMS
5. Interaction Between Layers
– Higher Layers:
• Focus on query parsing,
optimization, and execution.
– Lower Layers:
• Handle storage, data access, and
memory management.
– Concurrency Components:
• Ensure smooth multi-user operations
and crash recovery.
Structure of a DBMS
▪ Example Workflow
▪ Query Execution:
– A user issues a query
(e.g., SELECT * FROM Students WHERE age > 20).
– The query is parsed, optimized, and an execution plan is generated.
– Relational operators retrieve data via file and access methods.
– The buffer manager fetches necessary pages from disk into memory.
▪ Concurrency Management:
– Multiple users accessing the same table are coordinated using locks.
– Changes are logged for recovery in case of a crash.
Unit-1:
• Database System Applications: ▪ Introduction to Database Design:
• A Historical Perspective, – Database Design and ER Diagrams,
• File Systems versus a DBMS, – Entities, Attributes, and Entity Sets,
• The Data Model, – Relationships and Relationship Sets,
• Levels of Abstraction in a DBMS, – Additional Features of the ER Model,
• Data Independence, – Conceptual Design With the ER Model.
• Structure of a DBMS
Introduction to Database Design: Topics
▪ Database Design and ER Diagrams ▪ Conceptual Design With the ER
Model
– Beyond ER Design
– Entity versus Attribute
▪ Entities, Attributes, and Entity Sets
– Entity versus Relationship
▪ Relationships and Relationship Sets
– Binary versus Ternary
▪ Additional Features of the ER Model Relationships
– Key Constraints – Aggregation versus Ternary
– Participation Constraints Relationships
– Weak Entities
– Class Hierarchies
– Aggregation
Unit-1
Part-2 By B. Lokesh Joel
35