KEMBAR78
Dbms Notes | PDF | Database Index | Databases
0% found this document useful (0 votes)
66 views14 pages

Dbms Notes

The document outlines the purpose and characteristics of database systems, emphasizing efficient data management, integrity, and security. It details various data models, database management systems (DBMS), and architectures, along with concepts such as data independence and entity-relationship modeling. Additionally, it covers relational data models, update anomalies, keys, integrity constraints, and the importance of indexing for optimized data retrieval.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
66 views14 pages

Dbms Notes

The document outlines the purpose and characteristics of database systems, emphasizing efficient data management, integrity, and security. It details various data models, database management systems (DBMS), and architectures, along with concepts such as data independence and entity-relationship modeling. Additionally, it covers relational data models, update anomalies, keys, integrity constraints, and the importance of indexing for optimized data retrieval.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 14

1.

Purpose of a Database System


The purpose of a database system is to:
 Store, retrieve, and manage data efficiently.
 Maintain data integrity and security.
 Support concurrent access by multiple users.
 Minimize data redundancy and inconsistency.
 Provide mechanisms for backup and recovery.
 Enable data sharing among multiple users or systems.

2. Characteristics of the Database Approach


 Self-Describing Nature: The database system contains not only the data but also
metadata (data about data) to define its structure.
 Data Abstraction: Hides the complexity of data representation and storage through
various abstraction levels.
 Support for Multiple Views: Users can access customized views of the data according
to their requirements.
 Data Sharing and Multi-user Transaction Processing: Supports concurrency, ensuring
consistency and isolation for multiple transactions.
 Minimized Data Redundancy: Centralized management reduces duplication and
inconsistencies.

3. Data Models
A data model defines the logical structure of a database, including the relationships
between different data elements. Common types include:
 Hierarchical Model: Data is organized in a tree-like structure.
 Network Model: Data is represented as records connected by links.
 Relational Model: Data is organized into tables (relations) with rows and columns.
 Entity-Relationship Model (ER): Focuses on entities, attributes, and relationships.
 Object-Oriented Model: Data is represented as objects, like in object-oriented
programming.
4. Database Management System (DBMS)
A DBMS is software that:
 Provides an interface for users and applications to interact with a database.
 Handles tasks like data storage, retrieval, security, and concurrency.
 Examples: MySQL, Oracle, Microsoft SQL Server, PostgreSQL.

5. Database System Architecture


 Centralized Architecture: A single central database for all users.
 Client-Server Architecture: The database is hosted on a server, and clients access it
over a network.
 Distributed Architecture: Data is stored across multiple physical locations.

6. Three-Schema Architecture
 External Schema: Represents user views tailored to specific needs.
 Conceptual Schema: Represents the logical structure of the entire database, hiding
physical details.
 Internal Schema: Represents the physical storage structure and access paths.
The goal of this architecture is to achieve data abstraction and data independence.

7. Components of a DBMS
1. Storage Manager: Manages disk space and data structures.
2. Query Processor: Translates and executes database queries.
3. Transaction Manager: Ensures data consistency and handles concurrency.
4. Metadata Manager: Maintains information about the data.
5. Database Engine: Core service for data processing and management.
6. User Interfaces: Includes APIs and tools for interaction.

8. Data Independence
 Logical Data Independence: Ability to change the logical schema without affecting
applications.
 Physical Data Independence: Ability to change the physical storage without altering
the logical schema.

9. File System Approach vs. Database System Approach

Feature File System Approach Database System Approach

High redundancy due to lack of Minimized redundancy with centralized


Data Redundancy
integration. management.

Data Integrity Difficult to maintain. Enforced through constraints and rules.

Strong concurrency control


Concurrency Limited or absent.
mechanisms.

Advanced security features like user


Data Security Limited.
roles.

Data Sharing Limited sharing. Easy and efficient sharing across users.

Backup and
Manual and cumbersome. Automated and efficient.
Recovery

Scalability Difficult to scale. Scalable with distributed databases.

Entity-Relationship (ER) Modeling


ER Modeling is a conceptual approach to data modeling that defines the structure and
relationships of data within a domain. It provides a high-level view of the system's data
requirements and relationships in the form of an Entity-Relationship Diagram (ERD).

1. Conceptual Data Modeling: Motivation


 Purpose:
o To create a blueprint for the database system that captures the essential
structure of the data and its relationships.
o To facilitate communication between stakeholders (business users,
developers, and database designers).
o To ensure the data model reflects business rules and minimizes redundancy.
 Benefits:
o Provides a clear, visual representation of data.
o Helps identify data integrity constraints.
o Serves as a basis for database design.

2. Key Concepts in ER Modeling


a. Entities
 Definition: Entities represent "things" or "objects" in the real world that have distinct
identities.
 Examples: Employee, Student, Car, Product.
b. Entity Types
 Definition: A collection of entities with the same attributes.
 Examples:
o Student is an entity type that includes individual entities like John or Maria.
o Represented as a rectangle in ER diagrams.
c. Attributes
 Definition: Properties or characteristics of an entity or relationship.
 Types of Attributes:
o Simple: Cannot be divided further (e.g., Name, Age).
o Composite: Can be broken down into smaller parts (e.g., Full Name → First
Name + Last Name).
o Derived: Computed from other attributes (e.g., Age from Date of Birth).
o Multivalued: Can have multiple values (e.g., Phone Numbers).
 Represented as ovals in ER diagrams.
d. Relationships
 Definition: Associations between entities.
 Examples:
o Student enrolls in Course.
o Represented as a diamond in ER diagrams.
e. Relationship Types
 Definition: A set of similar relationships.
 Examples:
o "Enrolled-In" is a relationship type between Students and Courses.

3. Constraints on Relationships
a. Cardinality
Specifies the number of instances of one entity related to the other:
 One-to-One (1:1): One entity instance is related to one instance of another entity.
o Example: Each Employee has one Desk.
 One-to-Many (1:N): One entity instance is related to many instances of another
entity.
o Example: A Customer places many Orders.
 Many-to-Many (M:N): Many instances of one entity are related to many instances of
another entity.
o Example: Students enroll in many Courses, and each Course has many
Students.
b. Participation
Specifies whether all entities participate in the relationship:
 Total Participation: Every instance of an entity must participate in at least one
relationship instance.
o Represented by a double line.
 Partial Participation: Some instances of an entity may not participate.
o Represented by a single line.
c. Keys
 Primary Key: A unique attribute (or combination) that identifies an entity instance.
o Example: Student ID for a Student.

4. Entity-Relationship Diagram (ERD) Notation


ER diagrams use specific symbols to represent entities, relationships, and attributes:

Component Symbol Description

Entity Rectangle Represents an entity type.

Weak Entity Double Rectangle Depends on a strong entity for identification.


Component Symbol Description

Represents a property of an entity or


Attribute Oval
relationship.

Key Attribute Oval (underlined) An attribute uniquely identifying an entity.

Derived Attribute Dashed Oval Computed from other attributes.

Relationship Diamond Represents a relationship type between entities.

1, N, M written near
Cardinality Specifies the number of related instances.
lines

Participation Single or double lines Indicates partial or total participation.

Example ERD
Scenario: A Student enrolls in Courses taught by Professors.
 Entities: Student, Course, Professor.
 Attributes:
o Student: Student_ID (PK), Name, Age.
o Course: Course_ID (PK), Title, Credits.
o Professor: Professor_ID (PK), Name, Department.
 Relationships:
o Enrolls: Between Student and Course (Many-to-Many).
o Teaches: Between Professor and Course (One-to-Many).
ERD Notation:
 Student → Rectangle with attributes (Student_ID, Name, Age).
 Course → Rectangle with attributes (Course_ID, Title, Credits).
 Professor → Rectangle with attributes (Professor_ID, Name, Department).
 Enrolls → Diamond connecting Student and Course with M:N cardinality.
 Teaches → Diamond connecting Professor and Course with 1:N cardinality.
Relational Data Model
The Relational Data Model organizes data into relations (tables) consisting of rows (tuples)
and columns (attributes). It provides a formal foundation for structuring, querying, and
manipulating data.
1. Update Anomalies
Anomalies arise when updating data in poorly designed tables, often due to redundancy:
 Insertion Anomaly:
o Occurs when certain data cannot be added to the database without adding
unrelated data.
o Example: You can't add a new course to a student-course table unless a
student is enrolled in that course.
 Deletion Anomaly:
o Occurs when deleting certain data unintentionally removes other related
data.
o Example: Deleting the last student enrolled in a course removes the course
details.
 Update Anomaly:
o Occurs when changes in data require multiple updates, leading to
inconsistencies.
o Example: If a professor's department changes, it must be updated in all rows
where the professor appears.
These anomalies are resolved by normalizing the database.

2. Concept of Relations
 Relation:
o A table with rows and columns where each row is a tuple, and each column is
an attribute.
o Relations must satisfy certain properties:
 No duplicate tuples.
 Attributes have unique names.
 Order of rows and columns does not matter.
 Relation Schema:
o The structure of a relation, including its name, attributes, and their domains.
o Example: Student(Student_ID, Name, Age).
 Relation Instance:
o A specific set of tuples for a given schema at a particular moment.

3. Schema-Instance Distinction
 Schema:
o The overall structure and design of the database (metadata).
o Example: Table definitions, attribute names, and data types.
o Static in nature.
 Instance:
o The actual data stored in the database at a specific time.
o Dynamic and changes frequently as the data is updated.

4. Keys
Keys uniquely identify tuples within a relation.
 Superkey:
o A set of attributes that uniquely identify tuples in a relation.
o Example: {Student_ID}, {Student_ID, Name}.
 Candidate Key:
o A minimal superkey, meaning no subset can also uniquely identify tuples.
o Example: {Student_ID}.
 Primary Key:
o A chosen candidate key used to uniquely identify tuples.
o Example: Student_ID.
 Alternate Key:
o A candidate key not chosen as the primary key.
o Example: {Email} if {Email} is unique.
 Composite Key:
o A primary key made of multiple attributes.
o Example: {Student_ID, Course_ID} in an enrollment table.
5. Relational Integrity Constraints
Constraints ensure data accuracy and consistency.
 Domain Constraints:
o Attribute values must belong to a specific domain.
o Example: Age must be an integer between 0 and 120.
 Entity Integrity:
o Primary key attributes cannot be null.
o Example: Every Student must have a unique Student_ID.
 Referential Integrity:
o Ensures foreign keys reference valid primary key values in another table.
o Example: Course_ID in the Enrollment table must exist in the Courses table.

6. Referential Integrity and Foreign Keys


 Foreign Key:
o An attribute in one relation that references the primary key of another
relation.
o Maintains referential integrity by ensuring the referenced data exists.
Example:
 Courses(Course_ID, Title) → Primary Key: Course_ID.
 Enrollments(Student_ID, Course_ID) → Foreign Key: Course_ID references
Courses(Course_ID).

7. Relational Algebra Operators and Queries


Relational algebra is a procedural query language that provides operations to manipulate
relations.
a. Unary Operators
 Selection (σ):
o Filters rows based on a condition.
o Example: σ(Age > 18)(Student) → Selects students older than 18.
 Projection (π):
o Selects specific columns.
o Example: π(Name, Age)(Student) → Displays only Name and Age.
 Rename (ρ):
o Renames a relation or its attributes.
o Example: ρ(NewStudent, Student) → Renames Student to NewStudent.
b. Binary Operators

 Union (∪):
o Combines tuples from two relations, removing duplicates.

o Example: R1 ∪ R2.
 Set Difference (-):
o Finds tuples in one relation but not in another.
o Example: R1 - R2.
 Intersection (∩):
o Finds tuples common to both relations.
o Example: R1 ∩ R2.
 Cartesian Product (×):
o Combines every tuple of one relation with every tuple of another.
o Example: R1 × R2.
c. Relational Join

 Theta Join (⨝θ):


o Combines tuples based on a condition θ.

o Example: Student ⨝Student.Student_ID = Enrollment.Student_ID Enrollment.

 Natural Join (⨝):


o Automatically joins relations on attributes with the same name.
d. Division
 Finds tuples in one relation that match all tuples in another relation.
 Example: R1 ÷ R2.

8. Example Relational Algebra Query


Retrieve the names of students enrolled in the "Database" course:
1. Select the Course_ID for "Database" from the Courses relation:
2. σ(Title = "Database")(Courses)
3. Join the result with the Enrollments table to find relevant students:

4. Courses ⨝ Courses.Course_ID = Enrollments.Course_ID Enrollments


5. Join with the Student table to retrieve names:

6. (Courses ⨝ Enrollments) ⨝ Student


7. Project the Name attribute:

8. π(Name)((Courses ⨝ Enrollments) ⨝ Student)


Data Storage and Indexes
Efficient storage and retrieval of data are critical in database systems. Indexing and file
organization are key components that improve performance.

1. Need for File Indexes


File indexes act like a "table of contents" for a database, speeding up data retrieval.
 Why Indexes Are Needed:
o Fast Search: Quickly locate records without scanning the entire table.
o Efficient Updates: Reduce the cost of finding and modifying data.
o Sorting: Facilitate ordered retrieval of data.
o Optimization: Improve performance of queries with WHERE, JOIN, or ORDER
BY.
 Example: Searching for a specific student in a table of 1 million rows without an
index would require a full table scan. An index reduces this to a logarithmic search
time.

2. File Organizations
The arrangement of data on storage determines access speed and efficiency.
Types of File Organizations:
 Heap (Unordered Files):
o Records are stored as they arrive.
o Efficient for bulk inserts but slow for search.
 Sequential:
o Records are stored in a sorted order.
o Efficient for range queries but costly for inserts and updates.
 Clustered:
o Related rows are stored physically close together.
o Optimized for queries involving joins or related data.
 Hashed:
o Records are stored based on a hash function.
o Provides fast equality search but poor range query performance.

3. Index Structures
Indexes are auxiliary data structures that speed up searches.
Types of Indexes:
 Primary Index:
o Built on a table's primary key.
o Ensures that each key corresponds to exactly one record.
 Secondary Index:
o Built on non-primary attributes.
o Useful for optimizing queries on frequently searched columns.
 Unique Index:
o Enforces uniqueness on indexed columns.
Index Data Structures:
 Single-Level Indexing:
o A single layer of indexes pointing to records.
o Example: Dense Index (one index entry per record) and Sparse Index (index
entry for each block).
 Multi-Level Indexing:
o Indexes on indexes for scalability with large datasets.
o Example: Two-level or three-level indexing.
4. Concurrent Execution of Transactions
Multiple transactions run simultaneously in a database system to improve performance and
resource utilization.
Challenges:
 Data Consistency:
o Simultaneous transactions can lead to conflicts (e.g., dirty reads, lost
updates).
 Deadlocks:
o Two or more transactions waiting for each other to release locks.
Solutions:
 Locking Protocols:
o Shared and exclusive locks ensure data integrity.
 Timestamp Ordering:
o Ensures transaction execution follows a logical order based on timestamps.

5. ACID Properties
ACID ensures reliability and integrity of transactions in databases.
A: Atomicity:
 A transaction is all-or-nothing; either all its operations succeed, or none are applied.
 Example: Transferring money between accounts must debit and credit both accounts
or none.
C: Consistency:
 A transaction leaves the database in a valid state, adhering to all constraints.
 Example: Ensuring no two rows have the same primary key.
I: Isolation:
 Concurrent transactions do not interfere with each other.
 Example: One transaction updating a record cannot see changes made by another
transaction until committed.
D: Durability:
 Once a transaction is committed, its changes are permanent, even in case of system
failure.
 Achieved through logs and backup mechanisms.

Example Scenario
Use Case: Retrieving student records sorted by age.
1. File Organization:
o Use sequential file organization to store records sorted by age.
2. Indexing:
o Create a secondary index on the Age column for faster queries.
3. Concurrent Transactions:
o If two users query and update the Age column simultaneously, locking
ensures isolation.
4. ACID Compliance:
o Atomicity guarantees both read and update queries complete or fail together.
o Durability ensures updates persist even after a system crash.

Advantages of Indexing
 Faster data retrieval.
 Improved query performance for large datasets.
 Reduced I/O overhead for accessing data.

You might also like