UNIVERSITY INSTITUTE OF
ENGINEERING
Department of Computer Science & Engineering
Databases Management System
(23CSH-205/23ITH-205)
DR. GITANJALI
E16525
ASSISTANT PROFESSOR DISCOVER . LEARN . EMPOWER
CSE
1
ABOUT COURSE
• The main DBMS course, we learn about the structural formation of data, maintain data
integrity, multitasking with concurrent access and recovery without occurring crashes,
data structures, data models etc. and their working in which every organization is based.
• At the end of the course, the students may understand the concepts of data structures,
data models and design, construction of queries by using SQL, uses and applications of
database design etc
COURSE OUTCOMES
CO No. Title BT Level
Will be covered in this
lecture
Students will be able to understand the database system
CO1
concepts and design databases for different applications.
Memorize
Students will be able to identify different types of DDL, DML,
CO2 DCL and TCL commands and their usage. Understanding
Students will be able to classifying Normalization,
CO3 Dependencies and Denormalization along with their Apply
requirements.
Students will be able to analyse and Compare Transaction
CO4
Processing techniques and Recovery techniques.
Analyze
Students will be able to evaluate database performance after
CO5 implementing Triggers, procedures, packages, cursors and Evaluate
views.
COURSE OBJECTIVES
This course will enable students to,
1. To have good understanding of database system concepts and design databases for
different applications.
2. To learn how to use a DBMS and RDBMS.
3. To implement and understand different types of DDL, DML and DCL statements.
4. To understand transaction concepts related to databases and recovery/backup
techniques required for the proper storage of data.
Topics covered
Database Recovery of database:
Introduction, Need for Recovery, Types of errors, Recovery Techniques.
Introduction to Database Recovery
Database recovery is the process of restoring a database to a correct state in the event
of a failure. Failures can occur due to various reasons, such as hardware crashes,
software bugs, power outages, or human errors. Recovery ensures the consistency,
integrity, and durability of the database, as defined by the ACID properties.
Why Recovery is Required in Database?
System failures: The DBMS can experience various types of failures, such as hardware
failures, software bugs, or power outages, which can lead to data corruption or loss.
Recovery mechanisms can help restore the database to a consistent state after such failures.
Transaction failures: Transactions can fail due to various reasons, such as network failures,
deadlock, or errors in application logic. Recovery mechanisms can help roll back or undo
the effects of such failed transactions to ensure data consistency.
Human errors: Human errors such as accidental deletion, updating or overwriting data, or
incorrect data entry can cause data inconsistencies. Recovery mechanisms can help recover
the lost or corrupted data and restore it to the correct state.
Security breaches: Security breaches such as hacking or unauthorized access can
compromise the integrity of data. Recovery mechanisms can help restore the database to a
consistent state and prevent further data breaches.
Database Recovery
• Hardware upgrades: When a DBMS is upgraded to a new hardware system, the
migration process can potentially lead to data loss or corruption. Recovery mechanisms
can help ensure that the data is successfully migrated and the integrity of the database is
maintained.
• Natural disasters: Natural disasters such as earthquakes, floods, or fires can damage the
hardware on which the database is stored, leading to data loss. Recovery mechanisms can
help restore the data from backups and minimize the impact of the disaster.
• Compliance regulations: Many industries have regulations that require businesses to
retain data for a certain period of time. Recovery mechanisms can help ensure that the
data is available for compliance purposes even if it was deleted or lost accidentally.
• Data corruption: Data corruption can occur due to various reasons such as hardware
failure, software bugs, or viruses. Recovery mechanisms can help restore the database to a
consistent state and recover any lost or corrupted data.
Need for Recovery
The need for database recovery arises in the following
scenarios:
1. System Crashes: When the system fails unexpectedly
due to hardware or software issues.
2. Media Failures: Physical damage to storage devices
(e.g., disk crashes).
3. Transaction Failures: Improper termination of
transactions due to logical errors or deadlocks.
4. Natural Disasters: Events such as floods, fires, or
earthquakes that damage the database infrastructure.
5. Human Errors: Accidental deletion or modification of
critical data.
Types of Errors
1. Transaction Errors: Occur due to incorrect logic, violation of
constraints, or system aborts.
2. System Errors: Result from issues in the database system, such as
deadlocks or crashes.
3. Media Errors: Caused by hardware malfunctions, such as disk
failures.
4. Natural Disasters: Unpredictable events that affect the physical
infrastructure.
Types of Recovery Techniques
Database recovery techniques are used in database management systems (DBMS) to
restore a database to a consistent state after a failure or error has occurred. The main
goal of recovery techniques is to ensure data integrity and consistency and prevent
data loss.
There are mainly two types of recovery techniques used in DBMS
• Rollback/Undo Recovery Technique
• Commit/Redo Recovery Technique
• Checkpoint Recovery Technique
Recovery
Rollback/Undo Recovery Technique
The rollback/undo recovery technique is based on the principle of backing out or
undoing the effects of a transaction that has not been completed successfully due
to a system failure or error. This technique is accomplished by undoing the
changes made by the transaction using the log records stored in the transaction
log. The transaction log contains a record of all the transactions that have been
performed on the database. The system uses the log records to undo the changes
made by the failed transaction and restore the database to its previous state.
Database Recovery
Commit/Redo Recovery Technique
The commit/redo recovery technique is based on the principle of reapplying the
changes made by a transaction that has been completed successfully to the database.
This technique is accomplished by using the log records stored in the transaction log
to redo the changes made by the transaction that was in progress at the time of the
failure or error. The system uses the log records to reapply the changes made by the
transaction and restore the database to its most recent consistent state.
Commit
Checkpoint Recovery Technique
Checkpoint Recovery is a technique used to improve data integrity and system
stability, especially in databases and distributed systems. It entails preserving the
system’s state at regular intervals, known as checkpoints, at which all ongoing
transactions are either completed or not initiated. This saved state, which includes
memory and CPU registers, is kept in stable, non-volatile storage so that it can
withstand system crashes. In the event of a breakdown, the system can be restored to
the most recent checkpoint, which reduces data loss and downtime. The frequency of
checkpoint formation is carefully regulated to decrease system overhead while
ensuring that recent data may be restored quickly.
Why do We Need Checkpoints?
Whenever transaction logs are created in a real-time environment, it eats up lots
of storage space. Also keeping track of every update and its maintenance may
increase the physical space of the system. Eventually, the transaction log file may
not be handled as the size keeps growing. This can be addressed with
checkpoints. The methodology utilized for removing all previous transaction logs
and storing them in permanent storage is called a Checkpoint.
Steps to Use Checkpoints in the Database
1. Write the begin_checkpoint record into a log.
2. Collect checkpoint data in stable storage.
3. Write the end_checkpoint record into a log.
The behavior when the system crashes and
recovers when concurrent transactions are
executed is shown below:
Transactions and operations of the above diagram:
The recovery system reads the logs backward from
the end to the last checkpoint i.e. from T4 to T1.
It will keep track of two lists – Undo and Redo.
Whenever there is a log with instructions <Tn,
start>and <Tn, commit> or only <Tn, commit> then
it will put that transaction in Redo List. T2 and T3
contain <Tn, Start> and <Tn, Commit> whereas T1
will have only <Tn, Commit>. Here, T1, T2, and T3
are in the redo list.
Whenever a log record with no instruction of
commit or abort is found, that transaction is put to
Undo List <Here, T4 has <Tn, Start> but no <Tn, co
Types of Checkpoints
There are basically two main types of Checkpoints:
1. Automatic Checkpoint
2. Manual Checkpoint
1. Automatic Checkpoint: These checkpoints occur very frequently like every hour
or every day. These intervals are set by the database administrator. They are generally
used by heavy databases as they are frequently updated, and we can recover the data
easily in case of failure.
2. Manual Checkpoint: These are the checkpoints that are manually set by the
database administrator. Manual checkpoints are generally used for smaller databases.
They are updated very less frequently only when they are set by the database
administrator.
Advantages of Checkpoints
• Checkpoints help us in recovering the transaction of the database in case of a random
shutdown of the database.
• It enhancing the consistency of the database in case when multiple transactions are executing
in the database simultaneously.
• It increasing the data recovery process.
• Checkpoints work as a synchronization point between the database and the transaction log file
in the database.
• Checkpoint records in the log file are used to prevent unnecessary redo operations.
• Since dirty pages are flushed out continuously in the background, it has a very low overhead
and can be done frequently.
• Checkpoints provide the baseline information needed for the restoration of the lost state in the
event of a system failure.
• A database checkpoint keeps track of change information and enables incremental database
backup.
Disadvantages of Checkpoints
1. Database storage checkpoints can only be used to restore from
logical errors (E.g. a human error).
2. Because all the data blocks are on the same physical device,
database storage checkpoints cannot be used to restore files due to a
media failure.
Shadow Paging
Shadow paging is a database recovery technique that ensures atomicity and
durability without the need for a log. It works by maintaining two versions of the
database: the current page table and the shadow page table.
How it Works:
Shadow Page Table: A copy of the database page table is created before any updates
are made. This shadow page table remains unchanged during the transaction.
Current Page Table: Updates are made to the current page table, pointing to the
updated data pages. The original data pages remain intact until the transaction
commits.
Commit Process: Once the transaction commits, the shadow page table is replaced
with the current page table.
Failure Handling: If a failure occurs, the shadow page table is used to restore the
database to its previous state.
Advantages:
• No logging is required.
• Provides fast recovery by simply reverting
to the shadow page table.
Disadvantages:
• High overhead due to duplication of
pages.
• Limited scalability for large databases.
Deferred Update
Deferred update is a recovery technique where changes made by a transaction are not
immediately applied to the database. Instead, they are stored in a log or temporary buffer and
applied only after the transaction commits.
How it Works:
1. Changes are recorded in a log but not written to the database immediately.
2. When the transaction commits, all changes are applied to the database in one go.
3. If a failure occurs before the transaction commits, no changes are applied.
Advantages:
• Simplifies recovery, as uncommitted changes are not applied.
• Ensures data consistency, as only committed transactions modify the database.
Disadvantages:
• Increased delay in applying changes.
• Higher memory usage to store uncommitted changes.
Immediate Update
Immediate update is a recovery technique where changes made by a transaction are applied to
the database as soon as they occur, even before the transaction commits.
How it Works:
1. Changes are written to the database immediately after they are performed.
2. Logs are maintained for both undo and redo operations.
⚬ Undo Log: Reverts changes if the transaction aborts.
⚬ Redo Log: Reapplies changes if the system crashes after a transaction commits.
Advantages:
• Faster processing as changes are immediately visible.
• Reduces delay in making data available.
Disadvantages:
• Complex recovery process, as both undo and redo operations may be required.
• Risk of data inconsistency if a failure occurs during a transaction.
ARIES (Algorithm for Recovery and
Isolation Exploiting Semantics)
ARIES is a widely used recovery algorithm that ensures database consistency using a combination of Write-
Ahead Logging (WAL), checkpoints, and log analysis.
Key Features:
1. Write-Ahead Logging (WAL):
⚬ Logs are written before any changes are made to the database.
⚬ Ensures that no changes are permanent until they are logged.
2. Three Phases of Recovery:
⚬ Analysis Phase: Scans the logs to identify dirty pages (modified but not written to disk) and active
transactions.
⚬ Redo Phase: Reapplies changes of committed transactions to ensure durability.
⚬ Undo Phase: Rolls back changes of uncommitted transactions to ensure atomicity.
3. Checkpoints:
⚬ Periodic checkpoints are created to minimize recovery time.
⚬ Reduces the number of logs that need to be analyzed during recovery.
4. Transaction Isolation:
⚬ ARIES uses locking mechanisms to ensure transaction
isolation during recovery.
Advantages:
• Efficient and robust for large-scale systems.
• Handles both committed and uncommitted transactions.
• Reduces recovery time with checkpoints.
Disadvantages:
ARIES
• Complex to implement and manage.
• High overhead for maintaining logs and checkpoints.
Comparison of Techniques
Backup Techniques
• Full database Backup: In this full database including data and database, Meta
information needed to restore the whole database, including full-text catalogs are
backed up in a predefined time series.
• Differential Backup: It stores only the data changes that have occurred since the
last full database backup. When some data has changed many times since the last
full database backup, a differential backup stores the most recent version of the
changed data. For this first, we need to restore a full database backup.
• Transaction Log Backup: In this, all events that have occurred in the database,
like a record of every single statement executed is backed up. It is the backup of
transaction log entries and contains all transactions that had happened to the
database. Through this, the database can be recovered to a specific point in time.
It is even possible to perform a backup from a transaction log if the data files are
destroyed and not even a single committed transaction is lost.
THANK YOU