DBMS Transaction Essentials

Transaction Management

A transaction is a set of operations that together perform a single logical unit of work; it bundles all the instructions of one logical operation. A transaction usually means that the data in the database has changed. One of the major uses of a DBMS is to protect the user’s data from system failures, which it does by ensuring that all data is restored to a consistent state when the computer restarts after a crash. A transaction is any one execution of a user program in a DBMS, and one of its important properties is that it contains a finite number of steps. Executing the same program multiple times generates multiple transactions.

Example: Consider the following sequence of operations performed to withdraw cash
from an ATM.

Steps for ATM Transaction

1. Transaction Start.

2. Insert your ATM card.

3. Select a language for your transaction.

4. Select the Savings Account option.

5. Enter the amount you want to withdraw.

6. Enter your secret PIN.

7. Wait for some time for processing.

8. Collect your Cash.

9. Transaction Completed.

A transaction can include the following basic database access operations:

 Read/Access data (R): Reading a database item from disk (where the database stores data)
into a memory variable.

 Write/Change data (W): Writing the data item from the memory variable back to disk.

 Commit: Commit is a transaction control command that permanently saves the changes
made in a transaction.

Example: Transfer ₹50 from Account A to Account B. Initially A = ₹500 and B = ₹800. This data is
brought into RAM from the hard disk.

R(A) -- 500   // A is accessed from RAM.
A = A - 50    // Deduct ₹50 from A.
W(A) -- 450   // Updated value of A written in RAM.
R(B) -- 800   // B is accessed from RAM.
B = B + 50    // Add ₹50 to B.
W(B) -- 850   // Updated value of B written in RAM.
commit        // The data in RAM is written back to the hard disk.

Stages of Transaction

Note: The updated value of Account A is ₹450 and of Account B is ₹850.

All instructions executed before the commit leave the transaction in a partially committed state, with
the updates held in RAM. When the commit is processed, the data is fully accepted and stored on the hard disk.
If the transaction fails anywhere before committing, we have to go back and start from the beginning;
we cannot continue from the same state. This is known as rollback.
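
As a minimal sketch of this example, assuming Python's standard sqlite3 module and an illustrative accounts table (names invented here, not part of the original example), commit and rollback look like this:

import sqlite3

# A throwaway in-memory database holding the two accounts from the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 500), ('B', 800)")
conn.commit()

try:
    # R(A), A = A - 50, W(A)
    conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'A'")
    # R(B), B = B + 50, W(B)
    conn.execute("UPDATE accounts SET balance = balance + 50 WHERE name = 'B'")
    conn.commit()        # both updates become permanent together
except Exception:
    conn.rollback()      # any failure before commit undoes all the changes

print(conn.execute("SELECT * FROM accounts ORDER BY name").fetchall())
# [('A', 450), ('B', 850)]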

Desirable Properties of Transaction (ACID Properties)


Transaction management in a Database Management System (DBMS) ensures that database transactions
are executed reliably and follow ACID properties: Atomicity, Consistency, Isolation, and Durability.
These principles help maintain data integrity, even during failures or concurrent user interactions,
ensuring that all transactions are either fully completed or rolled back if errors occur.

For a transaction to be performed in a DBMS, it must possess several properties, often called the ACID
properties.

 A – Atomicity

 C – Consistency

 I – Isolation

 D – Durability
Transaction States

Transactions can be implemented using SQL queries and servers. During its life, a transaction
moves through a series of states: active, partially committed, committed, failed, and aborted.

[Figure: Transaction state diagram]

These four ACID properties are used to maintain consistency in a database, before and after the
transaction. Each property is described below.

Atomicity

 States that all operations of the transaction take place as a single unit; if not, the transaction is
aborted.

 There is no midway, i.e., the transaction cannot occur partially. Each transaction is treated as one
unit and either runs to completion or is not executed at all.

 Atomicity involves the following two operations:

 Abort: If a transaction aborts, then none of the changes it made are visible.

 Commit: If a transaction commits, then all the changes it made are visible.
Consistency

 The integrity constraints are maintained so that the database is consistent before and after the
transaction.

 The execution of a transaction will leave the database in either its prior stable state or a new stable
state.

 The consistency property states that every transaction sees a consistent database
instance.

 A transaction transforms the database from one consistent state to another
consistent state.

Isolation

 Data being used during the execution of one transaction cannot be used by a second
transaction until the first one is completed.

 In isolation, if transaction T1 is being executed and is using the data item X, then that data item
cannot be accessed by any other transaction T2 until transaction T1 ends.

 The concurrency control subsystem of the DBMS enforces the isolation property.

Durability

 The durability property guarantees that once a transaction completes, its changes are
permanent.
 These changes cannot be lost by the erroneous operation of a faulty transaction or by a system
failure. When a transaction is completed, the database reaches a state known as the consistent
state, and that consistent state cannot be lost, even in the event of a system failure.

 The recovery subsystem of the DBMS is responsible for enforcing the durability property.

What is Serializability in DBMS?


In computer science, serializability is a property governing how distinct processes interact with shared
data: a system is serializable if the outcome it produces is equivalent to some serial execution of the
operations it performs. In a Database Management System (DBMS), when data is either written or read,
the DBMS can prevent other processes from accessing that data.

A DBMS utilizes various locking mechanisms to permit concurrent processes while upholding data
integrity. The most stringent level of serializability is commonly obtained through two-phase locking,
known as 2PL. In the initial phase of this locking scheme, data objects are locked before operations
execute; once the transaction is completed, the locks on the data objects are released. This process
ensures conflict-free execution, presenting a unified database view across all transactions.

Two-phase locking furnishes a robust assurance against conflicts in the database, at the cost of the
overhead of acquiring and releasing locks. It guarantees that the result aligns with some serial
execution, which is precisely what serializability demands.

What is a Serializable schedule?


In a Database Management System (DBMS), a serializable schedule is a sequence of database actions
(read and write operations) that does not violate the serializability property. This property ensures that
each transaction appears to execute atomically and is isolated from the effects of other transactions. A
serializable schedule must be equivalent to some serial schedule of the same transactions; in other
words, the schedule must produce the same results as if the transactions were executed one at a time
in some order.

Several algorithms can be used to check whether a schedule is serializable. One such algorithm is the
conflict serializability algorithm, which checks for potential conflicts between transactions. A conflict
occurs when two transactions try to access the same data item in conflicting ways (e.g., one transaction
tries to read a data item while another transaction writes to it). If there are no conflicts, the schedule is
guaranteed to be serializable. However, if there are conflicts, the schedule may or may not be serializable.

Another algorithm that can check for serializability is the view serializability algorithm, which checks
for potential mutual dependencies between transactions. A mutual dependency exists when two
transactions depend on each other for their respective outputs to be correct. The schedule is
guaranteed to be serializable if there are no mutual dependencies. However, if there are mutual
dependencies, the schedule may or may not be serializable.

Different types of Serializability


The following are the different types of Serializability:

1) Conflict Serializability

Conflict serializability is a restricted form of serializability that focuses on the relative order of
conflicting operations on the same data items: a schedule is conflict serializable if its operations can be
reordered into some serial schedule without swapping any conflicting pair.

For instance, consider an order table and a customer table. Every order is linked to a specific customer,
even though a single customer may place multiple orders, so concurrent transactions touching the same
rows must have their conflicting operations ordered consistently. Two operations conflict only when all
of the following conditions hold:

1) The two operations belong to different transactions.

2) Both operations access the identical data item.

3) At least one of the two operations is a write operation.
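
A minimal sketch of a conflict serializability check in Python, assuming a schedule encoded as (transaction, action, item) tuples (an encoding chosen purely for illustration): build the precedence graph from conflicting pairs and test it for cycles.

from collections import defaultdict

def is_conflict_serializable(schedule):
    # Build the precedence graph: an edge Ti -> Tj for every pair of
    # conflicting operations where Ti's operation comes first.
    edges = defaultdict(set)
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges[ti].add(tj)
    # The schedule is conflict serializable iff the graph is acyclic.
    visited, in_stack = set(), set()
    def has_cycle(node):
        visited.add(node)
        in_stack.add(node)
        for nxt in edges[node]:
            if nxt in in_stack or (nxt not in visited and has_cycle(nxt)):
                return True
        in_stack.discard(node)
        return False
    nodes = {t for t, _, _ in schedule}
    return not any(has_cycle(t) for t in nodes if t not in visited)

# T1 reads A, T2 writes A, then T1 writes A: edges T1->T2 and T2->T1 (a cycle).
s = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
print(is_conflict_serializable(s))   # False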

2) View Serializability

View serializability takes a more abstract perspective, considering the results visible to users rather than
the specific operations performed. It emphasizes the order in which transactions' results are presented,
providing a higher-level understanding of consistency.

A schedule is view serializable if every transaction reads the same values, and the database reaches the
same final state, as under some correct serial execution of the same transactions. Unlike conflict
serializability, which constrains the ordering of individual conflicting operations, view serializability only
requires that the observed results be equivalent to those of a serial schedule, so it admits some
schedules that conflict serializability rejects.

Concurrency Control in DBMS


 Executing a single transaction at a time increases the waiting time of the other transactions,
which may delay overall execution. Hence, to increase the overall throughput
and efficiency of the system, several transactions are executed concurrently.

 Concurrency control is a very important concept of DBMS which ensures the simultaneous
execution or manipulation of data by several processes or users without resulting in data
inconsistency.

 Concurrency control provides a procedure that is able to control concurrent execution of the
operations in the database.

 The fundamental goal of database concurrency control is to ensure that concurrent execution of
transactions does not result in a loss of database consistency. The concept of serializability can
be used to achieve this goal, since all serializable schedules preserve consistency of the
database. However, not all schedules that preserve consistency of the database are serializable.

 In general it is not possible to perform an automatic analysis of the low-level operations
performed by transactions and check their effect on database consistency constraints. However, there are
simpler techniques. One is to use the database consistency constraints as the basis for splitting
the database into subdatabases on which concurrency can be managed separately.

 Another is to treat some operations besides read and write as fundamental low-level operations
and to extend concurrency control to deal with them.
Concurrency Control Problems
There are several problems that arise when numerous transactions execute simultaneously in a
random manner. A database transaction consists of two major operations, “Read” and “Write”. It is very
important to manage these operations in the concurrent execution of transactions in order to
maintain the consistency of the data.

Dirty Read Problem(Write-Read conflict)

The dirty read problem occurs when one transaction updates an item but then fails due to some
unexpected event, and before that transaction performs its rollback, some other transaction reads the
updated value. This creates an inconsistency in the database. The dirty read problem falls under the
scenario of a Write-Read conflict between transactions in the database.

The dirty read problem can be illustrated with the below scenario between two transactions T1
and T2:

1. Transaction T1 modifies a database record without committing the changes.

2. T2 reads the uncommitted data changed by T1.

3. T1 performs a rollback.

4. T2 has already read the uncommitted data of T1, which is no longer valid, thus creating
inconsistency in the database.
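
This timeline can be simulated in miniature with plain Python dictionaries standing in for the database, with no isolation enforced (everything here is illustrative):

db = {"X": 100}          # last committed state
buffer = dict(db)        # uncommitted working copy that readers can see

buffer["X"] = 200        # step 1: T1 modifies X without committing
t2_read = buffer["X"]    # step 2: T2 reads the uncommitted value (a dirty read)
buffer = dict(db)        # step 3: T1 fails and rolls back

# Step 4: T2 is holding a value that was never committed.
print(t2_read, db["X"])  # 200 100 -> T2's view is inconsistent with the database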

Lost Update Problem

The lost update problem occurs when two or more transactions modify the same data, resulting in one
update being overwritten or lost by another transaction. The lost update problem can be illustrated with
the below scenario between two transactions T1 and T2:

1. T1 reads the value of an item from the database.

2. T2 starts and reads the same database item.

3. T1 updates the value of that data and performs a commit.

4. T2 updates the same data item based on its initial read and performs commit.

5. This results in T1’s modification being lost, overwritten by T2’s write, which causes a lost update
problem in the database.
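
The same interleaving as a minimal in-memory sketch (the item and amounts are invented for illustration):

db = {"X": 100}

t1_x = db["X"]        # step 1: T1 reads X
t2_x = db["X"]        # step 2: T2 reads the same item
db["X"] = t1_x + 50   # step 3: T1 writes X = 150 and commits
db["X"] = t2_x + 20   # step 4: T2 writes from its stale read and commits

print(db["X"])        # 120 -- T1's +50 is lost; a serial order would give 170
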
Concurrency Control Protocols
Concurrency control protocols are the sets of rules that are maintained in order to solve the
concurrency control problems in the database. They ensure that concurrent transactions can execute
properly while maintaining database consistency. The concurrent execution of transactions is
provided with atomicity, consistency, isolation, durability, and serializability via the concurrency control
protocols:

 Lock-based concurrency control protocol

 Timestamp-based concurrency control protocol

Lock-based Protocol

In a lock-based protocol, each transaction needs to acquire locks before it starts accessing or
modifying data items. There are two types of locks used in databases:

 Shared Lock: A shared lock, also known as a read lock, allows multiple transactions to read
the data simultaneously. A transaction holding a shared lock can only read the data
item; it cannot modify the data item.

 Exclusive Lock: An exclusive lock, also known as a write lock, allows a transaction
to update a data item. Only one transaction can hold the exclusive lock on a data item at a time.
While a transaction is holding an exclusive lock on a data item, no other transaction is allowed to
acquire a shared or exclusive lock on the same data item.

There are two kinds of lock-based protocols mostly used in databases:

 Two Phase Locking Protocol: Two-phase locking is a widely used technique which ensures a strict
ordering of lock acquisition and release. The protocol works in two phases (a sketch follows this list).

o Growing Phase: In this phase, the transaction acquires the locks it needs before performing
any modification on the data items. No lock is released during this phase.

o Shrinking Phase: In this phase, the transaction releases the locks it acquired once it has
performed all its modifications on the data items. Once the transaction releases
its first lock, it cannot acquire any further locks.

 Strict Two Phase Locking Protocol: It is similar to the two-phase locking protocol; the
only difference is that under two-phase locking a transaction may release its locks before it
commits, whereas under strict two-phase locking, transactions are allowed to release their
locks only when they commit.
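
A toy sketch of shared/exclusive locks with two-phase usage, assuming Python threads; the class and method names are invented for illustration, and deadlock handling is omitted:

import threading

class LockManager:
    # A toy shared/exclusive lock table for one process; real DBMSs also handle
    # deadlock detection, lock upgrades, and fairness, which are omitted here.
    def __init__(self):
        self.cond = threading.Condition()
        self.readers = {}   # item -> number of shared locks held
        self.writer = {}    # item -> id of the transaction holding the X lock

    def lock_shared(self, txn, item):
        with self.cond:
            while item in self.writer:          # wait while an X lock is held
                self.cond.wait()
            self.readers[item] = self.readers.get(item, 0) + 1

    def lock_exclusive(self, txn, item):
        with self.cond:
            while item in self.writer or self.readers.get(item, 0) > 0:
                self.cond.wait()
            self.writer[item] = txn

    def unlock(self, txn, item):
        with self.cond:
            if self.writer.get(item) == txn:
                del self.writer[item]
            elif self.readers.get(item, 0) > 0:
                self.readers[item] -= 1
            self.cond.notify_all()

# Two-phase usage: all acquisitions (growing phase) precede all releases
# (shrinking phase); no lock is requested after the first unlock.
lm = LockManager()
lm.lock_exclusive("T1", "A")
lm.lock_exclusive("T1", "B")
# ... read and modify A and B ...
lm.unlock("T1", "A")   # shrinking phase begins here
lm.unlock("T1", "B")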
Timestamp-based Protocol

 In this protocol, each transaction has a timestamp attached to it. The timestamp is simply the
time at which the transaction enters the system.

 Conflicting pairs of operations are resolved by the timestamp ordering protocol using the
timestamp values of the transactions, thereby guaranteeing that the transactions take place in
the correct order.
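
A minimal sketch of the basic timestamp-ordering rules, assuming each item tracks the largest timestamps that have read and written it (a simplified model, not a complete protocol):

read_ts, write_ts = {}, {}   # per-item read and write timestamps

def read(txn_ts, item):
    # Reject the read if a younger transaction has already written the item.
    if txn_ts < write_ts.get(item, 0):
        raise RuntimeError(f"T{txn_ts} must abort and restart: read too late")
    read_ts[item] = max(read_ts.get(item, 0), txn_ts)

def write(txn_ts, item):
    # Reject the write if a younger transaction has already read or written it.
    if txn_ts < read_ts.get(item, 0) or txn_ts < write_ts.get(item, 0):
        raise RuntimeError(f"T{txn_ts} must abort and restart: write too late")
    write_ts[item] = txn_ts

read(2, "A")     # ok: read_ts["A"] = 2
write(1, "A")    # raises: an older transaction writes after a younger one read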

Database Recovery Techniques in DBMS


Database systems, like any other computer systems, are subject to failures, but the data stored in them
must be available as and when required. When a database fails, it must possess facilities for fast
recovery. It must also ensure atomicity: either transactions complete successfully and are committed
(their effects recorded permanently in the database), or the transaction has no effect on the
database at all.

Types of Recovery Techniques in DBMS

Database recovery techniques are used in database management systems (DBMS) to restore a database
to a consistent state after a failure or error has occurred. The main goal of recovery techniques is to
ensure data integrity and consistency and prevent data loss.

The following are the main types of recovery techniques used in DBMS:

 Rollback/Undo Recovery Technique

 Commit/Redo Recovery Technique

 CheckPoint Recovery Technique

Rollback/Undo Recovery Technique

The rollback/undo recovery technique is based on the principle of backing out or undoing the effects of a
transaction that has not been completed successfully due to a system failure or error. This technique is
accomplished by undoing the changes made by the transaction using the log records stored in the
transaction log. The transaction log contains a record of all the transactions that have been performed
on the database. The system uses the log records to undo the changes made by the failed transaction
and restore the database to its previous state.

Commit/Redo Recovery Technique

The commit/redo recovery technique is based on the principle of reapplying the changes made by a
transaction that has been completed successfully to the database. This technique is accomplished by
using the log records stored in the transaction log to redo the changes made by the transaction that was
in progress at the time of the failure or error. The system uses the log records to reapply the changes
made by the transaction and restore the database to its most recent consistent state.

Checkpoint Recovery Technique

Checkpoint recovery is a technique used to improve data integrity and system stability, especially in
databases and distributed systems. It entails preserving the system’s state at regular intervals, known as
checkpoints, at which all ongoing transactions are either completed or not yet initiated. This saved state,
which includes memory and CPU registers, is kept in stable, non-volatile storage so that it can survive
system crashes. In the event of a breakdown, the system can be restored to the most recent checkpoint,
which reduces data loss and downtime. The frequency of checkpoint creation is carefully regulated to
reduce system overhead while ensuring that recent data can be restored quickly.

Log based Recovery in DBMS


The atomicity property of a DBMS states that either all the operations of a transaction must be performed
or none of them. The modifications made by an aborted transaction should not be visible in the database,
while the modifications made by a committed transaction should be visible. To achieve this goal of
atomicity, the system must first output to stable storage information describing the modifications, without
modifying the database itself. This information helps ensure that all modifications performed by committed
transactions are reflected in the database, and that no modifications made by an aborted transaction
persist in the database.


Log and log records
The log is a sequence of log records, recording all the update activities in the database. Logs for each
transaction are maintained in stable storage. Any operation performed on the database is recorded in
the log, and prior to performing any modification to the database, an update log record is
created to reflect that modification. An update log record, represented as <Ti, Xj, V1, V2>, has these
fields:

1. Transaction identifier: Unique Identifier of the transaction that performed the write operation.

2. Data item: Unique identifier of the data item written.

3. Old value: Value of data item prior to write.

4. New value: Value of data item after write operation.

Other types of log records are:

1. <Ti start> : It contains information about when a transaction Ti starts.

2. <Ti commit> : It contains information about when a transaction Ti commits.

3. <Ti abort> : It contains information about when a transaction Ti aborts.
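
As a sketch, these records can be modeled as tuples in an append-only list (the tuple encoding is an assumption made for illustration, not a real DBMS format):

log = []                 # the append-only log, oldest record first
db = {"A": 500}

log.append(("start", "T1"))                  # <T1 start>
old, new = db["A"], db["A"] - 50
log.append(("update", "T1", "A", old, new))  # <T1, A, V1, V2>
db["A"] = new            # the log record is written *before* the data changes
log.append(("commit", "T1"))                 # <T1 commit>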

Undo and Redo Operations

Because all database modifications must be preceded by the creation of a log record, the system has
available both the old value prior to the modification of the data item and the new value that is to be
written for the data item. This allows the system to perform undo and redo operations as appropriate
(both are sketched after this list):

1. Undo: using a log record, sets the data item specified in the log record to its old value.

2. Redo: using a log record, sets the data item specified in the log record to its new value.
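
Sketched against the tuple encoding assumed above:

def undo(record, db):
    # <Ti, Xj, V1, V2>: restore the old value V1.
    _, t, x, old, new = record
    db[x] = old

def redo(record, db):
    # <Ti, Xj, V1, V2>: reapply the new value V2.
    _, t, x, old, new = record
    db[x] = new

rec = ("update", "T1", "A", 500, 450)
db = {"A": 450}
undo(rec, db)   # db == {"A": 500}
redo(rec, db)   # db == {"A": 450}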

The database can be modified using two approaches:

1. Deferred Modification Technique: If the transaction does not modify the database until it has
partially committed, it is said to use the deferred modification technique.

2. Immediate Modification Technique: If database modifications occur while the transaction is still
active, it is said to use the immediate modification technique.
Recovery using Log records

After a system crash has occurred, the system consults the log to determine which transactions need to
be redone and which need to be undone:

1. Transaction Ti needs to be undone if the log contains the record <Ti start> but contains neither
the record <Ti commit> nor the record <Ti abort>.

2. Transaction Ti needs to be redone if the log contains the record <Ti start> and either the record
<Ti commit> or the record <Ti abort>.
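
A simplified sketch of these rules over the tuple-encoded log assumed earlier. It is simplified in that only committed transactions are redone; the full rule above also redoes aborted transactions, because their compensating undo actions were themselves logged as they rolled back:

def recover(log, db):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    resolved = {rec[1] for rec in log if rec[0] in ("commit", "abort")}
    undo_list = []
    for rec in log:
        if rec[0] == "update":
            _, t, x, old, new = rec
            if t in committed:
                db[x] = new              # redo in log order
            elif t not in resolved:
                undo_list.append(rec)
    for _, t, x, old, new in reversed(undo_list):
        db[x] = old                      # undo in reverse log order

# T1 committed before the crash; T2 was still in flight.
log = [("start", "T1"), ("update", "T1", "A", 500, 450), ("commit", "T1"),
       ("start", "T2"), ("update", "T2", "B", 800, 900)]
db = {"A": 500, "B": 900}   # suppose T2's write had reached disk
recover(log, db)
print(db)                   # {'A': 450, 'B': 800}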

Use of Checkpoints – When a system crash occurs, the system must consult the log. In principle, it would
need to search the entire log to determine this information. There are two major difficulties with this approach:

1. The search process is time-consuming.

2. Most of the transactions that, according to our algorithm, need to be redone have already
written their updates into the database. Although redoing them will cause no harm, it will cause
recovery to take longer.

To reduce this overhead, we introduce checkpoints. A log record of the form <checkpoint L> is used to
represent a checkpoint in the log, where L is a list of transactions active at the time of the checkpoint.
When a checkpoint log record is added to the log, all the transactions that committed before this
checkpoint have a <Ti commit> log record before the checkpoint record, and any database modification
made by such a Ti is written to the database either prior to the checkpoint or as part of the checkpoint
itself. Thus, at recovery time, there is no need to perform a redo operation on Ti.

After a system crash has occurred, the system examines the log to find the last <checkpoint L> record.
The redo or undo operations need to be applied only to the transactions in L, and to all transactions that
started execution after the record was written to the log. Let us denote this set of transactions as T. The
same undo and redo rules from the 'Recovery using Log records' part apply to T. Note that the system
needs to examine only the part of the log starting with the last checkpoint log record to find the set of
transactions T, and to find out whether a commit or abort record occurs in the log for each transaction
in T.

For example, consider the set of transactions {T0, T1, . . ., T100}. Suppose that the most recent
checkpoint took place during the execution of transactions T67 and T69, while T68 and all transactions
with subscripts lower than 67 completed before the checkpoint. Thus, only transactions T67, T69, . . .,
T100 need to be considered during the recovery scheme. Each of them needs to be redone if it has
completed (that is, either committed or aborted); otherwise, it was incomplete and needs to be
undone.
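
A minimal sketch of computing the set T from the tuple-encoded log assumed earlier, with checkpoint records carrying the list L:

def transactions_to_consider(log):
    # Locate the last <checkpoint L> record; T is L plus every transaction
    # that started after that record was written.
    last_ckpt, active = -1, set()
    for i, rec in enumerate(log):
        if rec[0] == "checkpoint":
            last_ckpt, active = i, set(rec[1])
    started_after = {rec[1] for rec in log[last_ckpt + 1:] if rec[0] == "start"}
    return active | started_after

log = [("start", "T67"), ("start", "T68"), ("commit", "T68"), ("start", "T69"),
       ("checkpoint", ["T67", "T69"]), ("start", "T70")]
print(transactions_to_consider(log))   # {'T67', 'T69', 'T70'}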

Log-based recovery is a technique used in database management systems (DBMS) to recover a database
to a consistent state in the event of a failure or crash. It involves the use of transaction logs, which are
records of all the transactions performed on the database.

In log-based recovery, the DBMS uses the transaction log to reconstruct the database to a consistent
state. The transaction log contains records of all the changes made to the database, including updates,
inserts, and deletes. It also records information about each transaction, such as its start and end times.
When a failure occurs, the DBMS uses the transaction log to determine which transactions were
incomplete at the time of the failure. It then performs a series of operations to undo the incomplete
transactions and redo the completed ones. This process is called the redo/undo recovery algorithm.

The redo operation involves reapplying the changes made by completed transactions that were not yet
saved to the database at the time of the failure. This ensures that all changes are applied to the
database.

The undo operation involves undoing the changes made by incomplete transactions that were saved to
the database at the time of the failure. This restores the database to a consistent state by reversing the
effects of the incomplete transactions.

Once the redo and undo operations are completed, the DBMS can bring the database back online and
resume normal operations.

Log-based recovery is an essential feature of modern DBMSs and provides a reliable mechanism for
recovering from failures and ensuring the consistency of the database.
