Unit 4 Transaction Processing

Uploaded by

mitesh.parishkar

UNIT IV- TRANSACTION PROCESSING

Unit-4
Transaction Processing

Transactions- A database transaction is a logical unit of processing in a DBMS that entails
one or more database access operations. One of the major uses of a DBMS is to protect the user's
data from system failures. It does so by ensuring that all data is restored to a consistent state
when the computer is restarted after a crash. A transaction is any one execution of a user
program in a DBMS. Executing the same program multiple times generates multiple
transactions.

Example- Transfer of 50₹ from Account A to Account B. Initially A= 500₹, B= 800₹. This data
is brought to RAM from Hard Disk.
R(A) -- 500 // Accessed from RAM.
A = A-50 // Deducting 50₹ from A.
W(A) -- 450 // Updated in RAM.
R(B) -- 800 // Accessed from RAM.
B = B+50 // 50₹ is added to B's Account.
W(B) -- 850 // Updated in RAM.
commit // The data in RAM is taken back to Hard Disk.

The updated value of Account A = 450₹ and Account B = 850₹.


Until commit, the transaction's changes are at most partially committed and exist only in RAM.
When the commit is executed, the changes are accepted and written to the hard disk.
If the transaction fails anywhere before commit, we cannot continue from the same state. All
changes are undone and the transaction must start again from the beginning. This is known as
Roll Back.
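The commit and roll back behaviour described above can be sketched in Python. This is only an illustration, not how a real DBMS is implemented: the dictionary `disk` stands in for the hard disk, the local copy `buffer` stands in for RAM, and the names `transfer` and `TransactionFailed` are hypothetical.

```python
class TransactionFailed(Exception):
    """Hypothetical error used to simulate a failure before commit."""


def transfer(disk, src, dst, amount):
    """Move `amount` from account `src` to account `dst`, all or nothing."""
    buffer = dict(disk)            # R(A), R(B): bring the data into "RAM"
    buffer[src] -= amount          # A = A - amount, W(A) in RAM only
    if buffer[src] < 0:
        # Failure before commit: the buffer is discarded, disk is untouched.
        raise TransactionFailed(f"insufficient balance in {src}")
    buffer[dst] += amount          # B = B + amount, W(B) in RAM only
    disk.update(buffer)            # commit: RAM contents go back to disk


disk = {"A": 500, "B": 800}
transfer(disk, "A", "B", 50)
print(disk)  # {'A': 450, 'B': 850}
```

Because every write goes to the buffer first, a failure at any point before the last line of `transfer` leaves `disk` exactly as it was, which is the roll back case.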

Operations of Transaction:

Read(X): The read operation reads the value of X from the database and stores it in a
buffer in main memory.

Write(X): Write operation is used to write the value back to the database from the buffer.

Commit: It is used to save the work done permanently.

Rollback: It is used to undo the work done.


Transaction property-

A transaction has four properties, which are used to maintain consistency in a database
before and after the transaction. These ACID properties maintain the integrity of the
database during transaction processing. ACID in DBMS stands
for Atomicity, Consistency, Isolation, and Durability.

Atomicity
o It states that either all operations of the transaction take place or none do. If any
operation fails, the transaction is aborted.
o There is no midway, i.e., the transaction cannot occur partially. Each transaction is
treated as one unit and either runs to completion or is not executed at all.

Atomicity involves the following two operations:

Abort: If a transaction aborts, then none of the changes it made are visible.

Commit: If a transaction commits, then all the changes it made are visible.

Example: Assume a transaction T consisting of two parts, T1 and T2. Account A holds Rs
600 and account B holds Rs 300. Transfer Rs 100 from account A to account B.

T1            T2
Read(A)       Read(B)
A:=A-100      B:=B+100
Write(A)      Write(B)

After completion of the transaction, A consists of Rs 500 and B consists of Rs 400.

If the transaction T fails after the completion of T1 but before the completion of
T2, then the amount is deducted from A but not added to B. This leaves the database in an
inconsistent state. To ensure correctness of the database state, the transaction must
be executed in its entirety.

Consistency
o The integrity constraints are maintained so that the database is consistent before and after
the transaction.
o The execution of a transaction will leave a database in either its prior stable state or a new
stable state.
o The consistent property of database states that every transaction sees a consistent
database instance.
o The transaction is used to transform the database from one consistent state to another
consistent state.

For example: The total amount must be the same before and after the transaction.

1. Total before T occurs = 600+300 = 900
2. Total after T occurs = 500+400 = 900

Therefore, the database is consistent. In the case when T1 is completed but T2 fails, then
inconsistency will occur.

Isolation
o It states that the data used during the execution of one transaction cannot be
used by a second transaction until the first one is completed.
o In isolation, if the transaction T1 is being executed and is using the data item X, then that
data item can't be accessed by any other transaction T2 until the transaction T1 ends.
o The concurrency control subsystem of the DBMS enforces the isolation property.

Durability
This property ensures that once the transaction has completed execution, the updates and
modifications to the database are stored in and written to disk and they persist even if a system
failure occurs. These updates now become permanent and are stored in non-volatile memory.
The effects of the transaction, thus, are never lost.

States of Transaction
Active state
o The active state is the first state of every transaction. In this state, the transaction is being
executed.
o For example: Insertion, deletion or updating of a record is done here, but the changes
are not yet saved to the database.

Partially committed
o In the partially committed state, a transaction executes its final operation, but the data is
still not saved to the database.
o In the total mark calculation example, a final display of the total marks step is executed in
this state.

Committed

A transaction is said to be in a committed state if it executes all its operations successfully. In
this state, all the effects are now permanently saved on the database system.

Failed state
o If any of the checks made by the database recovery system fails, then the transaction is
said to be in the failed state.
o In the example of total mark calculation, if the database is not able to fire a query to fetch
the marks, then the transaction will fail to execute.

Aborted
o If any of the checks fail and the transaction has reached a failed state, the database
recovery system makes sure that the database is in its previous consistent state. If it is
not, the recovery system aborts or rolls back the transaction to bring the database into a
consistent state.
o If the transaction fails partway through, all the operations it has already executed are
rolled back so that the database returns to the consistent state it was in before the
transaction began.
o After aborting the transaction, the database recovery module will select one of the two
operations:
1. Re-start the transaction
2. Kill the transaction

Schedule

A series of operations from one or more transactions is known as a schedule. A schedule
preserves the order of the operations within each individual transaction.
1. Serial Schedule

The serial schedule is a type of schedule where one transaction is executed completely before
starting another transaction. In the serial schedule, when the first transaction completes its cycle,
then the next transaction is executed.

For example: Suppose there are two transactions T1 and T2 which have some operations. If there
is no interleaving of operations, then there are the following two possible orders:

1. Execute all the operations of T1, followed by all the operations of T2.
2. Execute all the operations of T2, followed by all the operations of T1.

o In the given figure (a), Schedule A shows the serial schedule where T1 is followed by T2.
o In the given figure (b), Schedule B shows the serial schedule where T2 is followed by T1.

2. Non-serial Schedule
o If interleaving of operations is allowed, then there will be non-serial schedule.
o It contains many possible orders in which the system can execute the individual
operations of the transactions.
o In the given figures (c) and (d), Schedule C and Schedule D are non-serial schedules. They
have interleaving of operations.

The Non-Serial Schedule can be divided further into Serializable and Non-Serializable.
a) Serializable: A non-serial schedule of n transactions is said to be serializable only when it
is equivalent to some serial schedule of the same n transactions. A non-serial
schedule is serializable if its result is equal to the result of its transactions executed
serially. Serializable schedules are of two types:

Conflict Serializable:

A schedule is called conflict serializable if it can be transformed into a serial schedule
by swapping non-conflicting operations. Two operations are said to be conflicting if all of
the following conditions are satisfied:
o They belong to different transactions
o They operate on the same data item
o At least one of them is a write operation

View Serializable:

A schedule is called view serializable if it is view equivalent to a serial schedule (no
overlapping transactions). Every conflict serializable schedule is view serializable, but a
view serializable schedule that contains blind writes need not be conflict serializable.

Testing of Serializability- To test the serializability of a schedule, we can use a Serialization
Graph, also called a Precedence Graph. A serialization graph is a directed graph over all the
transactions of a schedule. It can be defined as a graph G(V, E) consisting of a set of
vertices V = {V1, V2, V3, ..., Vn}, one per transaction, and a set of directed edges
E = {E1, E2, E3, ..., En}. Each edge arises from a pair of conflicting READ or WRITE
operations performed by two different transactions.

Ti -> Tj means that transaction Ti performs a conflicting read or write before transaction Tj.

If there is a cycle in the serialization graph, then the schedule is not conflict serializable,
because the cycle means that one transaction depends on another transaction and vice versa. It
also means that there are one or more conflicting pairs of operations in the transactions. On
the other hand, no cycle means that the non-serial schedule is serializable.
What is a conflicting pair in transactions?

Two operations inside a schedule are called conflicting if they meet these three conditions:

1. They belong to two different transactions.
2. They are working on the same data item.
3. At least one of them is a WRITE operation.

To conclude, take two operations on a data item "a". The conflicting pairs are:

1. READ(a) - WRITE(a)
2. WRITE(a) - WRITE(a)
3. WRITE(a) - READ(a)
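The three conditions translate directly into a small predicate. The sketch below is illustrative only. The `Op` record and the function name `conflicts` are assumptions made for this example.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: int    # transaction number, e.g. 1 for T1
    kind: str   # "R" for READ, "W" for WRITE
    item: str   # data item, e.g. "a"


def conflicts(p: Op, q: Op) -> bool:
    """True if p and q satisfy all three conditions for a conflicting pair."""
    different_txn = p.txn != q.txn
    same_item = p.item == q.item
    one_is_write = "W" in (p.kind, q.kind)
    return different_txn and same_item and one_is_write


print(conflicts(Op(1, "R", "a"), Op(2, "W", "a")))  # True  (READ - WRITE)
print(conflicts(Op(1, "R", "a"), Op(2, "R", "a")))  # False (READ - READ)
```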

Problem-01:
Check whether the given schedule S is conflict serializable or not-
S : R1(A) , R2(A) , R1(B) , R2(B) , R3(B) , W1(A) , W2(B)
Solution-

Step-01: List all the conflicting operations and determine the dependency between the
transactions-
o R2(A) , W1(A) (T2 → T1)
o R1(B) , W2(B) (T1 → T2)
o R3(B) , W2(B) (T3 → T2)

Step-02: Draw the precedence graph-

o Clearly, there exists a cycle in the precedence graph (T1 → T2 → T1).
o Therefore, the given schedule S is not conflict serializable.
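The two steps of Problem-01 can be checked mechanically. The following is a minimal sketch, assuming the schedule is written in the R1(A) / W2(B) notation used above. The helper names `parse`, `precedence_edges` and `has_cycle` are hypothetical.

```python
import re


def parse(schedule):
    """Turn 'R1(A) , W2(B)' into [('R', 1, 'A'), ('W', 2, 'B')]."""
    found = re.findall(r"([RW])(\d+)\((\w+)\)", schedule)
    return [(kind, int(txn), item) for kind, txn, item in found]


def precedence_edges(ops):
    """Step-01: one edge (Ti, Tj) per conflicting pair where Ti acts first."""
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Step-02: depth-first search for a cycle in the precedence graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # node -> "visiting" or "done"

    def visit(u):
        state[u] = "visiting"
        for v in graph[u]:
            if state.get(v) == "visiting":
                return True          # back edge: cycle found
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False

    return any(u not in state and visit(u) for u in graph)


S = "R1(A) , R2(A) , R1(B) , R2(B) , R3(B) , W1(A) , W2(B)"
edges = precedence_edges(parse(S))
print(sorted(edges))     # [(1, 2), (2, 1), (3, 2)]
print(has_cycle(edges))  # True, so S is not conflict serializable
```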
Problem-02:

Check whether the given schedule S is conflict serializable or not. If yes, then determine all the
possible serialized schedules-

Solution-

Checking Whether S is Conflict Serializable Or Not-

Step-01:

List all the conflicting operations and determine the dependency between the transactions-
o R4(A) , W2(A) (T4 → T2)
o R3(A) , W2(A) (T3 → T2)
o W1(B) , R3(B) (T1 → T3)
o W1(B) , W2(B) (T1 → T2)
o R3(B) , W2(B) (T3 → T2)

Step-02:

Draw the precedence graph-

o Clearly, there exists no cycle in the precedence graph.
o Therefore, the given schedule S is conflict serializable.

After applying the incoming edge rule (repeatedly picking a transaction that has no incoming
edge from a transaction not yet placed), the possible serialized schedules are-
1. T1 → T3 → T4 → T2
2. T1 → T4 → T3 → T2
3. T4 → T1 → T3 → T2
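The list of serialized schedules can be reproduced by enumerating every ordering of the transactions that respects the dependencies from Step-01. This is a brute-force sketch suitable only for small examples. The function name `all_serial_orders` is an assumption for illustration.

```python
def all_serial_orders(nodes, edges):
    """Return every ordering of `nodes` in which u comes before v for each edge (u, v)."""
    def extend(prefix, remaining):
        if not remaining:
            yield prefix
            return
        for n in remaining:
            # n can be placed next only if no unplaced transaction must precede it
            blocked = any(v == n and u in remaining for u, v in edges)
            if not blocked:
                rest = [m for m in remaining if m != n]
                yield from extend(prefix + [n], rest)

    return list(extend([], sorted(nodes)))


# Dependencies from Problem-02: T4→T2, T3→T2, T1→T3, T1→T2
edges = {(4, 2), (3, 2), (1, 3), (1, 2)}
for order in all_serial_orders({1, 2, 3, 4}, edges):
    print(" → ".join(f"T{n}" for n in order))
# T1 → T3 → T4 → T2
# T1 → T4 → T3 → T2
# T4 → T1 → T3 → T2
```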

Conflict Serializable Schedule


o A schedule is called conflict serializable if, after swapping non-conflicting
operations, it can be transformed into a serial schedule.
o A schedule is conflict serializable if it is conflict equivalent to a serial schedule.

Conflicting Operations

Two operations are conflicting if all of the following conditions are satisfied:

1. They belong to separate transactions.
2. They operate on the same data item.
3. At least one of them is a write operation.

That means W-R, R-W and W-W pairs are conflicting operations.

Conflict Equivalent

Two schedules are conflict equivalent if one can be transformed into the other by swapping
non-conflicting operations. In the given example, S2 is conflict equivalent to S1 (S1 can be
converted to S2 by swapping non-conflicting operations).
Two schedules are said to be conflict equivalent if and only if:

1. They contain the same set of transactions.
2. Each pair of conflicting operations is ordered in the same way in both schedules.

After swapping of non-conflict operations, the schedule S1 becomes:

Hence, S1 is conflict serializable.

View Serializability: A schedule is called view serializable if it is view equivalent to a
serial schedule (no overlapping transactions).

Conditions for schedules to be view-equivalent-

Two schedules S1 and S2 are said to be view-equivalent if the conditions below are satisfied:

1. Initial Read

The initial read of each data item must be the same in both schedules. If in schedule S1 a
transaction T1 reads the initial value of data item A, then in S2 transaction T1 should also
read the initial value of A.

The above two schedules are view equivalent because the initial read operation in S1 is done by
T1 and in S2 it is also done by T1.

2. Updated Read

In schedule S1, if Ti is reading A which is updated by Tj, then in S2 also, Ti should read A
which is updated by Tj.

The above two schedules are not view equivalent because, in S1, T3 is reading A updated by T2
and in S2, T3 is reading A updated by T1.

3. Final Write

The final write on each data item must be the same in both schedules. In schedule S1, if a
transaction T1 performs the last update of A, then in S2 the final write on A should also be
done by T1.

The above two schedules are view equivalent because the final write operation in S1 is done by
T3 and in S2 the final write operation is also done by T3.
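The three conditions can be checked by computing, for each schedule, who performs the initial reads, who reads from whom, and who performs the final writes. The sketch below is a simplified illustration: schedules are lists of (kind, transaction, item) tuples, the example schedules are made up for this sketch, and the names `view_summary` and `view_equivalent` are hypothetical.

```python
def view_summary(ops):
    """Return (initial_reads, reads_from, final_writes) for a schedule."""
    last_writer = {}        # item -> transaction that wrote it most recently
    initial_reads = set()   # (txn, item): txn read item before any write
    reads_from = set()      # (reader, writer, item): updated reads
    for kind, txn, item in ops:
        if kind == "R":
            if item in last_writer:
                reads_from.add((txn, last_writer[item], item))
            else:
                initial_reads.add((txn, item))
        else:  # "W"
            last_writer[item] = txn
    # After the loop, last_writer holds the final write on each item.
    return initial_reads, reads_from, last_writer


def view_equivalent(s1, s2):
    """Same operations, and the same answers to all three conditions."""
    return sorted(s1) == sorted(s2) and view_summary(s1) == view_summary(s2)


# T1 and T2 touch different items, so interleaving changes nothing.
s_a = [("R", 1, "A"), ("R", 2, "B"), ("W", 1, "A"), ("W", 2, "B")]
s_b = [("R", 1, "A"), ("W", 1, "A"), ("R", 2, "B"), ("W", 2, "B")]
print(view_equivalent(s_a, s_b))  # True

# T3 reads A from T2 in one schedule and from T1 in the other.
s_c = [("W", 1, "A"), ("W", 2, "A"), ("R", 3, "A")]
s_d = [("W", 2, "A"), ("W", 1, "A"), ("R", 3, "A")]
print(view_equivalent(s_c, s_d))  # False
```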

Problem-Check whether the given schedule S is view serializable or not-

Solution-

o We know, if a schedule is conflict serializable, then it is surely view serializable.
o So, let us check whether the given schedule is conflict serializable or not.

Checking Whether S is Conflict Serializable Or Not-
Draw the precedence graph-
o Clearly, there exists a cycle in the precedence graph.
o Therefore, the given schedule S is not conflict serializable.
Now,
o To check whether S is view serializable or not, let us use another method.
o Let us derive the dependencies and then draw a dependency graph.

Drawing a Dependency Graph-

o T1 firstly reads A and T3 firstly updates A.
o So, T1 must execute before T3.
o Thus, we get the dependency T1 → T3.
o The final update on A is made by the transaction T1.
o So, T1 must execute after all other transactions.
o Thus, we get the dependency (T2, T3) → T1.
o There exists no write-read sequence.
Now, let us draw a dependency graph using these dependencies-

o Clearly, there exists a cycle in the dependency graph.
o Thus, we conclude that the given schedule S is not view serializable.

Non-Serializability in DBMS

A non-serial schedule which is not serializable is called a non-serializable schedule.
Non-serializable schedules may or may not be consistent or recoverable. Non-serializable
schedules are divided into two types:

1. Recoverable schedule
2. Non-recoverable schedule

Recoverable Schedule

A schedule is recoverable if each transaction commits only after all the transactions from which
it has read have committed. In other words, if some transaction Ty reads a value that has been
updated/written by some other transaction Tx, then the commit of Ty must occur after the
commit of Tx.
The schedule shown above is recoverable since T1 commits before T2, which makes the value read
by T2 correct.

Recoverable schedules are further categorised into two types:

1. Cascading Schedule
2. Cascadeless Schedule

Cascading Schedule

If, in a schedule, several dependent transactions are forced to roll back/abort because of
the failure of one transaction, then such a schedule is called a Cascading Schedule, and the
effect is called a Cascading Rollback or Cascading Abort. It leads to the wastage of CPU time.

Here, transaction T2 depends on transaction T1 and transaction T3 depends on transaction T2.
Thus, in this schedule, the failure of transaction T1 will cause transaction T2 to roll back,
and similarly transaction T3. Therefore, it is a cascading schedule. If the
transactions T2 and T3 had committed before the failure of transaction T1, then the
schedule would have been irrecoverable.
Cascadeless Schedule

If, in a schedule, a transaction is not allowed to read a data item until the last
transaction that has written it is committed/aborted, then such a schedule is called a
Cascadeless Schedule. It avoids cascading rollback and thus saves CPU time. To prevent cascading
rollbacks, it disallows a transaction from reading uncommitted changes from another transaction
in the same schedule. In other words, if some transaction Ty wants to read a value that has been
updated or written by some other transaction Tx, then Ty may read it only after the commit
of Tx. Look at the example shown below.

Here, the updated value of X is read by transaction T2 only after the commit of transaction T1.
Hence, the schedule is a Cascadeless schedule.

Non-Recoverable Schedule
If a transaction reads a value written by an uncommitted transaction
and commits before the transaction from which it has read the value, then such a schedule is
called a Non-Recoverable schedule. With a non-recoverable schedule, we may not be able to
recover to a consistent database state after a system failure. In other words, if Tj reads a
value written by Ti and the commit of Ti does not occur before the commit of Tj, the schedule
is non-recoverable.

Consider the following schedule involving two transactions T1 and T2. T2 reads the value of A
written by T1 and commits. T1 might later abort, in which case the value read by T2 is
wrong. Since T2 has already committed, it cannot be rolled back, so this schedule is
non-recoverable.
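The rule "commit only after every transaction you read from has committed" can be tested on a schedule as follows. This is a simplified sketch: operations are (action, transaction, item) tuples with action "R", "W" or "C" (commit), aborts are not modelled, and the function name `is_recoverable` is hypothetical.

```python
def is_recoverable(ops):
    """True if every transaction commits after all transactions it read from."""
    last_writer = {}   # item -> transaction whose write is currently visible
    reads_from = {}    # reader -> set of transactions it has read from
    committed = set()
    for action, txn, item in ops:
        if action == "W":
            last_writer[item] = txn
        elif action == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif action == "C":
            # Committing before a transaction we read from makes S non-recoverable.
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False
            committed.add(txn)
    return True


# T2 reads A written by T1 and commits first: non-recoverable.
bad = [("W", 1, "A"), ("R", 2, "A"), ("C", 2, None), ("C", 1, None)]
# Same reads and writes, but T1 commits before T2: recoverable.
good = [("W", 1, "A"), ("R", 2, "A"), ("C", 1, None), ("C", 2, None)]
print(is_recoverable(bad))   # False
print(is_recoverable(good))  # True
```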
Transaction:
o A transaction is a set of logically related operations.
o Now that we understand what a transaction is, we should understand the problems
associated with it.
o The main problem that can happen during a transaction is that the transaction can fail before
finishing all the operations in the set. This can happen due to a power failure, a system
crash, etc.
o This is a serious problem that can leave the database in an inconsistent state. Assume that
the transaction fails after the third operation (see the example above): the amount would be
deducted from your account but your friend would not receive it.

SERIALIZABILITY IN DBMS
o Some non-serial schedules may lead to inconsistency of the database.
o Serializability is a concept that helps to identify which non-serial schedules are correct and
will maintain the consistency of the database.

Recoverability of Schedule:
Sometimes a transaction may not execute completely due to a software issue, system crash or
hardware failure. In that case, the failed transaction has to be rolled back. But some other
transactions may also have used a value produced by the failed transaction, so those
transactions have to be rolled back as well.

TRANSACTION ISOLATION LEVELS IN DBMS


The SQL standard defines four isolation levels :

1. Read Uncommitted – Read Uncommitted is the lowest isolation level. At this level, one
transaction may read changes made by another transaction that are not yet committed, thereby
allowing dirty reads. At this level, transactions are not isolated from each other.
2. Read Committed – This isolation level guarantees that any data read is committed at the
moment it is read. Thus it does not allow dirty reads. The transaction holds a read or write
lock on the current row, and thus prevents other transactions from reading, updating or
deleting it.
3. Repeatable Read – This level is more restrictive than Read Committed. The transaction holds
read locks on all rows it references and write locks on all rows it inserts, updates, or
deletes. Since other transactions cannot update or delete these rows, it avoids
non-repeatable reads.
4. Serializable – This is the highest isolation level. An execution at this level is
guaranteed to be serializable, where a serializable execution is defined as an execution of
operations in which concurrently executing transactions appear to be executing serially.

FAILURE CLASSIFICATION
To determine where a problem has occurred, we classify failures into the following categories:
1. Transaction failure
2. System crash
3. Disk failure

1. Transaction failure
A transaction failure occurs when a transaction fails to execute or reaches a point from which
it can't go any further. If only a few transactions or processes are affected, this is called a
transaction failure.
Reasons for a transaction failure could be -
1. Logical errors: A logical error occurs if a transaction cannot complete due to some code
error or an internal error condition.
2. System errors: These occur when the DBMS itself terminates an active transaction because the
database system is not able to execute it. For example, the system aborts an active
transaction in case of deadlock or resource unavailability.

2. System Crash
System failure can occur due to power failure or other hardware or software failure. Example: Operating
system error.
Fail-stop assumption: In the system crash, non-volatile storage is assumed not to be corrupted.

3. Disk Failure
o Disk failure occurs when hard-disk drives or other storage drives fail. It was a common
problem in the early days of technology evolution.
o Disk failure occurs due to the formation of bad sectors, a disk head crash, unreachability of
the disk, or any other failure which destroys all or part of the disk storage.

CONCURRENT EXECUTION OF TRANSACTION


In transaction processing, a system usually allows more than one transaction to execute
simultaneously. This is called concurrent execution.

Advantages of concurrent execution of a transaction


1. Decrease waiting time or turnaround time.
2. Improve response time
3. Increased throughput or resource utilization.
Problems with Concurrent Execution
In a database transaction, the two main operations are READ and WRITE. These operations must be
managed carefully during the concurrent execution of transactions, because if they are
interleaved in an uncontrolled manner the data may become inconsistent. The following problems
can occur with the concurrent execution of operations:
1. Lost Update Problem (W-W Conflict)
2. Dirty Read Problem (W-R Conflict)
3. Unrepeatable Read Problem (R-W Conflict)

1. Lost update problem (Write – Write conflict)


This type of problem occurs when two transactions access the same data item and have their
operations interleaved in a manner that makes the value of some database item incorrect.
If two transactions T1 and T2 read the same data item and then update it, the second
write overwrites the first.
Example: Let the value of A be 100.

Time   Transaction T1   Transaction T2
t1     Read(A)
t2     A=A-50
t3                      Read(A)
t4                      A=A+50
t5     Write(A)
t6                      Write(A)

Here,
o At time t1, transaction T1 reads the value of A, i.e., 100.
o At time t2, transaction T1 deducts 50 from A, giving 50.
o At time t3, transaction T2 reads the value of A, i.e., 100.
o At time t4, transaction T2 adds 50 to A, giving 150.
o At time t5, transaction T1 writes the value of A computed at time t2, i.e., 50.
o At time t6, transaction T2 writes the value of A computed at time t4, i.e., 150.
o So at time t6, the update of transaction T1 is lost because transaction T2 overwrites the
value of A without looking at its current value.
o This type of problem is known as the Lost Update Problem.
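The interleaving in the table can be replayed step by step. The sketch below is a single-threaded simulation, not real concurrent code: each transaction keeps its own local copy of A, and the function names `lost_update` and `serial_execution` are made up for illustration.

```python
def lost_update(initial=100):
    """Replay t1..t6 from the lost update example and return the final A."""
    db = {"A": initial}
    a1 = db["A"]      # t1: T1 Read(A)
    a1 = a1 - 50      # t2: T1 A=A-50
    a2 = db["A"]      # t3: T2 Read(A), still sees the old value
    a2 = a2 + 50      # t4: T2 A=A+50
    db["A"] = a1      # t5: T1 Write(A)
    db["A"] = a2      # t6: T2 Write(A), overwrites T1's update
    return db["A"]


def serial_execution(initial=100):
    """Run T1 completely, then T2, for comparison."""
    db = {"A": initial}
    db["A"] = db["A"] - 50   # T1
    db["A"] = db["A"] + 50   # T2
    return db["A"]


print(lost_update())       # 150: T1's deduction of 50 is lost
print(serial_execution())  # 100
```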

Dirty read problem (W-R conflict)

This type of problem occurs when one transaction T1 updates a data item of the database and
then fails for some reason, but its updates have already been accessed by some other
transaction.
Example: Let the value of A be 100.

Time   Transaction T1   Transaction T2
t1     Read(A)
t2     A=A+20
t3     Write(A)
t4                      Read(A)
t5                      A=A+30
t6                      Write(A)
t7     Write(B)

Here,
o At time t1, transaction T1 reads the value of A, i.e., 100.
o At time t2, transaction T1 adds 20 to A.
o At time t3, transaction T1 writes the value of A (120) to the database.
o At time t4, transaction T2 reads the value of A, i.e., 120.
o At time t5, transaction T2 adds 30 to A.
o At time t6, transaction T2 writes the value of A (150) to the database.
o At time t7, transaction T1 fails due to a power failure and is rolled back according to the
atomicity property of transactions (either all or none).
o So, at time t4 transaction T2 read a value which was never committed to the database. The
value read by transaction T2 is known as a dirty read.
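The same example can be replayed in a few lines. This is a single-threaded sketch under simplifying assumptions: there is no isolation at all, the committed value of A is tracked in a separate variable, and the function name `dirty_read` is hypothetical.

```python
def dirty_read(initial=100):
    """Replay t1..t7 and return (value T2 read, last committed value of A)."""
    committed_a = initial   # value of A as of the last commit
    a = committed_a         # current value of A, possibly uncommitted
    t1_local = a            # t1: T1 Read(A)
    t1_local += 20          # t2: T1 A=A+20
    a = t1_local            # t3: T1 Write(A), 120 but not committed
    t2_read = a             # t4: T2 Read(A) sees the uncommitted 120
    t2_local = t2_read + 30 # t5: T2 A=A+30
    a = t2_local            # t6: T2 Write(A), 150 built on a dirty value
    # t7: T1 fails and is rolled back, so its write of 120 is never committed
    # and committed_a is never updated.
    return t2_read, committed_a


print(dirty_read())  # (120, 100): T2 read 120, but only 100 was ever committed
```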

Unrepeatable read (R-W Conflict)


It is also known as the inconsistent retrieval problem. It occurs when a transaction T1 reads
the value of a data item twice and the data item is changed by another transaction T2 in
between the two read operations. T1 then sees two different values for its two reads of the
same data item.
Example: Let the value of A be 100.

Time   Transaction T1   Transaction T2
t1     Read(A)
t2                      Read(A)
t3                      A=A+30
t4                      Write(A)
t5     Read(A)

Here,
o At time t1, transaction T1 reads the value of A, i.e., 100.
o At time t2, transaction T2 reads the value of A, i.e., 100.
o At time t3, transaction T2 adds 30 to A.
o At time t4, transaction T2 writes the value of A (130) to the database.
o Transaction T2 has updated the value of A. Thus, when transaction T1 performs another read
at time t5, it sees the new value of A (130), which was written by T2. This type of conflict is
known as an R-W conflict.
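Replaying the table shows the two different values seen by T1. As with the earlier examples, this is a single-threaded sketch and the function name `unrepeatable_read` is made up for illustration.

```python
def unrepeatable_read(initial=100):
    """Replay t1..t5 and return the two values of A that T1 reads."""
    db = {"A": initial}
    first = db["A"]     # t1: T1 Read(A)
    a2 = db["A"]        # t2: T2 Read(A)
    a2 = a2 + 30        # t3: T2 A=A+30
    db["A"] = a2        # t4: T2 Write(A)
    second = db["A"]    # t5: T1 Read(A) again
    return first, second


print(unrepeatable_read())  # (100, 130)
```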
