Transactions in database systems and functions

2
Topics to be covered
• Transaction concepts
• Properties of transactions
• Serializability of transactions
• Testing for serializability
• System recovery
• Two-phase commit protocol
• Recovery and atomicity
• Log-based recovery
• Concurrent executions of transactions and related problems
• Locking mechanism
• Solution to concurrency related problems
• Deadlock
• Two-phase locking protocol
• Isolation
• Intent locking

3
What is transaction?
 A transaction is a sequence of operations performed as a single logical unit of work.
 A transaction is a logical unit of work that contains one or more SQL statements.
 Example of transaction
read (A)
A = A – 50
write (A)
read (B)
B = B + 50
write (B)
Transaction
Operations
Works as a single
logical unit

4
Example:
 Suppose an employee of bank transfers Rs 800 from X's account to Y's account. This small
transaction contains several low-level tasks:
• Open_Account(X)
• Old_Balance = X.balance
• New_Balance = Old_Balance- 800
• X.balance = New_Balance
• Close_Account(X)
X's Account
Y's Account
• Open_Account(Y)
• Old_Balance = Y.balance
• New_Balance = Old_Balance + 800
• Y.balance = New_Balance
• Close_Account(Y)
• Open_Account(X)
• Old_Balance = X.balance
• New_Balance1 = Old_Balance- 800
• Open_Account(Y)
• Old_Balance = Y.balance
• New_Balance2 = Old_Balance + 800
• Y.balance = New_Balance1
• X.balance = New_Balance2

5
Operations of Transaction
Following are the main operations of transaction:
 Read(X): Read operation is used to read the value of X from the database and stores it in a buffer in main memory.
 Write(X): Write operation is used to write the value back to the database from the buffer.
 Commit: It is used to save the work done permanently.
 Rollback: It is used to undo the work done.
Let's assume the value of X before starting of the transaction is 4000.
1. The first operation reads X's value from database and stores it in a buffer.
2. The second operation will decrease the value of X by 500. So buffer will contain 3500.
3. The third operation will write the buffer's value to the database. So X's final value will be 3500.
 But it may be possible that because of the failure of hardware, software or power, etc. that transaction may fail before finished all the operations in the set.
 For example: If in the above transaction, the debit transaction fails after executing operation 2 then X's value will remain 4000 in the database which is not
acceptable by the bank.
Let's take an example to debit transaction from an account which consists of following operations:
• 1. R(X);
• 2. X = X- 500;
• 3. W(X);

7
ACID properties of transaction
 Atomicity (Either transaction execute 0% or 100%)
 Consistency (database must remain in a consistent state after any transaction)
 Isolation (Intermediate transaction results must be hidden from other concurrently
executed transactions)
 Durability (Once a transaction completed successfully, the changes it has made into the
database should be permanent)

8
1. Atomicity
 This property states that a transaction must be treated as an
atomic unit, that is, either all of its operations are executed or
none.
 Either transaction execute 0% or 100%.
 For example, consider a transaction to transfer Rs. 50 from
account A to account B.
 In this transaction, if Rs. 50 is deducted from account A then it
must be added to account B.
– If transaction fails then the amount will be deducted from A but not added to
B. This shows the inconsistent database state.
– In order to ensure correctness of database state, the transaction must be
executed in entirety.
read (A)
A = A – 50
write (A)
read (B)
B = B + 50
write (B)
0%
100%
FAIL

9
2. Consistency
 The database must remain in a consistent state after any
transaction.
 If the database was in a consistent state before the execution of a
transaction, it must remain consistent after the execution of the
transaction as well.
 In our example, total of A and B must remain same before and
after the execution of transaction.
A=500, B=500
A+B=1000
read (A)
A = A – 50
write (A)
read (B)
B = B + 50
write (B)
A=450, B=550
A+B=1000

10
3. Isolation
 Changes occurring in a particular transaction will not be visible to
any other transaction until it has been committed.
 Intermediate transaction results must be hidden from other
concurrently executed transactions.
 In our example once our transaction starts from first step (step 1) its
result should not be access by any other transaction until last step
(step 6) is completed.
read (A)
A = A – 50
write (A)
read (B)
B = B + 50
write (B)

11
4. Durability
 After a transaction completes successfully, the changes it has made to the
database persist (permanent), even if there are system failures.
 Once our transaction completed up to last step (step 6) its result must be stored
permanently. It should not be removed if system fails.
A=500, B=500
read (A)
A = A – 50
write (A)
read (B)
B = B + 50
write (B)
A=450, B=550

12
read (A)
A = A – 50
write (A)
read (B)
B = B + 50
write (B)
Commit
Transaction State Diagram State Transition Diagram
Active
Partial
Committed
Failed
Committed
Aborted
• This is the initial state.
• The transaction stays in this state
while it is executing.
End
• When a transaction executes its final operation, it is said
to be in a partially committed state.
• Discover that normal execution can no longer proceed.
• Once a transaction cannot be completed, any changes
that it made must be undone rolling it back.
• The state after the transaction has been rolled back and
the database has been restored to its state prior to the
start of the transaction.
• The transaction enters in this state after successful
completion of the transaction.
• We cannot abort or rollback a committed transaction.

13
1. Active
• This is the initial state.
• The transaction stays in this state while it is executing.
2. Partial Committed
• When a transaction executes its final operation/ instruction, it is said to be in a partially committed
state.

14
3. Failed
• Discover that normal execution can no longer proceed.
• Once a transaction cannot be completed, any changes that it made must be undone rolling it back.
4. Aborted
• The state after the transaction has been rolled back and the database has been restored to its state
prior to the start of the transaction.
After aborting the transaction, the database recovery module will select one of the two operations:
• Re-start the transaction
• Kill the transaction

15
5. Committed
• A transaction is said to be in a committed state if it has executed all of its operations successfully. In this state, all the
effects are now permanently saved on the database system.
• In this state transactions may release all of its acquired resources.
• After a transaction has entered the committed state, it is not possible to roll back the transaction.
In other words, it is not possible to undo the changes that has been made by the transaction.
This is because the system is updated into a new consistent state.
The only way to undo the changes is by carrying out another transaction called as compensating transaction that performs the reverse
operations.
6. Terminated State-
• After entering the committed state or aborted state,
the transaction finally enters into a terminated
state where its life cycle finally comes to an end.

16
Concurrency Problems in DBMS-
• When multiple transactions execute concurrently in an uncontrolled or unrestricted
manner, then it might lead to several problems.
• Such problems are called as concurrency problems.

17
1. Dirty Read Problem-
Reading the data written by an uncommitted transaction is called as dirty read
This read is called as dirty read because-
• There is always a chance that the uncommitted transaction
might roll back later.
• Thus, uncommitted transaction might make other
transactions read a value that does not even exist.
• This leads to inconsistency of the database.
NOTE-
• Dirty read does not lead to inconsistency always.
• It becomes problematic only when the uncommitted transaction fails and roll
backs later due to some reason.
• T2 reads the dirty value of A written by the uncommitted
transaction T1.
• T1 fails in later stages and roll backs.
• Thus, the value that T2 read now stands to be incorrect.
• Therefore, database becomes inconsistent.
Here,
1.T1 reads the value of A.
2.T1 updates the value of A in the buffer.
3.T2 reads the value of A from the buffer.
4.T2 writes the updated the value of A.
5.T2 commits.
6.T1 fails in later stages and rolls back.

18
2. Unrepeatable Read Problem-
 This problem occurs when a transaction gets to read unrepeated i.e. different values of
the same variable in its different read operations even when it has not updated its value.
Here,
1.T1 reads the value of X (= 10 say).
2.T2 reads the value of X (= 10).
3.T1 updates the value of X (from 10 to 15 say) in the buffer.
4.T2 again reads the value of X (but = 15).
In this example,
• T2 gets to read a different value of X in its second reading.
• T2 wonders how the value of X got changed because according to it, it is running in isolation.
*For T2 there is a violation of Isolation

19
3. Lost Update Problem-
 This problem occurs when multiple transactions execute concurrently and updates from
one or more transactions get lost.
Here,
1.T1 reads the value of A (= 10 say).
2.T2 updates the value to A (= 15 say) in the buffer.
3.T2 does blind write A = 25 (write without read) in the buffer.
4.T2 commits.
5.When T1 commits, it writes A = 25 in the database.
In this example,
• T1 writes the over written value of X in the database.
• Thus, update from T1 gets lost.
NOTE-
• This problem occurs whenever there is a write-write conflict.
• In write-write conflict, there are two writes one by each
transaction on the same data item without any read in the
middle.

20
4. Phantom Read Problem-
 This problem occurs when a transaction reads some variable from the buffer and when it
reads the same variable later, it finds that the variable does not exist
Here,
1.T1 reads X.
2.T2 reads X.
3.T1 deletes X. from the schema
4.T2 tries reading X but does not find it.
In this example,
• T2 finds that there does not exist any variable X when it tries
reading X again.
• T2 wonders who deleted the variable X because according
to it, it is running in isolation
Avoiding Concurrency Problems-
• To ensure consistency of the database, it is very important to
prevent the occurrence of above problems.
• Concurrency Control Protocols help to prevent the occurrence of
above problems and maintain the consistency of the database.
So we need concurrency protocols to deal with these different problems

21
What is schedule?
 A schedule is a process of grouping the transactions into one and executing them in a
predefined order.
 A schedule is the chronological (sequential) order in which instructions are executed in a
system.
 A schedule is required in a database because when some transactions execute in parallel,
they may affect the result of the transaction.
 Means if one transaction is updating the values which the other transaction is accessing,
then the order of these two transactions will change the result of another transaction.
 Hence a schedule is created to execute the transactions.

22
The order in which the operations of multiple
transactions appear for execution is called as
a schedule

23
Example of schedule
T1 T2
Read (A)
A = A - 50
Write (A)
Read (B)
B = B + 50
Write (B)
Commit
Read (A)
temp = A * 0.1
A = A - temp
Write (A)
Read (B)
B = B + temp
Write (B)
Commit
A=B=1000
Read (1000)
A = 1000 - 50
Write (950)
Read (1000)
B = 1000 + 50
Write (1050)
Commit
Read (950)
temp = 950 * 0.1
A = 950 - 95
Write (855)
Read (1050)
B = 1050 + 95
Write (1145)
Commit

24
Example of schedule
T1 T2
Read (A)
Temp = A * 0.1
A = A - temp
Write (A)
Read (B)
B = B + temp
Write (B)
Commit
Read (A)
A = A - 50
Write (A)
Read (B)
B = B + 50
Write (B)
Commit
A=B=1000
Read (1000)
Temp = 1000 * 0.1
A = 1000 - 100
Write (900)
Read (1000)
B = 1000 + 100
Write (1100)
Commit
Read (900)
A = 900 - 50
Write (850)
Read (1100)
B = 1100 + 50
Write (1150)
Commit

25
Serial Schedules-
In serial schedules,
• All the transactions execute serially one after the other.
• When one transaction executes, no other transaction is allowed to execute.
Characteristics- Serial schedules are always-
• Consistent
• Recoverable
• Cascadeless
• Strict
Serial Schedules
T1,T2
T2,T1

26
Non-Serial Schedules-
In non-serial schedules,
• Multiple transactions execute concurrently.
• Operations of all the transactions are inter leaved or mixed with each other.
Characteristics-
Non-serial schedules are NOT always-
• Consistent
• Recoverable
• Cascadeless
• Strict

27
Finding Number Of Schedules-
Consider there are n number of transactions T1, T2, T3 …. , Tn with N1, N2, N3 …. , Nn number of operations
respectively.
Total Number of Schedules-
Total number of possible schedules (serial + non-serial) is given by-
Total Number of Serial Schedules-
Total number of serial schedules
= Number of different ways of arranging n transactions
= n!
Total Number of Non-Serial Schedules-
Total number of non-serial schedules
= Total number of schedules – Total number of serial schedules

28
Problem-
Consider there are three transactions with 2, 3, 4 operations respectively, find-
1.How many total number of schedules are possible?
2.How many total number of serial schedules are possible?
3.How many total number of non-serial schedules are possible?
Solution-
Total Number of Schedules-
Using the above formula, we have-
Total Number of Serial Schedules-
Total number of serial schedules
= Number of different ways of arranging 3 transactions
= 3!
= 6
Total Number of Non-Serial Schedules-
Total number of non-serial schedules
= Total number of schedules – Total number of serial schedules
= 1260 – 6
= 1254

29
Serial Schedules Serializable Schedules
No concurrency is allowed.
Thus, all the transactions necessarily execute serially one after
the other.
Concurrency is allowed.
Thus, multiple transactions can execute concurrently.
Serial schedules lead to less resource utilization and CPU
throughput.
Serializable schedules improve both resource utilization and CPU
throughput.
Serial Schedules are less efficient as compared to serializable
schedules.
(due to above reason)
Serializable Schedules are always better than serial schedules.
(due to above reason)

30
Serializability in DBMS-
• Some non-serial schedules may lead to inconsistency of the database.
• Serializability is a concept that helps to identify which non-serial schedules are correct and will maintain the
consistency of the database.
Serializable Schedules-
If a given non-serial schedule of ‘n’ transactions is equivalent to some serial schedule of ‘n’ transactions, then it is
called as a serializable schedule.
Characteristics-
Serializable schedules behave exactly same as serial schedules.
Thus, serializable schedules are always-
• Consistent
• Recoverable
• Casacadeless
• Strict

31
Types of Serializability-
Serializability is mainly of two types-
 Serializability is a concept that helps to identify which non-serial schedules are correct and
will maintain the consistency of the database.

32
Conflict Serializability-
If a given non-serial schedule can be converted into a serial schedule by swapping its non-conflicting operations,
then it is called as a conflict serializable schedule.
Conflicting Operations-
Two operations are called as conflicting operations if all the following conditions hold true for them-
• Both the operations belong to different transactions
• Both the operations are on the same data item
• At least one of the two operations is a write operation
In this schedule,
• W1 (A) and R2 (A) are called as conflicting operations.
• This is because all the above conditions hold true for them.
R1(X) W2(X)
W1(X) R2(X)
W1(X) W2(X)

33
Checking Whether a Schedule is Conflict Serializable Or Not-
For schedule S, we construct a graph known as precedence graph. This graph has a pair G = (V, E), where V consists a
set of vertices, and E consists a set of edges. The set of vertices is used to contain all the transactions participating in
the schedule. The set of edges is used to contain all edges Ti ->Tj for which one of the three conditions holds:
1. Create a node Ti → Tj if Ti executes write (Q) before Tj executes read (Q).
2. Create a node Ti → Tj if Ti executes read (Q) before Tj executes write (Q).
3. Create a node Ti → Tj if Ti executes write (Q) before Tj executes write (Q).
*create an edge on conflicting operation of different transactions on same data item
Step-01: Find and list all the conflicting operations.
Step-02: Start creating a precedence graph by drawing one node for each transaction.
Step-03: Draw an edge for each conflict pair such that if Ti (Q) and Tj (Q) forms a conflict pair then draw
an edge from Ti to Tj. This ensures that Ti gets executed before Tj.
Step-04: Check if there is any cycle formed in the graph.
• If there is no cycle found, then the schedule is conflict serializable otherwise not.
NOTE: By performing the Topological Sort of the Directed Acyclic Graph so obtained, the corresponding
serial schedule(s) can be found. Such schedules can be more than 1.

34
PRACTICE PROBLEMS BASED ON CONFLICT SERIALIZABILITY-
Problem-01: Check whether the given schedule S is conflict serializable or not-
S : R1
(A) , R2
(A) , R1
(B) , R2
(B) , R3
(B) , W1
(A) , W2
(B)
Solution-
Step-01: List all the conflicting operations and determine the dependency between the transactions-
• R2
(A) , W1
(A) (T2
→ T1
)
• R1
(B) , W2
(B) (T1
→ T2
)
• R3
(B) , W2
(B) (T3
→ T2
)
Step-02: Draw the precedence graph-
• Clearly, there exists a cycle in the precedence graph.
• Therefore, the given schedule S is not conflict serializable.

35
Problem-02: Check whether the given schedule S(with commit) is conflict serializable and recoverable or not-
T1 T2 T3 T4
R(X)
W(X)
commit
W(X)
commit
W(Y)
R(Z)
commit
R(X)
R(Y)
commit
Step-01: List all the conflicting operations and determine the
dependency between the transactions-
• R2(X) , W3(X) (T2 → T3)
• Clearly, there exists no cycle in the precedence graph.
• Therefore, the given schedule S is conflict serializable.

36
Problem-03: Check whether the given schedule S is conflict serializable and recoverable or not-
T1 T2 T3 T4
R(X)
W(X)
W(X)
W(Y)
R(Z)
R(X)
R(Y)
Step-01: List all the conflicting operations and determine the dependency between
the transactions-
• R2(X) , W3(X) (T2 → T3)
• R2(X) , W1(X) (T2 → T1)
• W3(X) , W1(X) (T3 → T1)
• W3(X) , R4(X) (T3 → T4)
• W1(X) , R4(X) (T1 → T4)
• W2(Y) , R4(Y) (T2 → T4)

37
Problem-03: Check whether the given schedule S is conflict serializable and recoverable or not-
T1 T2 T3 T4
R(X)
W(X)
W(X)
W(Y)
R(Z)
R(X)
R(Y)
The topological orderings is used in the Topological Sort, to know the order
of serialization.
T2  T3  T1  T4

38
Problem-04: Check whether the given schedule S is conflict serializable or not. If yes, then determine all the possible
serialized schedules-
Step-01: List all the conflicting operations and determine the dependency
between the transactions-
• R4(A) , W2(A) (T4 → T2)
• R3(A) , W2(A) (T3 → T2)
• W1(B) , R3(B) (T1 → T3)
• W1(B) , W2(B) (T1 → T2)
• R3(B) , W2(B) (T3 → T2)
After performing the topological sort, the possible serialized schedules are-
1.T1 → T3 → T4 → T2
2.T1 → T4 → T3 → T2
3.T4 → T1 → T3 → T2

39
Problem-05: Determine all the possible serialized schedules for the given schedule-
Step-01:
List all the conflicting operations and determine the
dependency between the transactions-
• R1(A) , W2(A) (T1 → T2)
• R2(A) , W1(A) (T2 → T1)
• W2(A) , W1(A) (T2 → T1)
• R2(B) , W1(B) (T2 → T1)
• R1(B) , W2(B) (T1 → T2)
• W1(B) , W2(B) (T1 → T2)
Step-02:
Draw the precedence graph-
• Clearly, there exists a cycle in the precedence graph.
• Therefore, the given schedule S is not conflict serializable.
• Thus, Number of possible serialized schedules = 0.

40
Conflict Equivalent
In the conflict equivalent, one can be transformed to another by swapping non-conflicting operations. In the
given example, S2 is conflict equivalent to S1 (S1 can be converted to S2 by swapping non-conflicting
operations).
Two schedules are said to be conflict equivalent if and only if:
1. They contain the same set of the transaction.
2. If each pair of conflict operations are ordered in the same way.
Example:
Schedule S2 is a serial schedule because, in this, all operations of T1 are performed before starting any operation of T2.
Schedule S1 can be transformed into a serial schedule by swapping non-conflicting operations of S1.
So, S1 is conflict serializable.
S1=S2

41
Inconsistent
Consistent
View S.
Conflict S.
• Conflict serializability is easy to achieve
but view serializability is hard to achieve
• Every conflict serializable is view
serializable but the reverse is not true.
• It is much easy to test conflict serializability
but expensive to test view serializability.
• Most of the concurrency control schemes used
in practice are based on conflict serializability.

42
View Serializability-
View Equivalent Schedules-
Consider two schedules S1 and S2 each consisting of two transactions T1 and T2.
Schedules S1 and S2 are called view equivalent if the following three conditions hold true for them-
Condition-01: For each data item X, if transaction Ti
reads X from the database initially in schedule S1, then in
schedule S2 also, Ti
must perform the initial read of X from the database.
Condition-02: If transaction Ti reads a data item that has been updated by the transaction Tj in schedule S1, then in
schedule S2 also, transaction Ti must read the same data item that has been updated by the transaction Tj.
Condition-03: For each data item X, if X has been updated at last by transaction Ti in schedule S1, then in
schedule S2 also, X must be updated at last by transaction Ti
If a given schedule is found to be view equivalent to some serial schedule,
then it is called as a view serializable schedule.
1. Initial readers must be same for all the data items
2. Write-read sequence must be same.
3. Final writers must be same for all the data items

43
Problem-01: Check whether the given schedule S1 is view serializable or not-
Schedule is view serializable.
T1 T2
R(A)
W(A)
R(A)
W(A)
R(B)
W(B)
R(B)
W(B)
T1 T2
R(A)
W(A)
R(B)
W(B)
R(A)
W(A)
R(B)
W(B)
S1 S2
T1 T2
R(A)
W(A)
R(B)
W(B)
R(A)
W(A)
R(B)
W(B)
S3
1. T1(A), T1(B)
2. T1->T2(A),T1->T2(B)
3. T2(A), T2(B)
1. T1(A), T1(B)
2. T1->T2(A),T1->T2(B)
3. T2(A), T2(B)
1. T2(A), T2(B)
2. T2->T1(A),T2->T1(B)
3. T1(A), T1(B)

44
Problem-02: Check whether the given schedule S is view serializable or not-
Schedule is view serializable.
1. T1(A)
2. There is no write read sequence
3. T4(B)
So we consider a serial schedule S’=T1, T2, T3, T4
1. T1(A)
2. There is no write read sequence
3. T4(B)

45
Problem-03: Check whether the given schedule S is view serializable or not-
• T1 firstly reads A and T3 firstly updates A.
• So, T1 must execute First.
• Final updation on A is made by the transaction T1.
• So, T1 must execute after all other transactions.
• Thus, we get the dependency (T2, T3) → T1.
• There exists no write-read sequence.
• Thus, we conclude that the given schedule S is not view
serializable.
1. T1(A)
2. No producer consumer
3. T1(A)
Also not a conflict serializable

46
Checking Whether a Schedule is View Serializable Or Not-
Thumb Rules
• All conflict serializable schedules are view serializable.
• All view serializable schedules may or may not be conflict serializable.
Method-01: If the given schedule is conflict serializable, then it is surely view serializable.
Method-02: Check if there exists any blind write operation. (Writing without reading is called as a blind write).
• If there does not exist any blind write, then the schedule is surely not view serializable.
• If there exists any blind write, then the schedule may or may not be view serializable. Go and check using other methods.
Method-03: In this method, try finding a view equivalent serial schedule.
• By using the defined three conditions.

47
Problem-03: Check whether the given schedule S is view serializable or not
T1 T2 T3
R(A)
W(A)
W(A)
W(A)
1. First we will check conflict serializability
• R1(A) W2(A)
• R1(A) W3(A)
• W2(A) W3(A)
• W2(A) W1(A)
• W1(A) W3(A)
Is a view serializable
1. T1(A)
3. T3(A)
Consider a serial schedule S’=T1 T2 T3
1. T1(A)
3. T3(A)
Not Conflict serializable

48
Irrecoverable Schedules-
If in a schedule,
• A transaction performs a dirty read operation from an uncommitted transaction
• And commits before the transaction from which it has read the value
then such a schedule is known as an Irrecoverable Schedule.
Example-
Consider the following schedule-
Here
• T2 performs a dirty read operation.
• T2 commits before T1.
• T1 fails later and roll backs.
• The value that T2 read now stands to be incorrect.
• T2 can not recover since it has already committed.

49
If in a schedule,
• A transaction performs a dirty read operation from an uncommitted transaction
• And its commit operation is delayed till the uncommitted transaction either commits or roll backs
then such a schedule is known as a Recoverable Schedule.
Here,
• The commit operation of the transaction that performs the dirty read is delayed.
• This ensures that it still has a chance to recover if the uncommitted transaction fails later.
Example- Consider the following schedule-
Here,
• T2 performs a dirty read operation.
• The commit operation of T2 is delayed till T1 commits or roll backs.
• T1 commits later.
• T2 is now allowed to commit.
• In case, T1 would have failed, T2 has a chance to recover by rolling
back.
Recoverable Schedules-

50
A recoverable schedule may be any one of these kinds-
1.Cascading Schedule
2.Cascadeless Schedule
3.Strict Schedule
Types of Recoverable Schedules-

51
• If in a schedule, failure of one transaction causes several other dependent transactions to rollback or
abort, then such a schedule is called as a Cascading Schedule or Cascading Rollback or Cascading Abort.
• It simply leads to the wastage of CPU time.
Here,
•Transaction T2 depends on transaction T1.
In this schedule,
•The failure of transaction T1 causes the transaction T2 to rollback.
•The rollback of transaction T2 causes the transaction T3 to rollback.
•The rollback of transaction T3 causes the transaction T4 to rollback.
Such a rollback is called as a Cascading Rollback
If the transactions T2, T3 and T4 would have committed before the
failure of transaction T1, then the schedule would have been
irrecoverable.
Cascading Schedule-

52
If in a schedule, a transaction is not allowed to read a data item until the last transaction that has written it is committed
or aborted, then such a schedule is called as a Cascadeless Schedule.
In other words,
• Cascadeless schedule allows only committed read operations.
• Therefore, it avoids cascading roll back and thus saves CPU time
NOTE-
• Cascadeless schedule allows only committed read operations.
• However, it allows uncommitted write operations.
Cascadeless Schedule-

53
If in a schedule, a transaction is neither allowed to read nor write a data item until the last transaction
that has written it is committed or aborted, then such a schedule is called as a Strict Schedule.
In other words,
• Strict schedule allows only committed read and write operations.
• Clearly, strict schedule implements more restrictions than cascadeless schedule.
Strict Schedule-

54
• Strict schedules are more strict than cascadeless schedules.
• All strict schedules are cascadeless schedules.
• All cascadeless schedules are not strict schedules.
Remember-

55
Two phase commit protocol
 Two phase commit protocol ensures that all participants perform the same action (either
to commit or to rollback a transaction).
 It is designed to ensure that either all the databases are updated or none of them, so that
the databases remain synchronized.
 In two phase commit protocol there is one node which is act as a coordinator or
controlling site and all other participating node are known as cohorts or participant or
slave.
 Coordinator (controlling site) – the component that coordinates with all the participants.
 Cohorts (Participants/Slaves) – each individual node except coordinator are participant.

56
 As the name suggests, the two phase commit protocol involves two phases.
1. Commit request phase OR Prepare phase
2. Commit/Abort phase

57
Request to prepare
Commit/Abort
Prepared
Done
Prepare
Phase
Commit
Phase
Coordinator send
request asking for ready
to commit
Participant send reply
whether ready to
commit or not
Coordinator inform to do
commit
Send “ack” to inform
whether commit done or
not

58
1. Commit Request Phase (Obtaining Decision)
• After each slave has locally completed its transaction, it sends a “DONE” message to the controlling
site.
• When the controlling site has received “DONE” message from all slaves, it sends a “Prepare” (prepare
to commit) message to the slaves.
• The slaves vote on whether they still want to commit or not.
• If a slave wants to commit, it sends a “Ready” message.
• A slave that does not want to commit sends a “Not Ready” message.
• This may happen when the slave has conflicting concurrent transactions or there is a timeout.

59
2. Commit Phase (Performing Decision)
1) After the controlling site has received “Ready” message from all the slaves:
• The controlling site sends a “Global Commit” message to the slaves.
• The slaves commit the transaction and send a “Commit ACK” message to the controlling site.
• When the controlling site receives “Commit ACK” message from all the slaves, it considers the
transaction as committed.

60
2. Commit Phase (Performing Decision)
2) After the controlling site has received the first “Not Ready” message from any slave:
• The controlling site sends a “Global Abort” message to the slaves.
• The slaves abort the transaction and send a “Abort ACK” message to the controlling site.
• When the controlling site receives “Abort ACK” message from all the slaves, it considers the
transaction as aborted.

61
Database recovery
 There are many situations in which a transaction may not reach a commit or abort point.
• Operating system crash
• DBMS crash
• System might lose power (power failure)
• Disk may fail or other hardware may fail (disk/hardware failure)
• Human error
 In any of above situations, data in the database may become inconsistent or lost.

62
Database recovery
 For example, if a transaction has completed 30 out of 40 write instructions to the
database when the DBMS crashes, then the database may be in an inconsistent state as
only part of the transaction’s work was completed.
 Database recovery is the process of restoring the database and the data to a consistent
state.
 This may include restoring lost data up to the point of the event (e.g. system crash).

63
Log based recovery method
 The log is a sequence of log records, which maintains information about update
activities on the database.
 A log is kept on stable storage (i.e HDD).
 Log contains
1. Start of transaction
2. Transaction-id
3. Record-id
4. Type of operation (insert, update, delete)
5. Old value, new value
6. End of transaction that is committed or aborted.

64
Log based recovery method
 When transaction Ti starts, it registers itself by writing a record
<Ti start> to the log
 Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is the value
of X before the write (the old value), and V2 is the value to be written to X (the new
value).
 When Ti finishes it last statement, the log record <Ti commit> is written.
 Undo of a log record <Ti, X, V1, V2> writes the old value V1 to X
 Redo of a log record <Ti, X, V1, V2> writes the new value V2 to X
 Types of log based recovery method
1. Immediate database modification
2. Deferred database modification

65
Immediate v/s Deferred database modification
Immediate database modification Deferred database modification
Updates (changes) to the database are
applied immediately as they occur without
waiting to reach to the commit point.
deferred (postponed) until the transaction
commits.
T1 T2
Read (A)
A = A - 100
Write (A)
Read (B)
B = B + 100
Write (B)
Commit
Read (C)
C = C - 200
Write (C)
Commit
A=500, B=600, C=700 A=500, B=600, C=700
<T1 start>
<T1, A, 500, 400>
<T1, B, 600, 700>
A=400,B=700,C=700
<T1 start>
<T1, A, 400>
<T1, B, 700>
A=500,B=600,C=700
<T1 start>
<T1, A, 400>
<T1, B, 700>
<T1, Commit>
<T2 start>
<T2, C, 500>
A=400,B=700,C=700
<T1 start>
<T1, A, 400>
<T1, B, 700>
<T1, Commit>
<T2 start>
<T2, C, 500>
<T2, Commit>
A=400,B=700,C=500
<T1 start>
<T1, A, 500, 400>
<T1, B, 600, 700>
<T1, Commit>
<T2 start>
<T2, C, 700, 500>
A=400,B=700,C=500
<T1 start>
<T1, A, 500, 400>
<T1, B, 600, 700>
<T1, Commit>
<T2 start>
<T2, C, 700, 500>
<T2, Commit>
A=400,B=700,C=500

66
Immediate v/s Deferred database modification
Immediate database modification Deferred database modification
applied immediately as they occur without
waiting to reach to the commit point.
deferred (postponed) until the transaction
commits.
If transaction is not committed, then we
need to do undo operation and restart the
transaction again.
If transaction is not committed, then no
need to do any undo operations. Just restart
the transaction.
If transaction is committed, then no need to
do redo the updates of the transaction.
If transaction is committed, then we need to
do redo the updates of the transaction.
Undo and Redo both operations are
performed.
Only Redo operation is performed.

67
Problems with Deferred & Immediate Updates
 Searching the entire log is time consuming.
1. Immediate database modification
• When transaction fail log file is used to undo the updates of transaction.
2. Deferred database modification
• When transaction commits log file is used to redo the updates of transaction.
 To reduce the searching time of entire log we can use check point.

68
Checkpoint
 It is a point which specifies that any operations executed before it are done correctly
and stored safely (updated safely in database).
 At this point, all the buffers are force-fully written to the secondary storage (database).
 Checkpoints are scheduled at predetermined time intervals.
 It is used to limit:
• Size of transaction log file
• Amount of searching

69
Checkpoint works when failure occurs
 At failure time:
• Ignore the transaction T1 as it has already been committed before checkpoint.
• Redo transaction T2 and T3 as they are active after checkpoint and are committed before failure.
• Undo transaction T4 as it is active after checkpoint and has not committed.
Time TC Tf
T1
T2
T3
T4
Checkpoint time Failure

70
What is lock?
 A lock is a variable associated with data item to control concurrent access to that data
item.
Database
0
Lock variable
Locking is a strategy that is used to
prevent such concurrent access of data.
1

71
Lock based protocol
 Data items can be locked in two modes :
1. Shared (S) mode: When we take this lock we can just read the item but cannot write.
2. Exclusive (X) mode: When we take this lock we can read as well as write the item.
 Lock-compatibility matrix
Shared lock Exclusive lock
Shared lock Yes
Compatible
No
Not Compatible
Exclusive lock No
Not Compatible
No
Not Compatible
T1
T2

72
Lock based protocol
 A transaction may be granted a lock on an item if the requested lock is compatible with
locks already held on the item by other transactions.
 If a lock cannot be granted, the requesting transaction is made to wait till all
incompatible locks held by other transactions have been released. The lock is then
granted.
 Any number of transactions can hold shared locks on an item, but if any transaction
holds an exclusive on the item no other transaction can hold any lock on the item.

73
Lock based protocol
 This locking protocol divides transaction execution phase into three parts:
1. When transaction starts executing, create a list of data items on which they need locks and requests
the system for all the locks it needs.
2. Where the transaction acquires all locks and no other lock is required. Transaction keeps executing
its operation.
3. As soon as the transaction releases its first lock, the third phase starts. In this phase a transaction
cannot demand for any lock but only releases the acquired locks.
Transaction
T begin T end Time
Lock acquisition
phase
Lock releasing
phase
Transaction
execution

74
Two phase locking protocol
 This protocol works in two phases,
1. Growing Phase
• In this phase a transaction obtains locks, but can not release any lock.
• When a transaction takes the final lock is called lock point.
2. Shrinking Phase
• In this phase a transaction can release locks, but can not obtain any lock.
• The transaction enters the shrinking phase as soon as it releases the first lock after crossing the Lock
Point.
Transaction
T begin T end Time
Growing phase Shrinking phase

75
Time stamp based protocol
 This protocol uses either system time or logical counter to be used as a time-stamp.
 Every transaction has a time-stamp associated with it and the ordering is determined by
the age of the transaction.
 A transaction ‘T1’ created at 0002 clock time would be older than all other transaction,
which come after it.
 For example, any transaction ‘T2' entering the system at 0004 is two seconds younger
than transaction ‘T1’ and priority is given to the older one.
 In addition, every data item is given the latest read and write time-stamp. This lets the
system know, when last read and write operations was made on the data item.

76
Time stamp ordering protocol
 This is the responsibility of the protocol system that the conflicting pair of tasks should be
executed according to the timestamp values of the transactions.
• Time-stamp of Transaction Ti is denoted as TS(Ti).
• Read time-stamp of data-item X is denoted by R-timestamp(X).
• Write time-stamp of data-item X is denoted by W-timestamp(X).

77
Time stamp ordering protocol
 Timestamp ordering protocol works as follows:
• If a transaction Ti issues read(X) operation:
• If TS(Ti) < W-timestamp(X)
– Operation rejected.
• If TS(Ti) >= W-timestamp(X)
– Operation executed.
• If a transaction Ti issues write(X) operation:
• If TS(Ti) < R-timestamp(X)
– Operation rejected.
• If TS(Ti) < W-timestamp(X)
– Operation rejected and Ti rolled back.
• Otherwise, operation executed.

78
What is deadlock?
 Consider the following two transactions:
 A deadlock is a situation in which two or more transactions are waiting for one another
to give up locks.
T1 T2
Granted for (A)
Waiting for (B)
Granted for (B)
Waiting for (A)
Lock-X (A)
Write (A)
Lock-X (B)
Write (B)
Lock-X (B)
Write (B)
Lock-X (A)
Write (A)

79
Deadlock detection
 A simple way to detect deadlock is with the help of wait-for graph.
 One node is created in the wait-for graph for each transaction that is currently executing.
 Whenever a transaction Ti is waiting to lock an item X that is currently locked by a
transaction Tj, a directed edge from Ti to Tj (Ti→Tj) is created in the wait-for graph.
 When Tj releases the lock(s) on the items that Ti was waiting for, the directed edge is
dropped from the wait-for graph.
 We have a state of deadlock if and only if the wait-for graph has a cycle.
 Then each transaction involved in the cycle is said to be deadlocked.

80
• Transaction A is waiting for transactions B
and C.
• Transactions C is waiting for transaction B.
• Transaction B is waiting for transaction D.
• This wait-for graph has no cycle, so there is
no deadlock state.
• Suppose now that transaction D is
requesting an item held by C. Then the
edge D C is added to the wait-for
graph.
Deadlock detection
B
A
D
C

81
• Now this graph contains the cycle.
• B D C B
• It means that transactions B, D and C are
all deadlocked.
Deadlock detection
B
A
D
C
D
E
A
D
L
O
C
K

82
Deadlock recovery
 When a deadlock is detected, the system must recover from the deadlock.
 The most common solution is to roll back one or more transactions to break the
deadlock.
 Choosing which transaction to abort is known as Victim Selection.

83
• In this wait-for graph transactions B, D
and C are deadlocked.
• In order to remove deadlock one of the
transaction out of these three (B, D, C)
transactions must be roll backed.
• We should rollback those transactions
that will incur the minimum cost.
• When a deadlock is detected, the choice
of which transaction to abort can be
made using following criteria:
 The transaction which have the fewest locks
 The transaction that has done the least work
 The transaction that is farthest from
completion
Choice of deadlock victim
B
A
D
C
D
E
A
D
L
O
C
K

84
Deadlock prevention
 A protocols ensure that the system will never enter into a deadlock state.
 Some prevention strategies :
• Require that each transaction locks all its data items before it begins execution (predeclaration).
• Impose partial ordering of all data items and require that a transaction can lock data items only in the
order specified by the partial.

85
Deadlock prevention
 Following schemes use transaction timestamps for the sake of deadlock prevention alone.
1. Wait-die scheme — non-preemptive
• If an older transaction is requesting a resource which is held by younger transaction, then older
transaction is allowed to wait for it till it is available.
• If an younger transaction is requesting a resource which is held by older transaction, then younger
transaction is killed.
Wait-Die
O needs a resource held by Y O waits
Y needs a resource held by O Y dies

86
Deadlock prevention
2. Wound-wait scheme — preemptive
• If an older transaction is requesting a resource which is held by younger transaction, then older
transaction forces younger transaction to kill the transaction and release the resource.
• If an younger transaction is requesting a resource which is held by older transaction, then younger
transaction is allowed to wait till older transaction will releases it.
Wound-Wait
O needs a resource held by Y Y dies
Y needs a resource held by O Y waits

87
Deadlock prevention
3. Timeout-Based Schemes :
• A transaction waits for a lock only for a specified amount of time. After that, the wait times out and
the transaction is rolled back. So deadlocks never occur.
• Simple to implement; but difficult to determine good value of the timeout interval.

88
Multiversion Concurrency Control Techniques

 Other protocols for concurrency control keep the old values of a data item when the item
is updated. These are known as multiversion concurrency control, because several
versions (values) of an item are maintained.
 When a transaction requires access to an item, an appropriate version is chosen to
maintain the serializability of the currently executing schedule, if possible.
 The idea is that some read operations that would be rejected in other techniques can still
be accepted by reading an older version of the item to maintain serializability. When a
transaction writes an item, it writes a new version and the old version(s) of the item are
retained. Some multiver-sion concurrency control algorithms use the concept of view
serializability rather than conflict serializability.

89
Drawback of Multiversion techniques
 More storage is needed to maintain multiple versions of the database items.
 In addition, some database applications require older versions to be kept to maintain a
history of the evolution of data item values.
 The extreme case is a temporal database , which keeps track of all changes and the times
at which they occurred. In such cases, there is no additional storage penalty for
multiversion techniques, since older versions are already maintained.

90
Multiversion concurrency control schemes
1. Multiversion Technique Based on Timestamp Ordering
2. Multiversion Two-Phase Locking Using Certify Locks

Transactions in database systems and functions

More Related Content

Similar to Transactions in database systems and functions

Recently uploaded

Transactions in database systems and functions