Module 5

The document discusses transaction processing and recovery in database systems, defining transactions as atomic units of work that must either complete fully or not at all. It outlines key concepts such as the states of transactions, the importance of system logs for recovery, and the ACID properties that transactions should uphold. Additionally, it covers different types of schedules, including serial and serializable schedules, and emphasizes the significance of recoverable schedules for ensuring data integrity during transaction processing.


Transaction Processing and Recovery
Dr. Jyotismita Chaki
Introduction to Transaction Processing
• A transaction is an executing program that forms a logical unit of database
processing.
• A transaction includes one or more database access operations—these can
include insertion, deletion, modification (update), or retrieval operations.
• Transaction processing systems are systems with large databases and
hundreds of concurrent users executing database transactions.
• Examples of such systems include airline reservations, banking, credit card
processing, online retail purchasing, stock markets, supermarket checkouts,
and many other applications.
• These systems require high availability and fast response time for hundreds
of concurrent users.
Transaction and System Concepts
• A transaction is an atomic unit of work that should either be completed in its
entirety or not done at all.
• For recovery purposes, the system needs to keep track of when each transaction
starts, terminates, and commits, or aborts.
• Therefore, the recovery manager of the DBMS needs to keep track of the
following operations:
• BEGIN_TRANSACTION. This marks the beginning of transaction execution.
• READ or WRITE. These specify read or write operations on the database items that are
executed as part of a transaction.
• END_TRANSACTION. This specifies that READ and WRITE transaction operations have ended
and marks the end of transaction execution.
• COMMIT_TRANSACTION. This signals a successful end of the transaction so that any changes
(updates) executed by the transaction can be safely committed to the database and will not
be undone.
• ROLLBACK (or ABORT). This signals that the transaction has ended unsuccessfully, so that any
changes or effects that the transaction may have applied to the database must be undone.
Transaction and System Concepts
• A transaction goes into an active state immediately after it starts execution,
where it can execute its READ and WRITE operations.
• When the transaction ends, it moves to the partially committed state.
• At this point, some types of concurrency control protocols may do
additional checks to see if the transaction can be committed or not.
• If these checks are successful, the transaction is said to have reached its
commit point and enters the committed state.
• However, a transaction can go to the failed state if one of the checks fails or
if the transaction is aborted during its active state.
• The transaction may then have to be rolled back to undo the effect of its
WRITE operations on the database.
• The terminated state corresponds to the transaction leaving the system.
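The state transitions described above can be sketched as a small state machine. This is a minimal illustration; the state and event names follow the text, not any particular DBMS API:

```python
# Minimal sketch of the transaction state machine described above:
# active -> partially committed -> committed -> terminated,
# with a failed/aborted path that rolls back WRITE effects.

TRANSITIONS = {
    ("active", "end"): "partially_committed",
    ("active", "abort"): "failed",
    ("partially_committed", "commit"): "committed",
    ("partially_committed", "abort"): "failed",   # a concurrency-control check failed
    ("failed", "rollback"): "terminated",         # undo WRITE effects, then leave
    ("committed", "leave"): "terminated",
}

def step(state, event):
    """Apply one event; raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event} in state {state}")

# A successful transaction: BEGIN -> READ/WRITE -> END -> COMMIT -> leave.
s = "active"
for event in ("end", "commit", "leave"):
    s = step(s, event)
print(s)  # terminated
```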
Transaction and System Concepts

State transition diagram illustrating the states for transaction execution


The System Log
• To be able to recover from failures that affect transactions, the system
maintains a log to keep track of all transaction operations that affect the
values of database items, as well as other transaction information that may
be needed to permit recovery from failures.
• The log is a sequential, append-only file that is kept on disk, so it is not
affected by any type of failure except for disk or catastrophic failure.
• Typically, one (or more) main memory buffers, called the log buffers, hold
the last part of the log file, so that log entries are first added to the log
main memory buffer.
• When the log buffer is filled, or when certain other conditions occur, the
log buffer is appended to the end of the log file on disk.
• In addition, the log file from disk is periodically backed up to archival
storage (tape) to guard against catastrophic failures.
The System Log
• The following are the types of entries—called log records—that are written
to the log file and the corresponding action for each log record.
• In these entries, T refers to a unique transaction-id that is generated
automatically by the system for each transaction and that is used to
identify each transaction:
1. [start_transaction, T]. Indicates that transaction T has started execution.
2. [write_item, T, X, old_value, new_value]. Indicates that transaction T has changed
the value of database item X from old_value to new_value.
3. [read_item, T, X]. Indicates that transaction T has read the value of database item
X.
4. [commit, T]. Indicates that transaction T has completed successfully, and affirms
that its effect can be committed (recorded permanently) to the database.
5. [abort, T]. Indicates that transaction T has been aborted.
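As a sketch, the five log-record types can be modelled as tuples appended to an in-memory log buffer that is appended to the on-disk log when full. The flush threshold and helper names are illustrative assumptions, not from the text:

```python
# Illustrative model of the system log: records are tuples appended to a
# main-memory log buffer, which is appended to the on-disk log when full.

LOG_BUFFER_SIZE = 4          # assumed flush threshold, for illustration only
log_buffer = []              # main-memory log buffer
disk_log = []                # stands in for the append-only log file on disk

def append_log(record):
    log_buffer.append(record)
    if len(log_buffer) >= LOG_BUFFER_SIZE:
        flush_log()

def flush_log():
    disk_log.extend(log_buffer)   # append the buffer to the end of the log file
    log_buffer.clear()

# Log records from the text, for a transaction T1 updating item X:
append_log(("start_transaction", "T1"))
append_log(("read_item", "T1", "X"))
append_log(("write_item", "T1", "X", 100, 200))  # old_value, new_value
append_log(("commit", "T1"))
flush_log()                   # force out any remaining records
print(len(disk_log))  # 4
```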
Desirable Properties of Transactions
• Transactions should possess several properties, often called the ACID properties;
they should be enforced by the concurrency control and recovery methods of the
DBMS.
• The following are the ACID properties:
• Atomicity. A transaction is an atomic unit of processing; it should either be performed in its
entirety or not performed at all.
• Consistency preservation. A transaction should be consistency preserving, meaning that if it
is completely executed from beginning to end without interference from other transactions,
it should take the database from one consistent state to another.
• Isolation. A transaction should appear as though it is being executed in isolation from other
transactions, even though many transactions are executing concurrently. That is, the
execution of a transaction should not be interfered with by any other transactions executing
concurrently.
• Durability or permanency. The changes applied to the database by a committed transaction
must persist in the database. These changes must not be lost because of any failure.
Desirable Properties of Transactions
• The atomicity property requires that we execute a transaction to
completion.
• It is the responsibility of the transaction recovery subsystem of a
DBMS to ensure atomicity.
• If a transaction fails to complete for some reason, such as a system
crash in the midst of transaction execution, the recovery technique
must undo any effects of the transaction on the database.
• On the other hand, write operations of a committed transaction must
be eventually written to disk.
Desirable Properties of Transactions
• The preservation of consistency is generally considered to be the
responsibility of the programmers who write the database programs
and of the DBMS module that enforces integrity constraints.
• A consistent state of the database satisfies the constraints specified in
the schema as well as any other constraints on the database that
should hold.
• A database program should be written in a way that guarantees that,
if the database is in a consistent state before executing the
transaction, it will be in a consistent state after the complete
execution of the transaction, assuming that no interference with
other transactions occurs.
Desirable Properties of Transactions
• The isolation property is enforced by the concurrency control
subsystem of the DBMS.
• If every transaction does not make its updates (write operations)
visible to other transactions until it is committed, one form of
isolation is enforced that solves the temporary update problem and
eliminates cascading rollbacks but does not eliminate all other
problems.
• The durability property is the responsibility of the recovery
subsystem of the DBMS.
Serial and Serializable Schedules
• When transactions are executing concurrently in an interleaved fashion,
then the order of execution of operations from all the various transactions
is known as a schedule (or history).
• Serial Schedule:
• A schedule where transactions are executed one after another, without any
interleaving.
• Ensures that the database remains consistent and avoids anomalies like lost updates,
dirty reads, and cascading aborts.
• Easy to understand and implement, but can be inefficient for concurrent execution.
• Serializable Schedule:
• A schedule that is equivalent to a serial schedule in terms of its effect on the
database.
• May involve interleaving of transactions, but the final state of the database is the
same as if the transactions were executed serially.
• Guarantees consistency and avoids anomalies, but can be more complex to achieve.
Serial and Serializable Schedules
• Consider two transactions:
• T1: R(A) W(A), T2: R(B) W(B) R(A) W(A)
• A serial schedule could be:
• R1(A); W1(A); C1; R2(B); W2(B); R2(A); W2(A); C2
• A serializable schedule could be:
• R1(A); R2(B); W1(A); W2(B); R2(A); W2(A) — interleaved, but equivalent to the serial order T1 followed by T2

Feature        Serial Schedule    Serializable Schedule
Interleaving   No                 Possible
Consistency    Guaranteed         Guaranteed
Efficiency     Less efficient     Can be more efficient
Complexity     Simpler            More complex
Types of Schedules

Serial Schedule:
T1          T2
R(A)
W(A)
R(B)
W(B)
            R(A)
            R(B)

Recoverable Schedule:
T1          T2
R(A)
W(A)
            W(A)
            R(A)
commit
            commit


Types of Schedules

T1          T2
R(A)
W(A)
            W(A)
commit
            R(A)
            commit

Cascadeless Schedule

Schedules in which transactions read values only after all transactions whose changes they are going to read have committed are called cascadeless schedules. This avoids a single transaction abort leading to a series of transaction rollbacks. A strategy to prevent cascading aborts is to disallow a transaction from reading uncommitted changes from another transaction in the same schedule.
Types of Schedules

• A schedule is strict if, for any two transactions Ti and Tj, whenever a write operation of Ti precedes a conflicting operation of Tj (either read or write), the commit or abort event of Ti also precedes that conflicting operation of Tj.
• In other words, Tj can read or write a value updated or written by Ti only after Ti commits or aborts.

T1          T2
R(A)
            R(A)
W(A)
commit
            W(A)
            R(A)
            commit

Strict Schedule
Characterizing schedules based on
recoverability
• Operations from different transactions can be interleaved in the
schedule S.
• However, for each transaction Ti that participates in the schedule S,
the operations of Ti in S must appear in the same order in which they
occur in Ti.
• The order of operations in S is considered to be a total ordering,
meaning that for any two operations in the schedule, one must occur
before the other.
• It is possible theoretically to deal with schedules whose operations
form partial orders, but we will assume for now total ordering of the
operations in a schedule.
Characterizing schedules based on
recoverability
• For the purpose of recovery and concurrency control, we are mainly interested in
the read_item (Reads a database item named X into a program variable also
named X) and write_item (Writes the value of program variable X into the
database item named X) operations of the transactions, as well as the commit
and abort operations.
• A shorthand notation for describing a schedule uses the symbols b, r, w, e, c, and
a for the operations begin_transaction, read_item, write_item, end_transaction,
commit, and abort, respectively, and appends as a subscript the transaction id
(transaction number) to each operation in the schedule.
• In this notation, the database item X that is read or written follows the r and w
operations in parentheses.
• In some schedules, we will only show the read and write operations, whereas in
other schedules we will show additional operations, such as commit or abort.
Characterizing schedules based on recoverability:
Interpretation of shorthand notations
• b<Transaction id> → E.g: b1 [begin Transaction T1].
• r<Transaction id>(<data item>) → E.g: r1(X) [Transaction T1 is performing
read_item on data item X].
• w<Transaction id>(<data item>) → E.g: w1(X) [Transaction T1 is performing
write_item on data item X].
• c<Transaction id> → E.g: c1 [Transaction T1 has committed].
• a<Transaction id> → E.g: a1 [Transaction T1 has aborted or rolled back].
Characterizing schedules based on
recoverability: read_item and write_item
• Executing a read_item(X) command includes the following steps:
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory, if it is not already in some main memory buffer.
3. Copy item X from the buffer to the program variable named X.
• Executing a write_item(X) command includes the following steps:
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory, if it is not already in some main memory buffer.
3. Copy item X from the program variable named X into its correct location in the buffer.
4. Store the updated block from the buffer back to disk (either immediately or at some later point in time).
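The buffering steps above can be sketched as follows. The block layout, the item-to-block map, and the cache structure are all illustrative assumptions:

```python
# Sketch of read_item/write_item over a main-memory buffer pool.
# disk, buffers, and block_of are illustrative stand-ins.

disk = {"B1": {"X": 5, "Y": 7}}     # block id -> items stored in that block
buffers = {}                         # buffer pool: block id -> in-memory copy
block_of = {"X": "B1", "Y": "B1"}    # step 1: item -> address of its disk block

def read_item(name, program_vars):
    block = block_of[name]
    if block not in buffers:                   # step 2: bring block into a buffer
        buffers[block] = dict(disk[block])
    program_vars[name] = buffers[block][name]  # step 3: buffer -> program variable

def write_item(name, program_vars):
    block = block_of[name]
    if block not in buffers:                   # step 2
        buffers[block] = dict(disk[block])
    buffers[block][name] = program_vars[name]  # step 3: variable -> buffer
    disk[block] = dict(buffers[block])         # step 4 (here: flushed immediately)

pv = {}
read_item("X", pv)
pv["X"] += 1
write_item("X", pv)
print(disk["B1"]["X"])  # 6
```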
Characterizing schedules based on
recoverability
• The schedule in the following Figure , which we shall call Sa, can be
written as follows in this notation:
• Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
Characterizing schedules based on
recoverability
• Similarly, the schedule for the
following Figure, which we call
Sb, can be written as follows, if
we assume that transaction T1
aborted after its read_item(Y)
operation:
• Sb: r1(X); w1(X); r2(X); w2(X); r1(Y);
a1;
Characterizing schedules based on recoverability:
Conflicting Operations in a Schedule
• Two operations in a schedule are said to conflict if they satisfy all three of
the following conditions:
1. they belong to different transactions;
2. they access the same item X; and
3. at least one of the operations is a write_item(X).
• For example, in schedule Sa, the operations r1(X) and w2(X) conflict, as do
the operations r2(X) and w1(X), and the operations w1(X) and w2(X).
• However, the operations r1(X) and r2(X) do not conflict, since they are both
read operations; the operations w2(X) and w1(Y) do not conflict because
they operate on distinct data items X and Y; and the operations r1(X) and
w1(X) do not conflict because they belong to the same transaction.
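The three conditions can be checked mechanically. A small sketch, with schedule Sa encoded as (operation, transaction, item) tuples:

```python
# Find conflicting pairs in a schedule: different transactions, same item,
# and at least one write. Schedule Sa from the text, as (op, txn, item) tuples.

Sa = [("r", 1, "X"), ("r", 2, "X"), ("w", 1, "X"),
      ("r", 1, "Y"), ("w", 2, "X"), ("w", 1, "Y")]

def conflicts(schedule):
    pairs = []
    for i, (op1, t1, x1) in enumerate(schedule):
        for op2, t2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                pairs.append(((op1, t1, x1), (op2, t2, x2)))
    return pairs

for a, b in conflicts(Sa):
    print(a, b)
# Finds exactly the three pairs named in the text:
# r1(X)/w2(X), r2(X)/w1(X), and w1(X)/w2(X).
```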
Characterizing schedules based on recoverability:
Conflicting Operations in a Schedule
• Two operations are conflicting if changing their order can result in a
different outcome.
• For example, if we change the order of the two operations r1(X); w2(X) to
w2(X); r1(X), then the value of X that is read by transaction T1 changes,
because in the second ordering the value of X is read by r1(X) after it is
changed by w2(X), whereas in the first ordering the value is read before it is
changed. This is called a read-write conflict.
• The other type is called a write-write conflict where we change the order
of two operations such as w1(X); w2(X) to w2(X); w1(X). For a write-write
conflict, the last value of X will differ because in one case it is written by T2
and in the other case by T1.
• Two read operations are not conflicting because changing their order
makes no difference in outcome.
Characterizing schedules based on
recoverability
• For some schedules it is easy to recover from transaction and system
failures, whereas for other schedules the recovery process can be
quite involved.
• In some cases, it is even not possible to recover correctly after a
failure.
• Hence, it is important to characterize the types of schedules for which
recovery is possible, as well as those for which recovery is relatively
simple.
• These characterizations do not actually provide the recovery
algorithm; they only attempt to theoretically characterize the
different types of schedules.
Characterizing schedules based on
recoverability
• Once a transaction T is committed, it should never be necessary to
roll back T. This ensures that the durability property of transactions is
not violated.
• The schedules that theoretically meet this criterion are called
recoverable schedules.
• A schedule where a committed transaction may have to be rolled
back during recovery is called nonrecoverable and hence should not
be permitted by the DBMS.
Characterizing schedules based on
recoverability
• Consider the following schedule-
• Here,
• T2 performs a dirty read operation (reading a value written by an uncommitted transaction is called a dirty read).
• The commit operation of T2 is
delayed till T1 commits or roll backs.
• T1 commits later.
• T2 is now allowed to commit.
• Had T1 failed, T2 would still have had a chance to recover by rolling back.
Characterizing schedules based on
recoverability
• Consider the following schedule-
• Here
• T2 performs a dirty read operation.
• T2 commits before T1.
• T1 fails later and rolls back.
• The value that T2 read is now incorrect.
• T2 cannot recover, since it has already committed.
Characterizing schedules based on
recoverability
• The condition for a recoverable schedule is as follows:
• A schedule S is recoverable if no transaction T in S commits until all transactions T′
that have written some item X that T reads have committed.
• A transaction T reads from transaction T′ in a schedule S if some item X is first
written by T′ and later read by T.
• In addition, T′ should not have been aborted before T reads item X, and there should
be no transactions that write X after T′ writes it and before T reads it (unless those
transactions, if any, have aborted before T reads X).
• Some recoverable schedules may require a complex recovery process, but
if sufficient information is kept (in the log), a recovery algorithm can be
devised for any recoverable schedule.
• The schedules Sa and Sb from the preceding section are both recoverable,
since they satisfy the above definition.
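The condition can be tested directly: for every read-from pair, the writer must commit before the reader commits. A minimal sketch over (operation, transaction, item) tuples, ignoring the aborted-writer subtleties noted above:

```python
# Check the recoverability condition: no transaction commits before every
# transaction it read from has committed. Simplified: intervening aborts
# between the write and the read are not handled.

def reads_from(schedule):
    """Return (reader, writer) pairs: reader read an item last written by writer."""
    last_writer, pairs = {}, set()
    for op, t, x in schedule:
        if op == "w":
            last_writer[x] = t
        elif op == "r" and x in last_writer and last_writer[x] != t:
            pairs.add((t, last_writer[x]))
    return pairs

def is_recoverable(schedule):
    commit_pos = {t: i for i, (op, t, x) in enumerate(schedule) if op == "c"}
    for reader, writer in reads_from(schedule):
        if reader in commit_pos:
            if writer not in commit_pos or commit_pos[writer] > commit_pos[reader]:
                return False
    return True

# Sc: r1(X) w1(X) r2(X) r1(Y) w2(X) c2 a1 -- T2 reads from T1 but commits first
Sc = [("r",1,"X"), ("w",1,"X"), ("r",2,"X"), ("r",1,"Y"),
      ("w",2,"X"), ("c",2,None), ("a",1,None)]
# Sd: same reads and writes, but c1 comes before c2
Sd = [("r",1,"X"), ("w",1,"X"), ("r",2,"X"), ("r",1,"Y"),
      ("w",2,"X"), ("w",1,"Y"), ("c",1,None), ("c",2,None)]
print(is_recoverable(Sc), is_recoverable(Sd))  # False True
```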
Characterizing schedules based on
recoverability
• Consider the schedule Sa′ given below, which is the same as schedule
Sa except that two commit operations have been added to Sa:
• Sa′: r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1;
• Sa′ is recoverable, even though it suffers from the lost update
problem.
• However, consider the two schedules Sc and Sd that follow:
• Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;
• Sc is not recoverable because T2 reads item X from T1, but T2
commits before T1 commits. The problem occurs if T1 aborts after the
c2 operation in Sc; then the value of X that T2 read is no longer valid
and T2 must be aborted after it is committed, leading to a schedule
that is not recoverable.
Characterizing schedules based on
recoverability
• For the schedule to be recoverable, the c2 operation in Sc must be
postponed until after T1 commits, as shown in Sd.
• Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;
• If T1 aborts instead of committing, then T2 should also abort as
shown in Se, because the value of X it read is no longer valid.
• Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;
• In Se, aborting T2 is acceptable since it has not committed yet, which
is not the case for the nonrecoverable schedule Sc.
Characterizing schedules based on
recoverability: Types of recoverable schedule
• In a recoverable schedule, no committed transaction ever needs to be
rolled back, and so the definition of a committed transaction as
durable is not violated.
• However, it is possible for a phenomenon known as cascading
rollback (or cascading abort) to occur in some recoverable schedules,
where an uncommitted transaction has to be rolled back because it
read an item from a transaction that failed.
• This is illustrated in schedule Se, where transaction T2 has to be rolled
back because it read item X from T1, and T1 then aborted.
Characterizing schedules based on
recoverability
• Here,
• Transaction T2 depends on transaction
T1.
• Transaction T3 depends on transaction
T2.
• Transaction T4 depends on transaction
T3.
• In this schedule,
• The failure of transaction T1 causes
the transaction T2 to rollback.
• The rollback of transaction T2 causes
the transaction T3 to rollback.
• The rollback of transaction T3 causes
the transaction T4 to rollback.
• Such a rollback is called as a Cascading
Rollback.
Characterizing schedules based on
recoverability: Types of recoverable schedule
• Because cascading rollback can be time-consuming—since numerous
transactions can be rolled back —it is important to characterize the
schedules where this phenomenon is guaranteed not to occur.
• A schedule is said to be cascadeless, or to avoid cascading rollback, if
every transaction in the schedule reads only items that were written by
committed transactions.
• In this case, all items read will not be discarded because the transactions
that wrote them have committed, so no cascading rollback will occur.
• To satisfy this criterion, the r2(X) command in schedules Sd and Se must be
postponed until after T1 has committed (or aborted), thus delaying T2 but
ensuring no cascading rollback if T1 aborts.
Characterizing schedules based on
recoverability
• A cascadeless schedule allows reads only of committed values.
• However, it still permits a transaction to write an item that an uncommitted transaction has written.
Characterizing schedules based on
recoverability: Types of recoverable schedule
• Finally, there is a third, more restrictive type of schedule, called a
strict schedule, in which transactions can neither read nor write an
item X until the last transaction that wrote X has committed (or
aborted).
• Strict schedules simplify the recovery process.
• In a strict schedule, the process of undoing a write_item(X) operation
of an aborted transaction is simply to restore the before image
(old_value or BFIM) of data item X.
• This simple procedure always works correctly for strict schedules, but
it may not work for recoverable or cascadeless schedules.
Characterizing schedules based on
recoverability
• For example, consider schedule Sf:
• Sf: w1(X, 5); w2(X, 8); a1;
• Suppose that the value of X was originally 5, which is the before image
stored in the system log along with the w1(X, 5) operation.
• If T1 aborts, as in Sf, the recovery procedure that restores the before image
of an aborted write operation will restore the value of X to 5, even though
it has already been changed to 8 by transaction T2, thus leading to
potentially incorrect results.
• Although schedule Sf is cascadeless, it is not a strict schedule, since it
permits T2 to write item X even though the transaction T1 that last wrote X
had not yet committed (or aborted).
• A strict schedule does not have this problem.
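The failure can be reproduced concretely. This is a toy model of blind before-image restoration; the logging helper and values are illustrative:

```python
# Toy illustration of why before-image (BFIM) undo requires strict schedules.
# In Sf, T2 writes X while T1 (the previous writer) is still uncommitted;
# undoing T1 by restoring its before image then destroys T2's value.

X = 5                          # original value of X (the BFIM of T1's write)
log = []                       # log of (transaction, item, before_image)

def write_item(txn, new_value):
    global X
    log.append((txn, "X", X))  # record the before image
    X = new_value

write_item("T1", 7)            # w1(X): BFIM 5 logged
write_item("T2", 8)            # w2(X): BFIM 7 logged -- T1 not yet committed!

# a1: abort T1 by restoring the before images of its writes
for txn, item, bfim in reversed(log):
    if txn == "T1":
        X = bfim

print(X)  # 5 -- T2's update to 8 has been wiped out
```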
Characterizing schedules based on
recoverability
• Any strict schedule is also cascadeless, and any cascadeless schedule is also
recoverable.
• Suppose we have i transactions T1, T2, … , Ti, and their number of
operations are n1, n2, … , ni, respectively.
• If we make a set of all possible schedules of these transactions, we can
divide the schedules into two disjoint subsets: recoverable and
nonrecoverable.
• The cascadeless schedules will be a subset of the recoverable schedules,
and the strict schedules will be a subset of the cascadeless schedules.
• Thus, all strict schedules are cascadeless, and all cascadeless schedules are
recoverable.
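The containment strict ⊂ cascadeless ⊂ recoverable can be illustrated with two further checks, again over schedules encoded as (operation, transaction, item) tuples; the helper names are mine:

```python
# Sketch: cascadeless = read only committed writes; strict = neither read
# nor write an item until its last writer has committed or aborted.

def is_cascadeless(schedule):
    last_writer, finished = {}, set()
    for op, t, x in schedule:
        if op in ("c", "a"):
            finished.add(t)
        elif op == "r" and x in last_writer:
            w = last_writer[x]
            if w != t and w not in finished:
                return False          # dirty read
        elif op == "w":
            last_writer[x] = t
    return True

def is_strict(schedule):
    last_writer, finished = {}, set()
    for op, t, x in schedule:
        if op in ("c", "a"):
            finished.add(t)
        elif op in ("r", "w") and x in last_writer:
            w = last_writer[x]
            if w != t and w not in finished:
                return False          # reads or overwrites an uncommitted write
        if op == "w":
            last_writer[x] = t
    return True

# Sf: w1(X) w2(X) a1 -- cascadeless (no dirty read) but not strict
Sf = [("w", 1, "X"), ("w", 2, "X"), ("a", 1, None)]
print(is_cascadeless(Sf), is_strict(Sf))  # True False
```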
Characterizing Schedules Based on
Serializability
• A schedule (or history) S of n
transactions T1, T2, … , Tn is an ordering
of the operations of the transactions.
• Operations from different transactions
can be interleaved in the schedule S.
• We characterize the types of schedules that are always considered to be correct when concurrent transactions are executing.
• Such schedules are known as serializable schedules.
• Suppose that two users, for example two airline reservations agents, submit to the DBMS transactions T1 and T2, as shown in the Figure, at approximately the same time.

(a) Transaction T1. (b) Transaction T2.
Characterizing Schedules Based on
Serializability
• If no interleaving of operations is permitted, there are only two
possible outcomes:
• Execute all the operations of transaction T1 (in sequence) followed by all the
operations of transaction T2 (in sequence).
• Execute all the operations of transaction T2 (in sequence) followed by all the
operations of transaction T1 (in sequence).
• These two schedules—called serial schedules.
• Therefore, in a serial schedule, only one transaction at a time is
active—the commit (or abort) of the active transaction initiates
execution of the next transaction.
Characterizing Schedules Based on
Serializability

Serial schedule A: T1 followed by T2. Serial schedule B: T2 followed by T1


Characterizing Schedules Based on
Serializability
• If interleaving of operations is allowed, there will be many possible
orders (non-serial schedules) in which the system can execute the
individual operations of the transactions. Two possible schedules are
shown in the Figure.
Characterizing Schedules Based on
Serializability
• The definition of serializable schedule is as follows: A schedule S of n
transactions is serializable if it is equivalent to some serial schedule of the
same n transactions.
• Two schedules are called result equivalent if they produce the same final
state of the database.
• Two operations in a schedule are said to conflict if they belong to different
transactions, access the same database item, and either both are
write_item operations or one is a write_item and the other a read_item.
• If two conflicting operations are applied in different orders in two
schedules, the effect can be different on the database or on the
transactions in the schedule, and hence the schedules are not conflict
equivalent.
Characterizing Schedules Based on
Serializability
• Consider 2 schedules, Schedule1 and Schedule2.
• Schedule2 (a non-serial schedule) is conflict serializable when its conflicting operations appear in the same order as in Schedule1 (a serial schedule).
Characterizing Schedules Based on
Serializability
• Two schedules (one being serial
schedule and another being non-
serial) are said to be view
serializable if they satisfy the rules
for being view equivalent to one
another.
• The rules to be upheld are:
1. Initial reads: the same transaction must perform the initial read of each data item in both schedules.
2. Final writes: the same transaction must perform the final write of each data item in both schedules.
3. Intermediate reads (WR dependencies): if a transaction reads a value produced by another transaction's write in one schedule, it must read the value produced by that same write in the other schedule.
Testing for Serializability of a Schedule
• There is a simple algorithm for determining whether a particular schedule
is (conflict) serializable or not.
• The algorithm looks at only the read_item and write_item operations in a
schedule to construct a precedence graph (or serialization graph), which is
a directed graph G = (N, E) that consists of a set of nodes N = {T1, T2, … , Tn }
and a set of directed edges E = {e1, e2, … , em }.
• There is one node in the graph for each transaction Ti in the schedule. Each
edge ei in the graph is of the form (Tj → Tk ), 1 ≤ j ≤ n, 1 ≤ k ≤ n, where Tj is
the starting node of ei and Tk is the ending node of ei.
• In the precedence graph, an edge from Ti to Tj means that transaction Ti
must come before transaction Tj in any serial schedule that is equivalent to
S
Testing for Serializability of a Schedule
• Testing Serializability of a Schedule S:
• For each transaction Ti participating in schedule S, create a node labeled Ti in
the precedence graph.
• For each case in S where Tj executes a read_item(X) after Ti executes a
write_item(X), create an edge (Ti → Tj) in the precedence graph.
• For each case in S where Tj executes a write_item(X) after Ti executes a
read_item(X), create an edge (Ti → Tj) in the precedence graph.
• For each case in S where Tj executes a write_item(X) after Ti executes a
write_item(X), create an edge (Ti → Tj) in the precedence graph.
• The schedule S is serializable if and only if the precedence graph has no
cycles.
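The algorithm above can be sketched directly: build edges from the write/read, read/write, and write/write cases, then test for a cycle. The encoding is the same (operation, transaction, item) shorthand used earlier; this is an illustration, not tied to any particular DBMS:

```python
# Conflict-serializability test: build the precedence graph, check for cycles.

def precedence_edges(schedule):
    edges = set()
    for i, (op1, t1, x1) in enumerate(schedule):
        for op2, t2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))        # Ti must precede Tj serially
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visited, on_stack = set(), set()
    def dfs(node):
        visited.add(node); on_stack.add(node)
        for nxt in graph.get(node, ()):
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True
        on_stack.discard(node)
        return False
    return any(dfs(n) for n in graph if n not in visited)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))

# Sa from the text: r1(X) r2(X) w1(X) r1(Y) w2(X) w1(Y)
# X gives edges in both directions between T1 and T2, i.e. a cycle.
Sa = [("r",1,"X"), ("r",2,"X"), ("w",1,"X"),
      ("r",1,"Y"), ("w",2,"X"), ("w",1,"Y")]
print(is_conflict_serializable(Sa))  # False
```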
Testing for Serializability of a Schedule

(a) Precedence graph for


serial schedule A. (b)
Precedence graph for
serial schedule B.
Testing for Serializability of a Schedule

(c) Precedence graph for


schedule C (not
serializable). (d)
Precedence graph for
schedule D (serializable,
equivalent to schedule
A).
Testing for Serializability of a Schedule
• If the precedence graph has a cycle, it is easy to show that we cannot
create any equivalent serial schedule, so S is not serializable.
• The graph for schedule C has a cycle, so it is not serializable.
• The graph for schedule D has no cycle, so it is serializable, and the
equivalent serial schedule is T1 followed by T2.
• The graphs for schedules A and B have no cycles, as expected,
because the schedules are serial and hence serializable.
Testing for Serializability of a Schedule
• Another example, in which three transactions participate

The read and write operations of three transactions T1, T2, and T3.
Testing for Serializability of a Schedule

Precedence graph for schedule E.

Testing for Serializability of a Schedule

Precedence graph for schedule F.

Testing for Serializability of a Schedule

Precedence graph with two equivalent serial schedules.
Testing for Serializability of a Schedule

Time   T4            T5            T6
t1     Read (A)
t2     A := F1(A)
t3     Read (C)
t4     Write (A)
t5     C := F2(C)
t6                   Read (B)
t7     Write (C)
t8                   Read (A)
t9                                 Read (C)
t10                  B := F3(B)
t11                  Write (B)
t12                                C := F4(C)
t13                                Read (B)
t14                                Write (C)
t15                  A := F5(A)
t16                  Write (A)
t17                                B := F6(B)
t18                                Write (B)

Precedence graph edges: A (T4 → T5), C (T4 → T6), B (T5 → T6)
Equivalent serial schedule: T4 → T5 → T6
Testing for Serializability of a Schedule

Time   T1            T2            T3
t1     Read (A)
t2                   Read (B)
t3     A := F1(A)
t4                                 Read (C)
t5                   B := F2(B)
t6                   Write (B)
t7                                 C := F3(C)
t8                                 Write (C)
t9     Write (A)
t10                                Read (B)
t11                  Read (A)
t12                  A := F4(A)
t13    Read (C)
t14                  Write (A)
t15    C := F5(C)
t16    Write (C)
t17                                B := F6(B)
t18                                Write (B)

Precedence graph edges: A (T1 → T2), B (T2 → T3), C (T3 → T1)
Equivalent serial schedule: NONE
Reason: cycle A (T1 → T2), B (T2 → T3), C (T3 → T1)
Are the following three schedules result
equivalent?
Solution
• To check whether the given schedules are result equivalent or not,
• We will consider some arbitrary values of X and Y.
• Then, we will compare the results produced by each schedule.
• Those schedules which produce the same results will be result equivalent.
• Let X = 2 and Y = 5.
• On substituting these values, the results produced by each schedule are-
• Results by Schedule S1- X = 21 and Y = 10
• Results by Schedule S2- X = 21 and Y = 10
• Results by Schedule S3- X = 11 and Y = 10
• Clearly, the results produced by schedules S1 and S2 are same.
• Thus, we conclude that S1 and S2 are result equivalent schedules.
Are the following two schedules conflict
equivalent?
Solution
• To check whether the given schedules are conflict equivalent or not,
• We will write their order of pairs of conflicting operations.
• Then, we will compare the order of both the schedules.
• If both the schedules are found to have the same order, then they will be conflict equivalent.
• For schedule S1-
• The required order is-
• R1(A) , W2(A)
• W1(A) , R2(A)
• W1(A) , W2(A)
• For schedule S2-
• The required order is-
• R1(A) , W2(A)
• W1(A) , R2(A)
• W1(A) , W2(A)
• Clearly, both the given schedules have the same order.
• Thus, we conclude that S1 and S2 are conflict equivalent schedules.
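The procedure just described — list each schedule's conflicting operation pairs in order, then compare — translates directly into code. A sketch, with an encoding and example schedules of our own (a serial schedule and an interleaving that preserves the same conflict order):

```python
def conflict_pairs(schedule):
    """Ordered pairs of conflicting operations: same item, different
    transactions, and at least one of the two operations is a write."""
    pairs = []
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if x1 == x2 and t1 != t2 and 'W' in (op1, op2):
                pairs.append(((t1, op1, x1), (t2, op2, x2)))
    return pairs

def conflict_equivalent(s1, s2):
    # Same set of operations, and every conflicting pair occurs in the same order.
    return sorted(s1) == sorted(s2) and set(conflict_pairs(s1)) == set(conflict_pairs(s2))

S1 = [('T1','R','A'), ('T1','W','A'), ('T1','R','B'), ('T1','W','B'),
      ('T2','R','A'), ('T2','W','A')]
S2 = [('T1','R','A'), ('T1','W','A'), ('T2','R','A'), ('T1','R','B'),
      ('T2','W','A'), ('T1','W','B')]
print(conflict_equivalent(S1, S2))  # True
```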
Recovery concepts
• Recovery from transaction failures usually means that the database is restored to
the most recent consistent state before the time of failure.
• To do this, the system must keep information about the changes that were
applied to data items by the various transactions.
• This information is typically kept in the system log.
• A typical strategy for recovery may be summarized informally as follows:
• If there is extensive damage to a wide portion of the database due to catastrophic failure,
such as a disk crash, the recovery method restores a past copy of the database that was
backed up to archival storage (typically tape or other large capacity offline storage media)
and reconstructs a more current state by reapplying or redoing the operations of committed
transactions from the backed-up log, up to the time of failure.
• When the database on disk is not physically damaged, and a noncatastrophic failure has
occurred, the recovery strategy is to identify any changes that may cause an inconsistency in
the database.
Recovery concepts
• For recovery from any type of failure data values prior to modification
(BFIM - BeFore Image) and the new value after modification (AFIM –
AFter Image) are required.
• These values and other information are stored in a sequential file called the
system log, for example:

T ID   Back P   Next P   Operation   Data item   BFIM      AFIM
T1     0        1        Begin
T1     1        4        Write       X           X = 100   X = 200
T2     0        8        Begin
T1     2        5        Write       Y           Y = 50    Y = 100
T1     4        7        Read        M           M = 200   M = 200
T3     0        9        Read        N           N = 400   N = 400
T1     5        nil      End
Recovery concepts
• To maintain atomicity, a transaction’s operations are redone or
undone.
• Undo: Restore all BFIMs on to disk (Remove all AFIMs).
• Redo: Restore all AFIMs on to disk.
• Database recovery is achieved either by performing only Undos or
only Redos or by a combination of the two.
Recovery concepts: Checkpointing
• From time to time, the DBMS flushes its buffers to the database disk to
minimize the task of recovery.
• The following steps define a checkpoint operation:
1. Suspend execution of transactions temporarily.
2. Force-write the modified buffer data to disk (unless a NO-UNDO algorithm is used).
3. Write a [checkpoint] record to the log, and save the log to disk.
4. Resume normal transaction execution.
• During recovery, redo or undo is required only for transactions appearing
after the [checkpoint] record.
• If the recovery system sees a log with <Tn, Start> and <Tn, Commit>
or just <Tn, Commit>, it puts the transaction in the redo-list.
• If the recovery system sees a log with <Tn, Start> but no commit or
abort log found, it puts the transaction in undo-list.
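The redo-list/undo-list rules above amount to one scan of the log. A sketch in Python — the tuple record shapes are our own, not a real DBMS log format:

```python
def redo_undo_lists(log):
    """Classify transactions per the rules above.
    log: list of tuples like ('start', 'T1'), ('commit', 'T1'),
    ('abort', 'T1'), ('checkpoint',)."""
    started, committed, aborted = set(), set(), set()
    for rec in log:
        if rec[0] == 'start':
            started.add(rec[1])
        elif rec[0] == 'commit':
            committed.add(rec[1])
        elif rec[0] == 'abort':
            aborted.add(rec[1])
    redo = committed                       # <Tn, Commit> found -> redo-list
    undo = started - committed - aborted   # started but never finished -> undo-list
    return redo, undo

log = [('start', 'T1'), ('commit', 'T1'), ('checkpoint',),
       ('start', 'T2'), ('commit', 'T2'), ('start', 'T3')]
redo, undo = redo_undo_lists(log)
print(sorted(redo), sorted(undo))  # ['T1', 'T2'] ['T3']
```

In practice the scan starts at the last [checkpoint] record: T1 here committed before the checkpoint, so its updates are already safely on disk and a real system would skip redoing it.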
Recovery concepts: Checkpointing
• Any transaction that was running at
the time of failure needs to be
undone and restarted.
• Any transactions that committed
since the last checkpoint need to be
redone.
• Transactions of type T1 need no
recovery.
• Transactions of type T3 or T5 need
to be undone and restarted.
• Transactions of type T2 or T4 need
to be redone.
Recovery based on deferred update
• Conceptually, we can distinguish two main policies for recovery from noncatastrophic
transaction failures: deferred update and immediate update. (Recovery from a
catastrophic failure, such as a disk crash, instead relies on restoring an archived
backup of the database, as described earlier.)
• The deferred update techniques do not physically update the database on disk until a
transaction commits; then the updates are recorded in the database.
• Before reaching commit, all transaction updates are recorded in the local transaction
workspace or in the main memory buffers that the DBMS maintains.
• Before commit, the updates are recorded persistently in the log file on disk, and then
after commit, the updates are written to the database from the main memory buffers.
• If a transaction fails before reaching its commit point, it will not have changed the
database on disk in any way, so UNDO is not needed.
• It may be necessary to REDO the effect of the operations of a committed transaction
from the log, because their effect may not yet have been recorded in the database on
disk.
• Hence, deferred update is also known as the NO-UNDO/REDO algorithm.
Recovery based on deferred update
• Two tables are required for implementing this protocol.
• Active table: All active transactions are entered in this table.
• Commit table: Transactions to be committed are entered in this table.
• These tables are filled by scanning through the log from the last
checkpoint during recovery.
• During recovery
• All transactions of the commit table are redone in the order they were
written to log, and
• All transactions of active tables are ignored.
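This NO-UNDO/REDO pass can be sketched as follows. The record shapes are our own; note that under deferred update the log carries only the new value (AFIM), since the old database value is never overwritten before commit:

```python
def deferred_update_recover(log, db):
    """NO-UNDO/REDO sketch: redo the writes of committed transactions in log
    order, ignore active (uncommitted) ones.  Record shapes (assumed):
    ('start', T), ('write', T, item, new_value), ('commit', T)."""
    committed = {r[1] for r in log if r[0] == 'commit'}   # the commit table
    for rec in log:                                       # log order = redo order
        if rec[0] == 'write' and rec[1] in committed:
            _, _, item, new_value = rec
            db[item] = new_value                          # REDO: reapply the AFIM
    return db

db = {'A': 100, 'B': 200}
log = [('start', 'T1'), ('write', 'T1', 'A', 200), ('write', 'T1', 'B', 400),
       ('commit', 'T1'),
       ('start', 'T2'), ('write', 'T2', 'A', 999)]   # T2 is still active: ignored
print(deferred_update_recover(log, db))  # {'A': 200, 'B': 400}
```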
Recovery based on deferred update: Single Transaction

Initial values: A = 100, B = 200. Under deferred update, the log records only the
new value, and the database on disk is updated only after commit.

T1: Read (A); A := A + 100; Write (A); Read (B); B := B + 200; Write (B); Commit

Case 1 — T1 fails after commit:
System log: <start_transaction, T1>, <write_item, T1, A, 200>,
<write_item, T1, B, 400>, <Commit, T1>
The commit record is present, so T1 is REDOne from the log.
After recovery of T1: updated values A = 200, B = 400.

Case 2 — T1 fails before commit:
System log: <start_transaction, T1>, <write_item, T1, A, 200>,
<write_item, T1, B, 400>
No commit record, so no REDO is performed (and no UNDO is needed, since the
disk was never updated). After recovery of T1: A = 100, B = 200.
Recovery based on deferred update: Multiple Transactions

[Figure: The READ and WRITE operations of four transactions, and the system log
at the point of crash.]

T2 and T3 are ignored because they did not reach their commit points. T4 is
redone because its commit point is after the last system checkpoint.
Recovery techniques based on immediate
update
• In the immediate update techniques, the database may be updated by some
operations of a transaction before the transaction reaches its commit point.
• However, these operations must also be recorded in the log on disk by force-
writing before they are applied to the database on disk, making recovery still
possible.
• If a transaction fails after recording some changes in the database on disk but
before reaching its commit point, the effect of its operations on the database
must be undone; that is, the transaction must be rolled back.
• In the general case of immediate update, both undo and redo may be required
during recovery. This technique, known as the UNDO/REDO algorithm, requires
both operations during recovery and is used most often in practice.
• A variation of the algorithm where all updates are required to be recorded in the
database on disk before a transaction commits requires undo only, so it is known
as the UNDO/NO-REDO algorithm.
Recovery techniques based on immediate
update
• Theoretically, we can distinguish two main categories of immediate
update algorithms:
1. If the recovery technique ensures that all updates of a transaction are
recorded in the database on disk before the transaction commits, there is
never a need to REDO any operations of committed transactions. This is
called the UNDO/NO-REDO recovery algorithm. In this method, all updates
by a transaction must be recorded on disk before the transaction commits,
so that REDO is never needed.
2. If the transaction is allowed to commit before all its changes are written to
the database, we have the most general case, known as the UNDO/REDO
recovery algorithm. This is also the most complex technique, but the most
commonly used in practice.
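A minimal sketch of the general UNDO/REDO pass: undo is applied backwards through the log using the old values (BFIMs) of uncommitted transactions, then redo is applied forwards using the new values (AFIMs) of committed ones. The record shapes are our own assumptions:

```python
def undo_redo_recover(log, db):
    """UNDO/REDO for immediate update.  Record shapes (assumed):
    ('start', T), ('write', T, item, old_value, new_value), ('commit', T)."""
    committed = {r[1] for r in log if r[0] == 'commit'}
    # UNDO: scan backwards, restoring the BFIM of every uncommitted write.
    for rec in reversed(log):
        if rec[0] == 'write' and rec[1] not in committed:
            _, _, item, old_value, _ = rec
            db[item] = old_value
    # REDO: scan forwards, reapplying the AFIM of every committed write.
    for rec in log:
        if rec[0] == 'write' and rec[1] in committed:
            _, _, item, _, new_value = rec
            db[item] = new_value
    return db

db = {'A': 150, 'B': 400}   # the crash left a mix of old and new values on disk
log = [('start', 'T1'), ('write', 'T1', 'A', 100, 200),
       ('write', 'T1', 'B', 200, 400), ('commit', 'T1'),
       ('start', 'T2'), ('write', 'T2', 'A', 200, 150)]  # T2 uncommitted
print(undo_redo_recover(log, db))  # {'A': 200, 'B': 400}
```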
Recovery techniques based on immediate update: Example

Initial values: A = 100, B = 200. The log records both the old value (BFIM) and
the new value (AFIM), and the database on disk may be updated before commit
(immediately after the write reaches main memory), since the log record is
force-written first.

T1: Read (A); A := A + 100; Write (A); Read (B); B := B + 200; Write (B); Commit

Case 1 — T1 fails after commit:
System log: <start_transaction, T1>, <write_item, T1, A, 100, 200>,
<write_item, T1, B, 200, 400>, <Commit, T1>
The commit record is present, so T1 is REDOne.
After recovery of T1: updated values A = 200, B = 400.

Case 2 — T1 fails before commit:
System log: <start_transaction, T1>, <write_item, T1, A, 100, 200>,
<write_item, T1, B, 200, 400>
No commit record, so T1 is UNDOne; the old values are fetched from the log.
After recovery of T1: A = 100, B = 200.
Shadow paging
• This recovery scheme does not require the use of a log in a single-user
environment.
• In a multiuser environment, a log may be needed for the concurrency control
method.
• Shadow paging considers the database to be made up of a number of fixed-size
disk pages (or disk blocks)—say, n—for recovery purposes.
• A directory with n entries is constructed, where the i-th entry points to the i-th
database page on disk.
• The directory is kept in main memory if it is not too large, and all references—
reads or writes—to database pages on disk go through it.
• When a transaction begins executing, the current directory—whose entries point
to the most recent or current database pages on disk—is copied into a shadow
directory.
• The shadow directory is then saved on disk while the current directory is used by
the transaction.
Shadow paging
• During transaction execution, the shadow directory is never modified.
• When a write_item operation is performed, a new copy of the modified
database page is created, but the old copy of that page is not overwritten.
• Instead, the new page is written elsewhere—on some previously unused
disk block.
• The current directory entry is modified to point to the new disk block,
whereas the shadow directory is not modified and continues to point to
the old unmodified disk block.
• For pages updated by the transaction, two versions are kept.
• The old version is referenced by the shadow directory and the new version
by the current directory.
Shadow paging
Shadow paging
• To recover from a failure during transaction execution, it is sufficient to free
the modified database pages and to discard the current directory.
• The state of the database before transaction execution is available through
the shadow directory, and that state is recovered by reinstating the shadow
directory.
• The database thus is returned to its state prior to the transaction that was
executing when the crash occurred, and any modified pages are discarded.
• Committing a transaction corresponds to discarding the previous shadow
directory.
• Since recovery involves neither undoing nor redoing data items, this
technique can be categorized as a NO-UNDO/NO-REDO technique for
recovery.
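The mechanism can be sketched with dictionaries standing in for disk pages and directories. This is a toy model under our own assumptions (in-memory "pages", integer page ids), not a real implementation:

```python
class ShadowPagedDB:
    """Toy shadow-paging sketch.  NO-UNDO/NO-REDO: commit replaces the shadow
    directory with the current one; abort simply reinstates the shadow."""
    def __init__(self, pages):
        self.pages = dict(pages)                 # page id -> stored value
        self.directory = {i: i for i in pages}   # shadow (committed) directory
        self.current = dict(self.directory)      # current (working) directory
        self.next_id = max(pages) + 1            # next unused disk block

    def read(self, page_no):
        return self.pages[self.current[page_no]]

    def write(self, page_no, value):
        # New version goes to a fresh block; the shadow entry still points
        # to the old, unmodified block.
        self.pages[self.next_id] = value
        self.current[page_no] = self.next_id
        self.next_id += 1

    def commit(self):
        self.directory = dict(self.current)      # discard the old shadow

    def abort(self):
        self.current = dict(self.directory)      # reinstate the shadow directory

db = ShadowPagedDB({1: 20, 2: 10, 3: 40})        # pages holding X, Y, Z
db.write(1, 10)
db.write(2, 20)
db.abort()                                       # failure before commit
print(db.read(1), db.read(2))  # 20 10: old state recovered, no undo/redo
```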
Shadow paging: Example

Initial state: the database pages on disk hold X = 20, Y = 10, Z = 40. The
current directory in main memory points to these pages and is copied into the
shadow directory, which is saved on disk.

1. The transaction writes X = 10. The new value is placed on a previously
unused disk block, and the current directory entry for X is changed to point
to it. The shadow directory still points to the old block (X = 20).
2. The transaction writes Y = 20. Again a new block is written, and the current
directory entry for Y is updated; the shadow directory still points to Y = 10.
3. Commit (successful transaction): the shadow directory is discarded, and the
current directory becomes the new committed directory. The database state is
X = 10, Y = 20, Z = 40, and the old blocks (X = 20, Y = 10) can be freed.
4. If instead the transaction fails before commit: the current directory and
the newly written blocks are discarded, and the shadow directory is
reinstated. The database state is again X = 20, Y = 10, Z = 40.

Either way, recovery requires No Undo and No Redo.