Advanced Database System(CoSc2042)
Chapter – 3
    TRANSACTION PROCESSING
                                     1
Chapter Outline
1. Introduction to Transaction Processing
2. Transaction and System Concepts
3. Desirable Properties of Transactions
4. Characterizing Schedules based on Recoverability and
   SERIALlZABILlTY
5. Transaction Support in SQL
                                                          2
      Definition of transactions
— A transaction is a unit of a program execution that accesses
  and possibly modifies various data objects (tuples, relations).
— Transaction are units or sequences of work accomplished in a
  logical order, whether in a manual fashion by a user or
  automatically by some sort of a database program.
— A transaction (set of operations) may be specified in SQL, or
  may be embedded within a program.
                                                                    3
INTRODUCTION TO TRANSACTION PROCESSING
 Single-User System: At most one
 user at a time can use the system.
 Multiuser System: Many users can
 access the system concurrently.
 Concurrency: means allowing more
 than one transaction to run
 simultaneously on the same database.
 Interleaved processing: Concurrent       Figure shows- Interleaved processing versus
 execution of processes are interleaved   parallel processing of concurrent
 in a single CPU                          transactions.
 Parallel processing:
  – Processes are concurrently
    executed in multiple CPUs.
                                                                                4
 Transaction boundaries:
 Begin and End transaction.
 An application program may contain several transactions separated by;
     Begin and End transaction boundaries.
     Suppose a bank employee transfers $500 from A‘s account to B's account.
     This very simple and small transaction involves several low-level tasks.
                                                                                 5
    Simple Model of a Database
   A database - collection of named data items
   Granularity of data - a field, a record , or a whole disk block
   Basic operations are read and write
   read(A, x): assign value of database object A to variable x;
    write(x , A): write value of variable x to database object A
   Example: Let T1 be a transaction that transfers $500 from account A to
    account B.
    This transaction can be defined as :
                                       T1
                                                                             6
 READ/ WRITE OPERATIONS
 Basic unit of data transfer from the disk to the computer main
   memory is one block.
 In general, a data item (what is read or written) will be the field of
   some record in the database, although it may be a larger unit such as
   a record or even a whole block.
                                                                       7
write_item(X) command includes the following steps:
 Find the address of the disk block that contains item X.
 Copy that disk block into a buffer in main memory (if that disk
  block is not already in some main memory buffer).
 Copy item X from the program variable named X into its correct
  location in the buffer.
 Store the updated block from the buffer back to disk (either
  immediately or at some later point in time).
                                                                    8
PROPERTIES OF TRANSACTIONS
 Properties of a transaction generally called ACID
  properties.
 Those are;
   Atomicity
   Consistency preservation
   Isolation
   Durability (permanency)
                                                      9
                              Atomic transactions
• Atomicity: A transaction is an atomic unit of processing; it is either
  performed in its entirety or not performed at all.
• Example: John wants to move $200 from his savings account to his
  checking account.
1) Money must be subtracted from savings account.
2) Money must be added to checking count.
• If both happen, John and the bank are both happy.
   If neither happens, John and the bank are both happy.
   If only one happens, either John or the bank will be unhappy.
  John’s transfer must be all or nothing.
                                                                           10
Consistency
• A correct execution of the transaction must take the database from
  one consistent state to another.
  •Example: Wilma tries to withdraw $1000 from account 387.
                                                                       11
Transactions are consistent
 A transaction must leave the database in valid state.
    valid state == no constraint violations
 Constraint is a declared rule defining /specifying database states
 Constraints may be violated temporarily …
     but must be corrected before the transaction completes.
                                                                 12
Isolation
Example:
            13
 Durability
• Once a transaction changes the database and the changes are
  committed, these changes must never be lost because of subsequent
  failure.
                                                                      14
Concurrency Control
   Isolation (+ Consistency) => Concurrency Control
 Concurrency means allowing more than one transaction to run simultaneously on
 the same database.
 When several transactions run concurrently database consistency can be
destroyed.
 It is meant to coordinate simultaneous transactions while preserving data
  integrity.
 It is about to control the multi-user access of database.
                                                                              15
WHY CONCURRENCY CONTROL IS NEEDED?
 The Lost Update Problem.
    This occurs when two transactions that access the same database items have
     their operations interleaved in a way that makes the value of some database
     item incorrect.
                                        The update performed by T1 gets
                                        lost;
                                        possible solution:
                                        T1 locks/unlocks database object A
                                         ⇒ T2 cannot read A while A is
                                        modified by T1
                                                                             16
Example
 The Lost Update Problem
                     T1                   T2       State of X
                 read_item(X);                       20
                 X:= X+10;        read_item(X);      20
                                  X:= X+20;
                                  write_item(X);     40
                                  commit;
Lost update      write_item(X);                      30
                 commit;
     Changes of T2 are lost.
                                                                17
 The Temporary Update (or Dirty Read) Problem.
   This occurs when one transaction updates a database item and
    then the transaction fails for some reason, and the updated
    item is accessed by another transaction before it is changed
    back to its original value.
                                  T1 modifies db object, and then the
                                  transactionT1 fails for some reason.
                                  Meanwhile the modified db object,
                                  however, has been accessed by another
                                  transaction T2. Thus T2 has read data
                                  that “never existed”.
                                                                         18
   Example
  The Temporary Update/ Dirty Read Problem
                      T1                T2            State of X   sum
                read_item(X);                            20        0
                X:= X+10;
Dirty update    write_item(X);
                                   read_ item(X);        30
                                   sum:= sum+X;
                                   write_item(sum);                30
                X:=X+10;           commit;
                write_item(X);                           40
                commit;
      T2 sees dirty data of T1.
                                                                         19
 The Incorrect Summary Problem
 If one transaction is calculating an aggregate summary function on a
  number of records while other transactions are updating some of these
  records, the aggregate function may calculate some values before they
  are updated and others after they are updated.
 In this schedule, the total computed by T1 is wrong (off by 100).
 ⇒T1 must lock/unlock several db objects
                                                                    20
 Example
The Incorrect Summary Problem
                                                                          Let A=100
          T1               T2                  State of X   State of Y      sum=0
                       read_item(A);                                         0
                       sum:= sum+A;
                       write_item(A);
                       commit;                                              100
  read_item(X);                                   30
  X:= X-10;
  write_item(X);
  commit;             read_item(X);               20
                      sum:= sum+X;
                      read_item(Y);                         10
                      sum:= sum+Y;
  read_item(Y);                                             10
  Y:= Y+10;
  write_item(Y);                                            20
  commit;
                           Incorrect summary
 T2 reads X after 10 is subtracted and reads Y before 10 is added, hence incorrect
  summary.                                                                        21
 Unrepeatable read problem
 Here a transaction T1 reads the same item twice and the item is
  charged by another transaction T2 between the reads, T1 receives
  different values for its two reads of the same item.
                                                                     22
 WHY RECOVERY IS NEEDED: (WHAT CAUSES A
  TRANSACTION TO FAIL?)
   A computer failure (system crash)
   A transaction or system error
   Local errors or exception conditions detected by the
    transaction:
   Concurrency control enforcement
   Disk failure
   Physical problems and catastrophes
                                                           23
  TRANSACTION STATE
 Active state: the transaction is being executed. This is the initial state of
  every transaction.
 Partially committed state: transaction goes into partially committed
   state after the end of a transaction.
 Committed state: a transaction executes all its operations successfully.
 Failed state: checks made by the database recovery system fails.
 Aborted − If any of the checks fails and the transaction has reached a
   failed state.
 The database recovery module after a transaction aborts −
    Re-start the transaction
    Kill the transaction
                                                                            24
25
Operations
 Recovery manager keeps track of the following operations:
    begin_transaction
    read or write
    end_transaction
    commit_transaction
    rollback (or abort)
 Recovery techniques use the following operators:
    Undo
    Redo
                                                              26
THE SYSTEM LOG
 Log or Journal: The log keeps track of all transaction operations
  that affect the values of database items.
    Needed to permit recovery from transaction failures.
    The log is kept on disk, so it is not affected by any type of
      failure except for disk or catastrophic failure.
    Log is periodically backed up to archival storage (tape) to
      guard against such catastrophic failures.
    Have a unique transaction-id that is generated automatically
      by the system
                                                                     27
Types of log record
–[start_transaction, T] - Indicates that transaction T has started execution.
–[write_item, T, X, old_value, new_value] - Indicates that transaction T has
changed the value of database item X from old_value to new_value.
–[read_item, T, X] - Indicates that transaction T has read the value of database item
X.
–[commit, T] - Indicates that transaction T has completed successfully, and affirms that
its effect can be committed (recorded permanently) to the database.
–[abort, T] - Indicates that transaction T has been aborted.
                                                                                    28
Commit Point of a Transaction
 Definition: a transaction T reaches its commit point when,
    all its operations accessing DB are executed successfully and
      changes are recorded in the log.
    Beyond the commit point, the transaction is said to be
      committed, and its effect is assumed to be permanently recorded
      in the database.
    The transaction then writes an entry [commit,T] into the log.
                                                                     29
                                     Con..
 Undo (Roll Back) of transactions:
     Needed for transactions that have a [start_transaction,T] entry to the log
      but no commit entry [commit,T] into the log.
 Redoing transactions:
     Transactions that have commit entry in the log
     write entries are redone from log
 Force writing a log:
     Before a transaction reaches its commit point,
     Write log to disk
     This process is called force-writing the log file before committing a
      transaction.
                                                                              30
SCHEDULES
 When transactions are executing concurrently in an interleaved
  fashion, the order of execution of operations from various
  transactions, is known as a transaction schedule (or history).
                                 • Transaction Schedule reflects
                                    chronological order of operations
                                                                        31
Characterizing schedules based on
• Recoverability- How good is the system at recovering from errors?
• Serializability - find schedules that allow transactions to execute
   concurrently without interfering one another.
• Schedules classified on recoverability:
     Recoverable schedule- no transaction needs to be rolled back.
     Cascadeless schedule - every transaction reads only the items that are
       written by committed transactions.
     Schedules requiring cascaded rollback – uncommitted transactions that
       read an item from a failed transaction must be rolled back.
     Strict Schedules - a transaction can neither read nor write an item X
       until the last transaction that wrote X has committed.                  32
 Characterizing schedules based on Serializability
 Serial schedule
   • Transactions are ordered one after the other. Otherwise, the schedule is
     called nonserial schedule.
 Serializable schedule
   • A schedule is equivalent to some serial schedule of the same n
     transactions.
 Result equivalent
   • Two schedules are producing the same final state of the database.
 Conflict equivalent
   • The order of any two conflicting operations is the same in both schedules.
                                                                                33
Figure 3.2 Examples of serial and nonserial schedules involving transactions
T1 and T2. (a) Serial schedule A: T1 followed by T2. (b) Serial schedule B: T2
followed by T1. (c) Two nonserial schedules C and D with interleaving of
operations.
                                                                           34
Schedule Notation
•A more compact notation for schedules:
       b3, r3(Y), w3(Y), e3, c3                T3
                                               begin
                      r3(Y)                    read(Y)
                                               Y = Y+1
                                               write(Y)
        operation               data item
                                               end
                   transaction                 commit
    note: we ignore the computations on the local copies of the data when
    considering schedules (they're not interesting)
                                                                            35
Examples
A serial schedule is one in which the transactions do not overlap (in
time).
 b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,   These are all serial schedules for the
 b2,r2(X),w2(X),e2,c2,               three example transactions
 b3,r3(Y),w3(Y),e3,c3
                                     There are six possible serial schedules
 b2,r2(X),w2(X),e2,c2,               for three transactions
 b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,
 b3,r3(Y),w3(Y),e3,c3                n! possible serial schedules for n
                                     transactions
 b2,r2(X),w2(X),e2,c2,
 b3,r3(Y),w3(Y),e3,c3,
 b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1
                                                                          36
• Types of Serializability
    – Conflict Serializability
    – View Serializability:
 Conflict serializable:
     A schedule S is said to be conflict serializable if it is conflict
        equivalent to some serial schedule S’.
    •   Being serializable is not the same as being serial
    •   Being serializable implies that the schedule is a correct schedule.
    •   It will leave the database in a consistent state.
    •   Serializability is hard to check.
• View serializability: definition of serializability based on view
   equivalence.                                                            37
• Interleaving of operations occurs in an operating system through some
  scheduler
• Difficult to determine before hand how the operations in a schedule will be
  interleaved.
Fig 3.3. Conflicts between operations of two transactions:
                                                                                38
                        Conflict Equivalence
• Two schedules are conflict equivalent if the order of any two conflicting operations is
   the same in both schedules.
 • Two operations conflict
 – they access the same data item (read or write)
 – if they belong to different transactions
 – at least one is a write
   T1: b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,
                                                     conflicting operations:
                                                     r1(X),w2(X)
   T2: b2,r2(X),w2(X),e2,c2                          w1(X), r2(X)
                                                     w1(X), w2(X)
   – Find the conflicting operation?
                                                                                     39
  Example: Conflict Equivalence
 schedule 1:
  b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,
  b2,r2(X),w2(X),e2,c2
schedule 2:         r1(X) < w2(X), w1(X) < r2(X), w1(X) <
                    w2(X)
  b2,r2(X),w2(X),
  b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1, e2,c2
                  w2(X) < r1(X), r2(X) < w1(X), w2(X) < w1(X)
 schedule 3:
 b1,r1(X),w1(X),
 b2,r2(X),w2(X),e2,c2, r1(Y),w1(Y),e1,c1,
                     r1(X) < w2(X), w1(X) < r2(X), w1(X) < w2(X)
Schedule1and schedule 3 are conflict equivalent schedule 2 is not
conflict equivalent to either schedule 1 or 3
                                                                    40
       Testing for Conflict Serializability
• Precedence graphs are a more efficient test
   – graph indicates a partial order on the transactions required
     by the order of the conflicting operations.
   – the partial order must hold in any conflict equivalent serial
     schedule
   – if there is a loop in the graph, the partial order is not possible
     in any serial schedule
   – if the graph has no loops, the schedule is conflict serializable
                                                                     41
Precedence Graph Examples: find the graph the conflict operation between the transactions?
schedule 3:
b1,r1(X),w1(X),
b2,r2(X),w2(X),e2,c2, r1(Y),w1(Y),e1,c1,
Find the conflict operations ?
                   r1(X) < w2(X), w1(X) < r2(X), w1(X) < w2(X)
                        r1(X) < w2(X)                                         arrows
                                                                             indicate
T1                                                              T2
                      w1(X) < r2(X)                                           that T1
                                                                           precedes T2
                       w1(X) < w2(X)
                   schedule 3 is conflict serializable
           it is conflict equivalent to some serial schedule
                        in which T1 precedes T2
                                                                                             42
        Precedence Graph Examples
schedule 2:
b2,r2(X),w2(X),        b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,e2,c2
              w2(X) < r1(X), r2(X) < w1(X), w2(X) < w1(X)
                           w2(X) < r1(X)
       T1                                               T2
                         r2(X) < w1(X)
                          w2(X) < w1(X)
             schedule 2 is conflict serializable
     it is conflict equivalent to some serial schedule
                  in which T2 precedes T1.
                                                             43
              Precedence Graph Examples
schedule 4:
b2,r2(X), b1,r1(X),w1(X),r1(Y),w1(Y),w2(X),e1,c1,e2,c2
            r1(X) < w2(X), r2(X) < w1(X), w1(X) < w2(X)
                          r1(X) < w2(X)
       T1                                                 T2
                        r2(X) < w1(X)
                          w1(X) < w2(X)
          schedule 4 is not conflict serializable
                there is no serial schedule
      in which T2 precedes T1 and T1 precedes T2
                                                               44
Transaction Support in SQL
• SQL Commands used to control transactions:
  – COMMIT: to save the changes.
  – ROLLBACK: to rollback the changes.
  – SAVEPOINT: creates points within groups of
    transactions in which to ROLLBACK
  – SET TRANSACTION: Places a name on a transaction.
                                                   45
Ch-3 Assignment
Q1. Using the precedence graph as a method of checking
Serializability based on this find the following questions
S: r1(x) r2(z) r3(x) r1(z) r2(y) r3(y) w1(x) w2(z) w3(y) w2(y)
   e1,c1,e2,c2,e3,c3
A. Find the Ordering of conflicting operations?
B. Is this schedule serializable?
C. Is the schedule correct?
                                                                 46
Q2. Consider the schedule given below, in which, transaction T1 transfers
money from account A to account B and in the meantime, transaction T2
calculates the sum of 3 accounts namely, A, B, and C. The third column shows
the account balances and calculated values after every instruction is executed.
Discuss what problem is found in the schedule and what will be the correct
value of Accounts A, B & C averages?
                                                                             47