KEMBAR78
16 Multi Version Concurrency Control | PDF | Software Engineering | Information Management
0% found this document useful (0 votes)
91 views66 pages

16 Multi Version Concurrency Control

Multi-version concurrency control (MVCC) allows multiple physical versions of data to exist concurrently. It avoids locking by allowing transactions to read past versions of rows while preserving a consistent snapshot. Key aspects of MVCC include maintaining version chains through tuple pointers, storing versions together or separately, and timestamp-based visibility rules to determine which versions a transaction can see.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
91 views66 pages

16 Multi Version Concurrency Control

Multi-version concurrency control (MVCC) allows multiple physical versions of data to exist concurrently. It avoids locking by allowing transactions to read past versions of rows while preserving a consistent snapshot. Key aspects of MVCC include maintaining version chains through tuple pointers, storing versions together or separately, and timestamp-based visibility rules to determine which versions a transaction can see.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 66

Multi-Version Concurrency Control

Lecture 16: Multi-Version Concurrency Control

1 / 66
Multi-Version Concurrency Control Recap

Recap

2 / 66
Multi-Version Concurrency Control Recap

Optimistic Concurrency Control

• The DBMS creates a private workspace for each txn.


▶ Any object read is copied into workspace.
▶ Modifications are applied to workspace.
• When a txn commits, the DBMS compares workspace write set to see whether it
conflicts with other txns.
• If there are no conflicts, the write set is installed into the global database.

3 / 66
Multi-Version Concurrency Control Recap

OCC Phases

• Phase 1 – Read:
▶ Track the read/write sets of txns and store their writes in a private workspace.
• Phase 2 – Validation:
▶ When a txn commits, check whether it conflicts with other txns.
• Phase 3 – Write:
▶ If validation succeeds, apply private changes to database. Otherwise abort and restart the
txn.

4 / 66
Multi-Version Concurrency Control Recap

Today’s Agenda

• Multi-Version Concurrency Control


• Design Decisions
▶ Concurrency Control Protocol
▶ Version Storage
▶ Garbage Collection
▶ Index Management

5 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

Multi-Version Concurrency Control

6 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

Multi-Version Concurrency Control

• The DBMS maintains multiple physical versions of a single logical object in the
database:
▶ When a txn writes to an object, the DBMS creates a new version of that object (instead of
private workspace in OCC)
▶ When a txn reads an object, it reads the newest version that existed when the txn started.

7 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC HISTORY

• Protocol was first proposed in 1978 MIT PhD dissertation.


• First implementations was Rdb/VMS and InterBase at DEC in early 1980s.
▶ Both were by Jim Starkey, co-founder of NuoDB.
▶ DEC Rdb/VMS is now "Oracle Rdb"
▶ InterBase was open-sourced as Firebird.

8 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

Multi-Version Concurrency Control

• Writers don’t block readers. Readers don’t block writers.


• Read-only txns can read a consistent snapshot without acquiring locks.
▶ Use timestamps to determine visibility.
• Easily support time-travel queries.

9 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 1

10 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 1

11 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 1

12 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 1

13 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 1

14 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 2

15 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 2

16 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 2

17 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 2

18 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 2

19 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 2

20 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC – Example 2

21 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

Multi-Version Concurrency Control

• MVCC is more than just a Concurrency Control protocol.


• It completely affects how the DBMS manages transactions and the database.
• Examples: Oracle, SAP HANA, PostgreSQL, CockroachDB

22 / 66
Multi-Version Concurrency Control Multi-Version Concurrency Control

MVCC Design Decisions

• Concurrency Control Protocol


• Version Storage
• Garbage Collection
• Index Management

23 / 66
Multi-Version Concurrency Control Concurrency Control Protocol

Concurrency Control Protocol

24 / 66
Multi-Version Concurrency Control Concurrency Control Protocol

Concurrency Control Protocol

• Approach 1: Timestamp Ordering


▶ Assign txns timestamps that determine serial order.
• Approach 2: Optimistic Concurrency Control
▶ Three-phase protocol from last class.
▶ Use private workspace for new versions.
• Approach 3: Two-Phase Locking
▶ Txns acquire appropriate lock on physical version before they can read/write a logical
tuple.

25 / 66
Multi-Version Concurrency Control Version Storage

Version Storage

26 / 66
Multi-Version Concurrency Control Version Storage

Version Storage

• The DBMS uses the tuples’ pointer field to create a version chain per logical tuple.
▶ This allows the DBMS to find the version that is visible to a particular txn at runtime.
▶ Indexes always point to the head of the chain.
• Different storage schemes determine where/what to store for each version.

27 / 66
Multi-Version Concurrency Control Version Storage

Version Storage

• Approach 1: Append-Only Storage


▶ New versions are appended to the same table space.
• Approach 2: Time-Travel Storage
▶ Old versions are copied to separate table space.
• Approach 3: Delta Storage
▶ The original values of the modified attributes are copied into a separate delta record space.

28 / 66
Multi-Version Concurrency Control Version Storage

Append-Only Storage
• All of the physical versions of a logical tuple are stored in the same table space. The
versions are mixed together.
• On every update, append a new version of the tuple into an empty space in the table.

29 / 66
Multi-Version Concurrency Control Version Storage

Append-Only Storage
• All of the physical versions of a logical tuple are stored in the same table space. The
versions are mixed together.
• On every update, append a new version of the tuple into an empty space in the table.

30 / 66
Multi-Version Concurrency Control Version Storage

Append-Only Storage
• All of the physical versions of a logical tuple are stored in the same table space. The
versions are mixed together.
• On every update, append a new version of the tuple into an empty space in the table.

31 / 66
Multi-Version Concurrency Control Version Storage

Version Chain Ordering

• Approach 1: Oldest-to-Newest (O2N)


▶ Just append new version to end of the chain.
▶ Have to traverse chain on look-ups.
• Approach 2: Newest-to-Oldest (N2O)
▶ Have to update index pointers for every new version.
▶ Don’t have to traverse chain on look ups.

32 / 66
Multi-Version Concurrency Control Version Storage

Time-Travel Storage

• On every update, copy the current version to the time-travel table. Update pointers.
• Overwrite master version in the main table. Update pointers.

33 / 66
Multi-Version Concurrency Control Version Storage

Time-Travel Storage

• On every update, copy the current version to the time-travel table. Update pointers.
• Overwrite master version in the main table. Update pointers.

34 / 66
Multi-Version Concurrency Control Version Storage

Time-Travel Storage

• On every update, copy the current version to the time-travel table. Update pointers.
• Overwrite master version in the main table. Update pointers.

35 / 66
Multi-Version Concurrency Control Version Storage

Time-Travel Storage

• On every update, copy the current version to the time-travel table. Update pointers.
• Overwrite master version in the main table. Update pointers.

36 / 66
Multi-Version Concurrency Control Version Storage

Delta Storage

• On every update, copy only the values that were modified to the delta storage and
overwrite the master version.
• Txns can recreate old versions by applying the delta in reverse order.

37 / 66
Multi-Version Concurrency Control Version Storage

Delta Storage

• On every update, copy only the values that were modified to the delta storage and
overwrite the master version.
• Txns can recreate old versions by applying the delta in reverse order.

38 / 66
Multi-Version Concurrency Control Version Storage

Delta Storage

• On every update, copy only the values that were modified to the delta storage and
overwrite the master version.
• Txns can recreate old versions by applying the delta in reverse order.

39 / 66
Multi-Version Concurrency Control Version Storage

Delta Storage

• On every update, copy only the values that were modified to the delta storage and
overwrite the master version.
• Txns can recreate old versions by applying the delta in reverse order.

40 / 66
Multi-Version Concurrency Control Garbage Collection

Garbage Collection

41 / 66
Multi-Version Concurrency Control Garbage Collection

Garbage Collection

• The DBMS needs to remove reclaimable physical versions from the database over
time.
▶ No active txn in the DBMS can see that version (SI).
▶ The version was created by an aborted txn.
• Two additional design decisions:
▶ How to look for expired versions?
▶ How to decide when it is safe to reclaim memory?

42 / 66
Multi-Version Concurrency Control Garbage Collection

Garbage Collection

• Approach 1: Tuple-level
▶ Find old versions by examining tuples directly.
▶ Background Vacuuming vs. Cooperative Cleaning
• Approach 2: Transaction-level
▶ Txns keep track of their old versions so the DBMS does not have to scan tuples to
determine visibility.

43 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

• Background Vacuuming:
• Separate thread(s) periodically scan the table and look for reclaimable versions.
• Works with any storage.

44 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

45 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

46 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

47 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

48 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

• Cooperative Cleaning:
• Worker threads identify reclaimable versions as they traverse version chain.
• Only works with O2N.

49 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

50 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

51 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

52 / 66
Multi-Version Concurrency Control Garbage Collection

Tuple-level GC

53 / 66
Multi-Version Concurrency Control Garbage Collection

Transaction-level GC

• Each txn keeps track of its read/write set.


• The DBMS determines when all versions created by a finished txn are no longer visible.

54 / 66
Multi-Version Concurrency Control Index Management

Index Management

55 / 66
Multi-Version Concurrency Control Index Management

Index Management

• Primary key indexes point to version chain head.


▶ How often the DBMS has to update the pkey index depends on whether the system
creates new versions when a tuple is updated.
▶ If a txn updates a tuple’s pkey attribute(s), then this is treated as an DELETE followed by
an INSERT.
• Secondary indexes are more complicated. . .

56 / 66
Multi-Version Concurrency Control Index Management

Secondary Indexes

• Approach 1: Physical Pointers


▶ Use the physical address to the version chain head.
• Approach 2: Logical Pointers
▶ Use a fixed identifier per tuple that does not change.
▶ Requires an extra indirection layer.
▶ Primary Key vs. Tuple Id

57 / 66
Multi-Version Concurrency Control Index Management

Physical Pointers

58 / 66
Multi-Version Concurrency Control Index Management

Physical Pointers

59 / 66
Multi-Version Concurrency Control Index Management

Physical Pointers

60 / 66
Multi-Version Concurrency Control Index Management

Physical Pointers

61 / 66
Multi-Version Concurrency Control Index Management

Logical Pointers

62 / 66
Multi-Version Concurrency Control Index Management

Logical Pointers

63 / 66
Multi-Version Concurrency Control Index Management

MVCC Implementations

DBMS Protocol Version Storage Garbage Collection Indexes


Oracle MV2PL Delta Vacuum Logical
Postgres MV-2PL/MV-TO Append-Only Vacuum Physical
MySQL-InnoDB MV-2PL Delta Vacuum Logical
HYRISE MV-OCC Append-Only – Physical
Hekaton MV-OCC Append-Only Cooperative Physical
MemSQL MV-OCC Append-Only Vacuum Physical
SAP HANA MV-2PL Time-travel Hybrid Logical
NuoDB MV-2PL Append-Only Vacuum Logical
HyPer MV-OCC Delta Txn-level Logical

64 / 66
Multi-Version Concurrency Control Index Management

Conclusion

• MVCC is the widely used scheme in DBMSs.


• Even systems that do not support multi-statement txns (e.g., NoSQL) use it.

65 / 66
Multi-Version Concurrency Control Index Management

Next Class

• Advanced topics in Concurrency Control

66 / 66

You might also like