Data RecoveRy & PRotection
1. Transaction Recovery
Transaction Recovery is crucial in databases to maintain data
consistency and integrity. When a transaction—a series of
database operations executed as a single unit of work—fails
due to errors, power failures, or system crashes, recovery
ensures that the database is restored to a consistent state.
Transactions follow the ACID properties:
• Atomicity: Each transaction is an indivisible unit, where
either all operations succeed or none do.
• Consistency: A transaction moves the database from
one valid state to another.
• Isolation: Intermediate states of a transaction are not
visible to other transactions.
• Durability: Committed transactions persist even if the
system crashes.
2. System Recovery
System Recovery deals with recovering the entire system
after a crash or failure. It involves restoring the database to
its most recent consistent state, allowing it to resume regular
operations. Common techniques include:
• Checkpointing: Regularly saving the state of the
database in logs. During recovery, the system replays or
rolls back transactions from the last checkpoint.
• Log-Based Recovery: Keeping a record of all transactions
in logs (Write-Ahead Logging or WAL). In case of failure,
the log helps recover completed and uncompleted
transactions.
3. Media Recovery
Media Recovery focuses on restoring data lost due to
physical damage to the storage media (e.g., hard disk failure).
It involves using backups and logs to restore the lost data.
Techniques include:
• Full Backups: Complete copies of the database taken
periodically.
• Incremental Backups: Only changes since the last
backup are saved, requiring less storage.
• Redo Logs: Replaying logged operations from a backup
to reconstruct the database.
4. Concurrency Control Techniques
Concurrency control ensures that multiple transactions can
occur simultaneously without conflicting with each other. It
preserves data consistency while allowing parallel access to
the database.
Key Techniques:
• Locking: Prevents multiple transactions from accessing
the same data simultaneously in ways that could cause
conflicts.
• Timestamp Ordering: Assigns timestamps to each
transaction, ensuring transactions occur in a strict order.
• Multiversion Concurrency Control (MVCC): Keeps
multiple versions of data to handle read and write
operations simultaneously.
5. Locking
Locking is the most common concurrency control technique.
It prevents conflicting access to data by "locking" data items
for a particular transaction. Types of locks include:
• Shared Lock (S): Allows multiple transactions to read a
data item but prevents them from modifying it.
• Exclusive Lock (X): Restricts a data item so only one
transaction can read or write it.
Locks help prevent conflicts but may lead to a deadlock if not
managed carefully.
6. Deadlock
Deadlock occurs when two or more transactions hold locks
and wait for each other’s resources indefinitely. For example,
Transaction A holds a lock on Resource X and waits for
Resource Y, while Transaction B holds a lock on Resource Y
and waits for Resource X. This cyclic waiting state prevents
both transactions from progressing.
Deadlock Handling Strategies:
1. Deadlock Prevention: Avoiding conditions that lead to
deadlock.
2. Deadlock Detection: Monitoring transactions and
detecting deadlock cycles.
3. Deadlock Resolution: Aborting or rolling back one
or more transactions to break the cycle.
7. Serializability
Serializability is the concept of arranging transactions in a
sequence that ensures the outcome is the same as if
transactions were executed serially, one after the other.
Ensuring serializability prevents conflicts and maintains data
consistency.
Types of Serializability:
1. Conflict Serializability: Transactions do not conflict with
each other based on read/write operations.
2. View Serializability: More relaxed than conflict
serializability, it ensures the final state is consistent,
even if the intermediate operations differ.
8. Database Security: Introduction
Database Security refers to protecting databases from
unauthorized access, threats, and attacks, ensuring data
privacy, integrity, and availability. Database security
encompasses a range of measures, from user authentication
to encryption and access control policies.
Key Elements of Database Security:
• Authentication: Verifying the identity of users before
they access the database.
• Authorization: Assigning permissions to users based on
their roles.
• Encryption: Encrypting sensitive data to protect it from
unauthorized access.
• Access Control: Implementing rules to restrict access to
specific data and actions.