Module 4 -- Transaction Concurrency -- DBMS


Index

  1. Transaction Concurrency
  2. ACID properties of Transaction
  3. Problems of Concurrency (Parallel Scheduling)
  4. Serializability of Schedules
  5. Conflict Serializability: Finding the conflict equivalent schedule for a parallel schedule
  6. Conflict Serializability: Using precedence graphs to determine whether a schedule is conflict serializable or not
  7. View Serializability
  8. Pessimistic Concurrency Control (PCC) -- Basic overview
  9. 1. Shared-Exclusive Locking Protocol
  10. Drawbacks of Shared-Exclusive Locking
  11. 2. 2-Phase Locking (2PL)
  12. Drawbacks of 2PL
  13. Strict 2PL and Rigorous 2PL
  14. Timestamp Ordering Protocol (Pessimistic Concurrency Control)
  15. Optimistic Concurrency Control (OCC)
  16. 1. Timestamp Ordering (OCC) version
  17. 2. Multi-Version Concurrency Control (MVCC)
  18. Database Recovery
  19. Immediate vs Deferred Update in Database Recovery

Transaction Concurrency

https://www.youtube.com/watch?v=t5hsV9lC1rU&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=74

1. Definition of a Transaction

A transaction is a sequence of database operations (reads and writes) that together perform one logical unit of work, for example transferring money from one account to another. This sequence of operations forms one transaction.


2. Core Operations in a Database Transaction


3. Why Transactions Are Important in DBMS


4. Summarizing the Process of a Transaction

  1. Read Data:
    The necessary data is read from the hard disk into RAM.

  2. Perform Operations:
    The transaction performs its logical unit of work (e.g., deducting from one account, adding to another). These operations are processed in RAM for speed.

  3. Write Data:
    The updated values are written back, but only to the buffer in RAM; the copy on disk is not yet changed.

  4. Commit:
    Finally, the commit operation saves the changes permanently to the hard disk.

  5. (Optional) Rollback:
    If any error occurs during the process, a rollback can revert the changes to maintain data consistency.
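To make the lifecycle concrete, here is a minimal sketch in Python, using a dict as a stand-in for the disk and a local buffer as the RAM copy (the account names and amounts are invented for illustration):

```python
# Toy illustration of read -> operate -> write -> commit/rollback.
# "disk" and the account names are hypothetical stand-ins, not a DBMS API.
disk = {"A": 500, "B": 200}  # durable storage

def transfer(amount):
    ram = {k: disk[k] for k in ("A", "B")}  # 1. Read data from disk into RAM
    try:
        if ram["A"] < amount:
            raise ValueError("insufficient funds")
        ram["A"] -= amount                  # 2. Perform operations in RAM
        ram["B"] += amount                  # 3. Updated values stay in RAM
        disk.update(ram)                    # 4. Commit: persist the changes
    except ValueError:
        pass                                # 5. Rollback: discard the RAM buffer

transfer(100)
print(disk)  # {'A': 400, 'B': 300}
```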


ACID properties of Transaction

https://www.youtube.com/watch?v=-GS0OxFJsYQ&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=75

1. Atomicity


2. Consistency


3. Isolation


4. Durability


Conclusion

The ACID properties ensure that database transactions are processed reliably and safely:

These principles are fundamental in DBMS design and are crucial for systems like banking applications, where accuracy and reliability of transactions are paramount.


States of a transaction

https://www.youtube.com/watch?v=ObwYFVLB_VI&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=76

Pasted image 20250315142406.png

1. Transaction States Overview


2. Active State


3. Partially Committed State


4. Commit State


5. Termination (Deallocation)


6. Failed (Abort) State and Rollback


Conclusion


Scheduling in DBMS

https://www.youtube.com/watch?v=1cbmhsSJRWc&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=77

1. Definition of a Transaction Schedule


2. Types of Schedules

The video outlines two main types of schedules: serial schedules, in which transactions execute one after another, and parallel (interleaved) schedules, in which their operations overlap in time.


3. Real-World Analogies and Examples


4. Throughput and Performance


Problems of Concurrency (Parallel Scheduling)

1. Dirty Read (Read After Write Problem)

https://www.youtube.com/watch?v=oP8yLMjmszE&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=78

Pasted image 20250315154448.png

A more detailed video on Dirty Read: https://www.youtube.com/watch?v=r_oI0jLqlVY&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=79


2. Incorrect Summary

Pasted image 20250315154556.png


3. Lost Update

Pasted image 20250315155534.png


4. Unrepeatable Read

Pasted image 20250315155643.png

A better video on unrepeatable read: https://www.youtube.com/watch?v=vwIeKYGXmjI&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=80


5. Phantom Read

Pasted image 20250315155709.png


Serializability of Schedules

https://www.youtube.com/watch?v=s8QlJoL1G6w&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=83

1. What Is Serializability?


2. Serial Schedules vs. Parallel Schedules


3. Determining Serializability


4. Key Takeaways


Conflict Serializability: Finding the Conflict Equivalent Schedule for a parallel schedule.

https://www.youtube.com/watch?v=ckqDozxECp0&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=84

1. Conflict Equivalence Concept


✅ Definitions You Need to Know First

🔹 Schedule

A sequence of operations (read/write) from multiple transactions (like T1, T2...).

🔹 Conflict Operations

Two operations conflict if:

  1. They are from different transactions, AND
  2. They access the same data item, AND
  3. At least one of them is a write.

So, these are the conflict pairs (always on the same data item): R-W, W-R, and W-W.

These cannot be swapped.

🔹 Non-conflict pairs

Pairs in which both operations are reads, or in which the operations touch different data items. These can be swapped safely in the schedule.


🔁 How to Check Conflict Equivalence

You’re usually given two schedules, S and S′, built from the same transactions.

Steps:

  1. Identify conflicting and non-conflicting pairs.
  2. Start from the schedule S.
  3. Find adjacent non-conflict pairs.
  4. Swap them, one by one, to reorder S into S′.
  5. If you can transform S into S′ using only non-conflicting adjacent swaps, then they are conflict equivalent.

⚠️ Do not swap conflict pairs, or operations will become inconsistent.
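The swapping procedure has a mechanical shortcut: two schedules are conflict equivalent exactly when every conflicting pair of operations appears in the same relative order in both. A minimal sketch in Python, assuming operations are encoded as (transaction, action, item) tuples and no tuple repeats:

```python
# Sketch: test conflict equivalence by comparing the order of conflicting pairs.
def conflicts(op1, op2):
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    # different transactions, same data item, at least one write
    return t1 != t2 and x1 == x2 and "W" in (a1, a2)

def conflict_equivalent(s, s_prime):
    if sorted(s) != sorted(s_prime):
        return False  # must contain exactly the same operations
    for i in range(len(s)):
        for j in range(i + 1, len(s)):
            if conflicts(s[i], s[j]):
                # a conflicting pair must keep its order in s_prime
                # (assumes each operation tuple appears only once)
                if s_prime.index(s[i]) > s_prime.index(s[j]):
                    return False
    return True

S       = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "B")]
S_prime = [("T1", "R", "A"), ("T2", "R", "B"), ("T1", "W", "A")]
print(conflict_equivalent(S, S_prime))  # True
```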


Example Scenario (as per the video)

Let's say we have a schedule S

T1: Read(A), Write(A)
T2: Read(B)

And a schedule S':

T2: Read(B)
T1: Read(A), Write(A)

These look different in order, so the goal is to transform S into S′ by swapping only non-conflicting adjacent operations.

Step 1: Identify the conflict and non-conflict pairs.

Now we know that conflict pairs are:

Conflict vs. Non-Conflict Pairs

| Pair | Same data item? | Conflict? | Why? |
|---|---|---|---|
| R(A), R(A) | Yes | ❌ No conflict | Both are just reading |
| R(A), W(A) | Yes | ✅ Conflict | A read followed by a write that changes the value |
| W(A), R(A) | Yes | ✅ Conflict | The value is read after modification |
| W(A), W(A) | Yes | ✅ Conflict | The second write overwrites the first |
| Any operations on different data items | No | ❌ No conflict | No interference |

Step 2: Swap the pairs step by step till you achieve your target schedule

So the flattened version of our original schedule S:

R1(A), W1(A), R2(B)

And our target schedule S':

R1(A), R2(B), W1(A)

Now we can just swap W1(A) with R2(B).

But first we need to check (even though it's obvious in this scenario) whether they are a non-conflict pair: W1(A) and R2(B) come from different transactions, but they operate on different data items (A and B), so they do not conflict.

So we can swap the two operations.

Now new order:

R1(A), R2(B), W1(A)

This matches our target schedule S'.


✅ Final Conclusion:

Since we were able to convert S into S' using only adjacent non-conflicting swaps, they are:

Conflict Equivalent Schedules


🧠 Key Takeaways


Conflict serializability: Using Precedence graphs to determine whether a schedule is conflict serializable or not

https://www.youtube.com/watch?v=zv0ba0Iok1Y&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=85

Conflict serializability checks whether a schedule (sequence of operations from different transactions) can be rearranged into a serial schedule without changing the outcome.

Quick recap from graph theory, what is a Cycle in a graph and how to spot them?

A cycle in a graph is a path that starts and ends at the same vertex, and contains at least one other vertex. In simpler terms, it's a closed loop within the graph.

Pasted image 20250519224855.png


🛠️ Steps to Check Conflict Serializability

Let’s break it down step-by-step:

Step 1: Identify Transactions

Step 2: Draw Nodes

Step 3: Add Directed Edges for Conflicts

Check conflicting operations between different transactions only (not within the same transaction). Conflicts arise when:

  1. Read(X) → Write(X) (RW)
  2. Write(X) → Read(X) (WR)
  3. Write(X) → Write(X) (WW)

Important: These operations must access the same data item (like X, Y, Z).

Draw an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj.

Step 4: Detect Cycles

If the precedence graph contains no cycle, the schedule is conflict serializable, and any topological order of the graph gives an equivalent serial schedule. If there is a cycle, it is not conflict serializable.


Example

Let's say we are given this schedule which contains a bunch of transactions

T1: R(X)      → Read of X  
T2: R(Y)      → Read of Y  
T3: R(X)      → Read of X  
T1: R(Y)      → Read of Y  
T2: R(Z)      → Read of Z  
T3: W(Y)      → Write of Y  
T1: W(Z)      → Write of Z  
T2: W(Z)      → Write of Z

Step 1: Identify transactions

We see operations from three transactions: T1, T2, and T3.

So we’ll have 3 nodes in the graph:

T1 T2 T3

Step 2: Check for conflicts and add edges

Let’s go line-by-line and check for conflicts.

🔍 Operation 1: T1: R(X)

Look for: a later Write(X) by a different transaction. No transaction ever writes X, so no edge is added.


🔍 Operation 2: T2: R(Y)

Look for: a later Write(Y) by a different transaction. T3 writes Y (operation 6) → add edge T2 → T3.

```mermaid
flowchart TD;
	T2-->T3
```

🔍 Operation 3: T3: R(X)

Look for: a later Write(X) by a different transaction. There is none, so no edge is added.


🔍 Operation 4: T1: R(Y)

Look for: a later Write(Y) by a different transaction. T3 writes Y (operation 6) → add edge T1 → T3.

```mermaid
flowchart TD;
	T2-->T3
	T1-->T3
```

🔍 Operation 5: T2: R(Z)

Look for: a later Write(Z) by a different transaction. T1 writes Z (operation 7) → add edge T2 → T1.

```mermaid
flowchart TD;
	T2-->T3
	T1-->T3
	T2-->T1
```

🔍 Operation 6: T3: W(Y)

Look for: a later Read(Y) or Write(Y) by a different transaction. There is none after this point, so no edge is added.


🔍 Operation 7: T1: W(Z)

Look for: a later Read(Z) or Write(Z) by a different transaction. T2 writes Z (operation 8) → add edge T1 → T2.

```mermaid
flowchart TD;
	T2-->T3
	T1-->T3
	T2-->T1
	T1-->T2
```

🔍 Operation 8: T2: W(Z)

Look for: a later Read(Z) or Write(Z) by a different transaction. This is the last operation, so no edge is added.


So, the final precedence graph is:

```mermaid
flowchart TD;
	T2-->T3
	T1-->T3
	T2-->T1
	T1-->T2
```

Step 3: Check for Cycles

To be conflict serializable, the graph must be acyclic (no cycles).

Let's trace: W1(Z) before W2(Z) gives T1 → T2, and R2(Z) before W1(Z) gives T2 → T1.

⚠️ There is a cycle between T1 and T2 (T1 → T2 → T1).
❌ So the schedule is NOT conflict serializable.


✅ Summary of Steps:

| Step | Action |
|---|---|
| 1 | Identify all transactions (T1, T2, T3) |
| 2 | Check each operation for conflicts (RW, WR, WW) with later operations |
| 3 | Add an edge from the first transaction → the second for every conflict |
| 4 | Build the precedence graph |
| 5 | Detect cycles in the graph |
| 6 | If a cycle exists → not conflict serializable; else → serializable |
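Since the whole procedure is mechanical, it scripts easily. A sketch in Python, run on the example schedule above (the tuple encoding is the same illustrative one used earlier):

```python
from itertools import combinations

# Schedule from the example: (transaction, action, item), in execution order.
schedule = [
    ("T1", "R", "X"), ("T2", "R", "Y"), ("T3", "R", "X"), ("T1", "R", "Y"),
    ("T2", "R", "Z"), ("T3", "W", "Y"), ("T1", "W", "Z"), ("T2", "W", "Z"),
]

# Steps 2-3: add edge Ti -> Tj for every conflicting pair (same item,
# different transactions, at least one write, Ti's operation comes first).
edges = set()
for (t1, a1, x1), (t2, a2, x2) in combinations(schedule, 2):
    if t1 != t2 and x1 == x2 and "W" in (a1, a2):
        edges.add((t1, t2))

# Steps 4-5: detect a cycle with depth-first search.
def has_cycle(nodes, edges):
    adj = {n: [b for a, b in edges if a == n] for n in nodes}
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}
    def dfs(n):
        color[n] = GRAY
        for m in adj[n]:
            if color[m] == GRAY or (color[m] == WHITE and dfs(m)):
                return True  # back edge found -> cycle
        color[n] = BLACK
        return False
    return any(color[n] == WHITE and dfs(n) for n in nodes)

nodes = {t for t, _, _ in schedule}
print(sorted(edges))            # [('T1','T2'), ('T1','T3'), ('T2','T1'), ('T2','T3')]
print(has_cycle(nodes, edges))  # True -> not conflict serializable
```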

View Serializability

https://www.youtube.com/watch?v=8LKM_RWeroM&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=86

Let's first recap conflict serializability to understand why view serializability is important and why it may hold in cases where conflict serializability fails.

🔁 Part 1: Conflict Serializability Recap

Schedule S has 3 transactions: T1, T2, and T3.

These transactions have operations like Read(A) and Write(A) on the same variable A.

Conflict Serializability Check:

To check conflict serializability, we:

  1. Identify conflicting pairs:

    • Read(A) conflicts with Write(A)
    • Write(A) conflicts with both Read(A) and Write(A)
  2. Draw a Precedence Graph:

    • Nodes: T1, T2, T3
    • Directed edges represent the order imposed by conflicts.
  3. Cycle in the graph?

    • ✅ If NO cycles → schedule is conflict serializable.
    • ❌ If a cycle exists → not conflict serializable.

In the video, there's a cycle between T1 and T2 (T1 → T2 → T1), so it's not conflict serializable.


🧠 Part 2: So... Is the schedule serializable at all?

Important clarification:

Just because a schedule is not conflict serializable, doesn’t mean it’s not serializable.

That’s where View Serializability comes in.


👁️ Part 3: View Serializability Explained

The idea of View Serializability is looser than Conflict Serializability. It checks whether two schedules agree on three things:

  1. Initial reads: a transaction that reads the initial value of a data item in one schedule also reads the initial value in the other.
  2. Read-from relationships: if Ti reads a value written by Tj in one schedule, it does so in the other as well.
  3. Final writes: the last write on each data item is made by the same transaction in both schedules.



📊 Concrete Example in the Video:

Let's walk through the data operations:

Initial value of A = 100

Now in the rearranged schedule:

➡️ In both schedules, final value of A is 0
➡️ The final write is by T3 in both
➡️ Read-from relationships and final effects are preserved

So even though the conflict graph has a cycle, the two schedules are view equivalent.


Example 2

We'll work with three transactions, T1, T2, and T3, all operating on the same data item A.

Let's say the schedule S is:

T1: R(A)
T2:       W(A)
T1:             W(A)
T3:                  W(A)

or better:

| Step | Operation |
|---|---|
| 1 | T1: R(A) |
| 2 | T2: W(A) |
| 3 | T1: W(A) |
| 4 | T3: W(A) |

Step 1: Check Initial Reads

In S, T1 reads A before anyone has written it, so T1 reads the initial value. In the serial schedule T1 → T2 → T3, T1 also reads the initial value. ✅

Step 2: Read-from Matches

R1(A) is the only read in the schedule, and it reads the initial value of A in both S and the serial schedule, so the read-from relationships match. ✅

Step 3: Final Write Match

In S, the last write on A is W3(A); in T1 → T2 → T3 the final write is also by T3. ✅

So we can find at least one serial schedule (T1 → T2 → T3) that matches this schedule on initial reads, read-from relationships, and the final writer.

This schedule still has conflicts (R1(A) before W2(A) gives the edge T1 → T2, and W2(A) before W1(A) gives T2 → T1), which create a cycle in the precedence graph, making it not conflict serializable. It is nevertheless view serializable, because all three view-equivalence conditions hold.


✅ Key Summary:

| Concept | Meaning | This Schedule |
|---|---|---|
| Conflict Serializable | No cycle in the precedence graph | ❌ Not conflict serializable (has a cycle) |
| View Serializable | Same initial reads, read-from relationships, and final writes | ✅ Yes, view serializable |
| Final Write | Same final writer (T3) in S and in the serial order | ✅ Yes |

🎯 Why Use View Serializability?

Because conflict serializability is stricter than necessary.
In some cases (like this one), a schedule is logically correct and consistent, even though conflict rules say no.

Hence, View Serializability allows us to accept more schedules that are still safe and correct.

🔁 TL;DR

If conflict serializability fails (due to cycles), view serializability is the next step to check. It’s more powerful but harder to implement in practice. The key idea is to make sure that the “story” of what happened to the data is the same, even if the order of events is different.
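The three view-equivalence conditions can also be checked mechanically. A rough sketch in Python, assuming the same (transaction, action, item) encoding as earlier; it compares two given schedules directly and leaves enumerating all serial orders to the caller:

```python
from collections import Counter

def view_equivalent(s1, s2):
    """Check the view-equivalence conditions between two schedules.
    A read-from entry of None means 'reads the initial value', so the
    initial-read condition is folded into the read-from comparison."""
    def reads_from(sched):
        last_writer, result = {}, []
        for t, a, x in sched:
            if a == "R":
                result.append((t, x, last_writer.get(x)))
            else:
                last_writer[x] = t
        return Counter(result)

    def final_writes(sched):
        return {x: t for t, a, x in sched if a == "W"}

    return reads_from(s1) == reads_from(s2) and final_writes(s1) == final_writes(s2)

# Example 2 above: S is view equivalent to the serial order T1 -> T2 -> T3.
S      = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A"), ("T3", "W", "A")]
print(view_equivalent(S, serial))  # True
```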


Pessimistic Concurrency Control (PCC) -- Basic overview

These parts will be applicable in distributed systems module 3 as well.

Pasted image 20250523132946.png

https://www.geeksforgeeks.org/concurrency-control-in-distributed-transactions/

Pessimistic concurrency control mechanisms proceed on the assumption that many transactions will try to access the same resource simultaneously. The idea is to prevent conflicting concurrent access to a shared resource by requiring a transaction to acquire a lock on a data item before performing any operation on it.

Algorithms that fall under this protocol:

1. Shared-Exclusive Locking Protocol

https://www.youtube.com/watch?v=94C0V7f2zm4&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=87

This is one of the basic concurrency control mechanisms in databases. Its main purpose is to let many transactions read a data item concurrently while giving a writing transaction exclusive access.

🧠 Two Types of Locks:

  1. Shared Lock (S-lock):

    • Allows only reading.
    • Multiple transactions can hold a shared lock on the same data at the same time.
    • No changes can be made to the data.
  2. Exclusive Lock (X-lock):

    • Allows both reading and writing.
    • Only one transaction can hold an exclusive lock on a data item.
    • No other transaction can access the data in any form while this lock is held.

When to Use Which?

Examples:


Compatibility Table

This table determines whether a new lock request is allowed based on the currently held lock on the data item.

| Currently Held (Granted) | Requested Lock | Allowed? |
|---|---|---|
| Shared | Shared | ✅ Yes |
| Shared | Exclusive | ❌ No |
| Exclusive | Shared | ❌ No |
| Exclusive | Exclusive | ❌ No |

🔎 Why?

Two shared locks can coexist because concurrent readers cannot interfere with each other. Any combination involving an exclusive lock is refused: a reader might otherwise see a half-finished write, or two writers might overwrite each other.


Lock Lifecycle

For each transaction:

  1. Lock the required data item(s) using S or X.
  2. Perform operations (read/write).
  3. Unlock when done.

This is typically implemented with the help of a lock manager inside the database that follows the compatibility rules.
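A toy lock manager that enforces this compatibility table might look like the following sketch (bookkeeping only; a real lock manager would queue or block the refused requester):

```python
# Toy lock table enforcing shared/exclusive compatibility. Illustrative only.
class LockManager:
    def __init__(self):
        self.locks = {}  # item -> (mode, set of holding transactions)

    def request(self, txn, item, mode):
        """mode is 'S' or 'X'. Returns True if granted, False if it must wait."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if mode == "S" and held_mode == "S":  # Shared + Shared: compatible
            holders.add(txn)
            return True
        return False  # any combination involving an exclusive lock conflicts

    def release(self, txn, item):
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]

lm = LockManager()
print(lm.request("T1", "A", "S"))  # True  (S granted)
print(lm.request("T2", "A", "S"))  # True  (S + S is allowed)
print(lm.request("T3", "A", "X"))  # False (X must wait for both readers)
```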


Goal of the protocol


Drawbacks of Shared-Exclusive Locking

https://www.youtube.com/watch?v=UsqtDD1VriY&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=88

🔒 Shared-Exclusive Locking: Quick Recap


1. May Not Guarantee Serializability

"May not be sufficient to produce only serializable schedules."


2. May Not Ensure Recoverability (Irrecoverable Schedules Possible)

"May not be free from irrecoverability."


3. Possibility of Deadlock

"May not be free from deadlock."


4. Possibility of Starvation

"May not be free from starvation."


Summary of All Key Drawbacks

| Drawback | Description |
|---|---|
| ❌ Not always serializable | Locking alone doesn't guarantee serializable schedules |
| ❌ Irrecoverable schedules | Transactions may commit after reading dirty data, breaking consistency |
| ⚠️ Deadlock possible | Circular waiting on locks causes indefinite blocking |
| ⚠️ Starvation possible | Some transactions may never get the lock if others keep acquiring it |

2. 2-Phase Locking (2PL)

https://www.youtube.com/watch?v=1pUaEDNLWi4&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=89

In simpler locking protocols (like Shared & Exclusive locks), inconsistencies or non-serializable schedules may occur due to poor coordination. 2PL adds structure to how and when locks are acquired or released during a transaction to avoid such problems.


🧩 2 Phases in 2PL

  1. 🔼 Growing Phase

    • You can acquire (take) locks (Shared or Exclusive).
    • You cannot release any locks.
    • Think of this as the “gather all resources you need” phase.
  2. 🔽 Shrinking Phase

    • You release locks.
    • You are not allowed to acquire any new locks.
    • Once a transaction releases its first lock, it enters this phase, and cannot go back.

🔁 The transition from growing to shrinking happens the moment a transaction releases any lock.


✅ How Does 2PL Achieve Serializability?

Let's take an example: T1 and T2 both need to update the same data item A.

In 2PL: T1 acquires the exclusive lock on A first, so T2's request for the same lock must wait until T1 begins releasing locks.

This creates a natural serial order — since T2 had to wait, it can be considered to have happened after T1.

So:
Order of lock acquisition determines order of transaction execution, even if they're overlapping in real time.
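A sketch of how a transaction object could enforce the two phases, reusing the toy LockManager from the shared-exclusive section (the single shrinking flag is the whole protocol: once any lock is released, no further acquisition is allowed):

```python
# Sketch: enforcing the growing/shrinking phases of 2PL.
class TwoPLTransaction:
    def __init__(self, name, lock_manager):
        self.name, self.lm = name, lock_manager
        self.held = set()
        self.shrinking = False  # flips to True at the first release

    def lock(self, item, mode):
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot acquire after releasing")
        if not self.lm.request(self.name, item, mode):
            raise RuntimeError("would block: lock held by another transaction")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True   # growing phase ends, permanently
        self.lm.release(self.name, item)
        self.held.discard(item)

t1 = TwoPLTransaction("T1", LockManager())
t1.lock("A", "X")  # growing phase
t1.lock("B", "S")  # still growing
t1.unlock("A")     # first release -> shrinking phase begins
try:
    t1.lock("C", "S")
except RuntimeError as e:
    print(e)  # 2PL violation: cannot acquire after releasing
```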


🧠 Key Concepts You Must Remember

| Term | Explanation |
|---|---|
| Growing Phase | Only lock acquisition allowed |
| Shrinking Phase | Only lock release allowed |
| Lock-Point | The point where a transaction acquires its last lock (or equivalently, where it releases its first lock); helps determine the serialization order |
| Compatibility Table | Dictates whether two locks can coexist, e.g. Shared-Shared = ✅, Shared-Exclusive = ❌ |

⚙ Example Breakdown

Let’s say:

What happens?


🛑 Common Misconceptions Cleared


Drawbacks of 2PL

1. Irrecoverable Schedules

🔍 What it means:

A schedule is recoverable if a transaction T2 that reads data written by T1 commits only after T1 commits.

❌ Problem:

In 2PL, there is no rule that enforces this ordering of commits. So T2 may read a value written by T1 and then commit before T1 does; if T1 subsequently aborts, you cannot roll back T2 even though it read uncommitted/dirty data. This leads to an irrecoverable schedule, which violates atomicity.


2. Cascading Rollbacks

🔍 What it means:

A cascading rollback happens when a failure in one transaction forces multiple other transactions to roll back, due to them having read its data.

❌ Problem:

2PL allows this to happen: T2 can read T1's uncommitted write as soon as T1 releases its lock; if T1 later aborts, T2 must roll back, and any transaction that read from T2 must roll back as well.

This leads to a chain of rollbacks, which hurts performance and consistency.

💡 Even though 2PL controls how locks are acquired/released, it doesn’t stop reads from uncommitted writes, which causes the issue.


3. Deadlocks

🔍 What it means:

A deadlock occurs when two or more transactions wait for each other indefinitely to release locks.

❌ Problem:

2PL doesn't avoid deadlocks. For example: T1 locks A and then requests B, while T2 locks B and then requests A.

Now both transactions are stuck in a circular wait, and neither can proceed.

Deadlocks are a classic problem in locking systems and need separate detection and recovery mechanisms, like wait-for-graph detection, lock timeouts, or timestamp-based schemes such as Wait-Die and Wound-Wait.


4. Starvation

🔍 What it means:

Starvation happens when a transaction never gets a chance to execute because other transactions keep overtaking it.

❌ Problem:

Under 2PL, a transaction can be perpetually delayed in acquiring a lock if other transactions keep arriving and being granted the lock first (for example, a steady stream of shared-lock holders starving a writer that needs an exclusive lock).

This leads to starvation of long-waiting transactions.


🔄 Summary of Drawbacks:

| Issue | Possible in 2PL? | Why? |
|---|---|---|
| Irrecoverability | ✅ Yes | No commit-order enforcement |
| Cascading rollbacks | ✅ Yes | No control on dirty reads |
| Deadlocks | ✅ Yes | No deadlock avoidance |
| Starvation | ✅ Yes | No fairness policy |

✅ How to fix them?

To address these problems, stronger protocols can be used, such as:

  1. Strict 2PL:

    • All exclusive locks are held until commit/abort
    • Prevents cascading rollbacks
    • Guarantees recoverable and cascade-less schedules
  2. Rigorous 2PL:

    • All locks (shared and exclusive) are held till commit
    • Ensures both recoverability and serializability
  3. Timestamp Ordering / Optimistic Concurrency:

    • To avoid deadlocks entirely (no locks involved)

So, shared and exclusive locks gave us the locking primitives, and 2PL added discipline about when they may be acquired and released. Strict and Rigorous 2PL, covered next, tighten that discipline further.


Strict 2PL and Rigorous 2PL

Both these protocols build on the basics 2PL protocol and add their own rules on top.

🔐 Strict 2-Phase Locking (Strict 2PL)

🧱 Base Foundation:

Strict 2PL follows the basic 2PL rules: a growing phase in which locks are acquired, then a shrinking phase in which they are released.

📌 Additional Rule in Strict 2PL:

All exclusive (write) locks are held until the transaction commits or aborts, and only then released.


🛠 What Problems It Solves:

  1. Cascading Rollback:

    • If another transaction reads a value written by an uncommitted transaction, and the writer aborts → the reader must also roll back (cascading).
    • Strict 2PL prevents this by not allowing reads on uncommitted data.
  2. Irrecoverability:

    • A committed transaction reading from a transaction that later aborts → makes rollback impossible.
    • Prevented because reads can't happen until the writer commits.

🧠 Insight:


🧱 Rigorous 2-Phase Locking (Rigorous 2PL)

🔒 Even More Restrictive:

Rigorous 2PL holds all locks, shared and exclusive, until the transaction commits or aborts.

💡 Key Difference from Strict 2PL:

Strict 2PL retains only exclusive locks until commit; Rigorous 2PL retains shared locks as well.

✅ What It Guarantees:

Recoverable, cascade-less schedules, with a serialization order that matches the commit order.


📊 Summary of Comparison:

| Property | Basic 2PL | Strict 2PL | Rigorous 2PL |
|---|---|---|---|
| Follows growing/shrinking phases? | ✅ Yes | ✅ Yes | ✅ Yes |
| Holds exclusive locks till commit? | ❌ No | ✅ Yes | ✅ Yes |
| Holds shared locks till commit? | ❌ No | ❌ No | ✅ Yes |
| Prevents cascading rollback? | ❌ No | ✅ Yes | ✅ Yes |
| Prevents irrecoverability? | ❌ No | ✅ Yes | ✅ Yes |
| Strictness | Least strict | Moderate | Most strict |
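In code terms, the difference from the basic 2PL sketch earlier is only where releases happen: under Rigorous 2PL the transaction never calls unlock itself, and a commit (or abort) releases everything at once. A rough sketch:

```python
# Sketch: Rigorous 2PL on top of the earlier TwoPLTransaction.
# Strict 2PL would be the same idea, applied only to exclusive locks.
class Rigorous2PLTransaction(TwoPLTransaction):
    def unlock(self, item):
        raise RuntimeError("Rigorous 2PL: locks are released only at commit/abort")

    def commit(self):
        for item in list(self.held):          # release all locks together,
            self.lm.release(self.name, item)  # only after the commit point
        self.held.clear()
```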

🧱 What They Don't Solve:

Deadlocks and starvation are still possible; holding locks until commit can even increase blocking.


Timestamp Ordering Protocol (Pessimistic Concurrency Control)

What is the Timestamp Ordering Protocol?

Timestamp Ordering Protocol is a concurrency control method that ensures serializability by ordering transactions based on their timestamps.


🕒 Three Important Timestamps

For every data item (say A), we track:

  1. TS(Ti): Timestamp of transaction Ti.
  2. RTS(A): Read Timestamp of A → TS of the most recent transaction that successfully read A.
  3. WTS(A): Write Timestamp of A → TS of the most recent transaction that successfully wrote A.

🧠 Basic Principle

"Older transactions should not be affected by younger ones."

If a younger transaction tries to read/write something modified by an older transaction, it might be aborted to preserve the correct order of operations.


🔐 Rules of Timestamp Ordering Protocol

Let’s say transaction T wants to perform Read(A) or Write(A).

1. Read Rule

Transaction T with timestamp TS(T) wants to read A. If TS(T) < WTS(A), T is aborted; otherwise the read is allowed and RTS(A) is updated to max(RTS(A), TS(T)).

Why?
Because T is older, but A was written by a younger transaction, which violates the order.

Example:


2. Write Rule

Transaction T wants to write A. If TS(T) < RTS(A) or TS(T) < WTS(A), T is aborted; otherwise the write is allowed and WTS(A) is set to TS(T).

Example:

Because someone already read A (T1), and if T2 writes now, it will create a non-serializable schedule.


⚖️ Summary Table

| Operation | Condition | Action |
|---|---|---|
| Read(A) by T | TS(T) ≥ WTS(A) | ✅ Allow |
| Read(A) by T | TS(T) < WTS(A) | ❌ Abort T |
| Write(A) by T | TS(T) ≥ RTS(A) and TS(T) ≥ WTS(A) | ✅ Allow |
| Write(A) by T | TS(T) < RTS(A) or TS(T) < WTS(A) | ❌ Abort T |

🔁 What Happens on Abort?

When a transaction is aborted, it's restarted with a new timestamp (i.e., treated as a newer transaction). This may cause starvation, so modifications like Wait-Die or Wound-Wait are used to handle starvation (just like in 2PL).


✅ Advantages

  • No locks are used, so deadlocks cannot occur: transactions never wait, they are simply aborted and restarted on conflict.
  • Guarantees conflict-serializable schedules, ordered by timestamp.

❌ Disadvantages


A few examples on Timestamp Ordering to cement the understanding.

🔍 Quick Recap: Timestamp Ordering Protocol (TO Protocol)


Rules (Basic Timestamp Ordering):

When a transaction T wants to Read(X) or Write(X):

  1. Read(X) by Ti:

    • If TS(Ti) < write_TS(X): ❌ Abort Ti (you’re trying to read something already overwritten by a younger transaction — violates consistency).

    • Else: ✅ Allow the read, and update read_TS(X) = max(read_TS(X), TS(Ti))

  2. Write(X) by Ti:

    • If TS(Ti) < read_TS(X) or TS(Ti) < write_TS(X):
      Abort Ti (you're trying to write something based on outdated read or overwritten value).

    • Else: ✅ Allow the write, and update write_TS(X) = TS(Ti)
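These rules translate almost line-for-line into code. A minimal sketch for a single data item, where the state is just the two timestamps and an abort is signalled by returning False (the examples below can be checked against it):

```python
# Minimal sketch of basic timestamp ordering for one data item.
class DataItem:
    def __init__(self):
        self.read_ts = 0   # timestamp of the youngest successful reader
        self.write_ts = 0  # timestamp of the youngest successful writer

    def read(self, ts):
        if ts < self.write_ts:
            return False   # value already overwritten by a younger txn -> abort
        self.read_ts = max(self.read_ts, ts)
        return True

    def write(self, ts):
        if ts < self.read_ts or ts < self.write_ts:
            return False   # would invalidate a younger read or write -> abort
        self.write_ts = ts
        return True

# Example 2 below: T2 (TS = 10) reads A, then T1 (TS = 5) tries to write A.
A = DataItem()
print(A.read(10))  # True  -> read_TS(A) becomes 10
print(A.write(5))  # False -> T1 is aborted (5 < read_TS(A) = 10)
```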


🧠 Examples (Easy to Follow):

Let's say TS(T1) = 5 (older) and TS(T2) = 10 (younger).

Let's say the current state for data item A is: read_TS(A) = 0, write_TS(A) = 0.


Example 1: T1 reads A, then T2 writes A

T1: Read(A)
T2: Write(A)
  1. T1 reads A:

    • 5 ≥ write_TS(A) (0) → ✅ Allow.
    • Set read_TS(A) = 5.
  2. T2 writes A:

    • 10 ≥ read_TS(A) (5) and 10 ≥ write_TS(A) (0) → ✅ Allow.
    • Set write_TS(A) = 10.

Both succeed.


Example 2: T2 reads A, then T1 writes A

T2: Read(A)
T1: Write(A)

Initial state: read_TS(A) = 0, write_TS(A) = 0

  1. T2 reads A:

    • 10 ≥ write_TS(A) (0) → ✅ Allow.
    • Update read_TS(A) = 10.
  2. T1 writes A:

    • 5 < read_TS(A) (10) → ❌ Abort T1

T1 is aborted. Because it’s older and tries to write something read by a younger transaction → violates timestamp order.


Example 3: T2 wants to write A but T1 already wrote A

T1: Write(A)
T2: Write(A)

Initial: write_TS(A) = 0

  1. T1 writes A:

    • 5 ≥ 0 → ✅ Allow.
    • Update write_TS(A) = 5.
  2. T2 writes A:

    • 10 ≥ read_TS(A) (assume still 0), 10 ≥ write_TS(A) (5) → ✅ Allow.
    • Update write_TS(A) = 10.

Both succeed.


Example 4: Younger T2 tries to read A after older T1 wrote A

T1: Write(A)
T2: Read(A)
  1. T1 writes A → ✅ write_TS(A) = 5

  2. T2 reads A:

    • 10 ≥ write_TS(A) (5) → ✅ Allowed.
    • read_TS(A) = 10

Both succeed.


Example 5: An older transaction tries to write A after a younger one already wrote it

T1: Write(A)
T2: Write(A)

Assume write_TS(A) = 5 from T1, and that for this example T2's timestamp is 4 (making T2 older than T1).

Now T2 (TS = 4) tries to write A:

  • 4 < write_TS(A) (5) → ❌ Abort T2.

You can't let an older-timestamped transaction overwrite data already written by a younger one.


🔁 Summary Table:

| Case | Outcome |
|---|---|
| Old reads, then young writes | ✅ Allowed |
| Young reads, then old writes | ❌ Abort the old writer (see Example 2) |
| Old writes, then young reads | ✅ Allowed |
| Young writes, then old reads | ❌ Abort the old reader |
| Young writes, then old writes | ❌ Abort the old writer (its TS < write_TS) |
| Old writes, then young writes | ✅ Allowed |

Advantages of Pessimistic Concurrency Control


Disadvantages of Pessimistic Concurrency Control


Pessimistic Concurrency Control Methods (Basic gist)

1. Isolation Level

Isolation levels define the degree to which data in the database must be shielded from other transactions while it is being modified. If transaction T1 is operating on some data and another transaction T2 modifies it mid-operation, unwanted inconsistency problems arise.


2. Two-Phase Locking Protocol

We have already covered this before.


3. Distributed Lock Manager

A distributed lock manager is a critical component of a distributed transaction system: it coordinates lock acquisition and release across transactions, synchronizing them and their operations so that data integrity is maintained.

Pasted image 20250523125817.png


4. Multiple Granularity Lock

A lock can be acquired at various levels of granularity: table level, page level, row/record level, or the level of any other resource. A transaction can lock a whole table, or just the specific row it is changing. When transactions simultaneously take locks at these different levels, the scheme is called multiple granularity locking.


Optimistic Concurrency Control (OCC)

The problem with pessimistic concurrency control systems is that when a transaction acquires a lock on a resource, no other transaction can access it, which reduces the concurrency of the overall system.

Optimistic concurrency control techniques proceed on the assumption that zero or very few transactions will try to access a given resource simultaneously. We can describe a system as FULLY OPTIMISTIC if it uses no locks at all and checks for conflicts at commit time. It has the following 4 phases of operation:


Some OCC methods

1. Timestamp Ordering (OCC) version

💡 Core Idea:

Transactions execute without locking during their read/write phase.
Conflicts are only checked during the commit phase using timestamps.
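A bare-bones sketch of the idea in Python: all writes are buffered locally, and at commit the transaction validates its read set against everything that committed while it was running (this is backward validation; the structure is illustrative, not a production design):

```python
# Sketch: optimistic execution with validation at commit time.
committed_write_sets = []  # write sets of committed transactions, in order

class OCCTransaction:
    def __init__(self, db):
        self.db = db
        self.start = len(committed_write_sets)  # commits visible at our start
        self.read_set, self.write_buffer = set(), {}

    def read(self, key):                 # 1. Read phase: no locks taken
        self.read_set.add(key)
        return self.write_buffer.get(key, self.db.get(key))

    def write(self, key, value):
        self.write_buffer[key] = value   # buffered locally; DB untouched

    def commit(self):
        # 2. Validation: did anyone who committed after we started write
        # something we read? If so, our reads may be stale -> abort.
        for write_set in committed_write_sets[self.start:]:
            if write_set & self.read_set:
                return False             # abort; the caller may retry
        self.db.update(self.write_buffer)             # 3. Write phase
        committed_write_sets.append(set(self.write_buffer))
        return True

db = {"A": 100}
t = OCCTransaction(db)
t.write("A", t.read("A") - 30)
print(t.commit(), db)  # True {'A': 70}
```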


🔄 Phases in OCC:

🔄 1. Read Phase

✍️ 2. Validation Phase (Key Phase for OCC)

3. Write Phase

🧠 Example of OCC with Timestamps

Let’s say:

T1 writes A → W(T1) =

T2 reads A → R(T2) =

Now, apply validation:


🔄 Key Difference vs Basic Timestamp Ordering

| Feature | Basic TO Protocol | OCC with Timestamps |
|---|---|---|
| When conflicts are checked | On every read/write | Only during validation (before commit) |
| Abort timing | During execution | At commit time |
| Overhead | Constant checks | Buffered work + validation logic |
| Best suited for | High-conflict workloads | Low-conflict workloads |

📋 Summary Table:

| Phase | What Happens |
|---|---|
| Read | Transaction reads freely, stores changes locally |
| Validation | Timestamp-based check against committed transactions |
| Write | If valid, writes changes to the DB; otherwise, aborts |

✅ Key Points:


2. Multi-Version Concurrency Control (MVCC)

In MVCC, every data item has multiple versions of itself. When a transaction starts, it reads the version that is valid at the start of the transaction. And when the transaction writes, it creates a new version of that specific data item. That way, every transaction can concurrently perform their operations.

Each successful write results in the creation of a new version of the data item, and timestamps are used to label the versions. When a read(X) operation is issued, the system selects the appropriate version of X based on the timestamp of the issuing transaction.

Example: in a banking system, two or more users can transfer money simultaneously without blocking each other.
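A compact sketch of the version-selection rule: every write appends a (write_ts, value) version, and a read with timestamp ts sees the newest version written at or before ts. Readers therefore never block writers, and vice versa:

```python
import bisect

# Sketch: one multi-versioned data item. Versions stay sorted by write
# timestamp; old versions are kept so older readers still see them.
class MVCCItem:
    def __init__(self, initial):
        self.versions = [(0, initial)]  # (write_ts, value), sorted by write_ts

    def write(self, ts, value):
        bisect.insort(self.versions, (ts, value))  # new version; old ones remain

    def read(self, ts):
        # Newest version with write_ts <= ts (the version valid at time ts).
        i = bisect.bisect_right(self.versions, (ts, float("inf"))) - 1
        return self.versions[i][1]

A = MVCCItem(100)
A.write(ts=10, value=50)
print(A.read(ts=5))   # 100 -> an old reader still sees the version valid at its time
print(A.read(ts=20))  # 50  -> a newer reader sees the newer version
```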


Database Recovery

🛠️ What is Database Recovery?

Database recovery refers to the set of techniques used to restore a database to a correct state after a failure—whether it's due to a system crash, power loss, software bug, or disk error. The goal is to ensure data consistency, atomicity, and durability despite failures.


🔄 Why is it Important?

Databases follow the ACID properties (Atomicity, Consistency, Isolation, Durability). Recovery ensures atomicity (unfinished transactions leave no partial effects behind) and durability (committed work survives the crash).


💾 Recovery Concepts:

  1. Logs (Write-Ahead Logging - WAL):

    • Every change is first written to a log file before being applied to the database.
    • These logs are used to redo committed transactions and undo uncommitted ones during recovery.
  2. Checkpoints:

    • Periodically, the database saves a snapshot of the current state to avoid redoing the entire log from the beginning.
    • Speeds up recovery by telling the system: "You can start recovering from here."
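To make the log idea concrete, here is a toy log replay in Python. The record format (txn, key, old_value, new_value) plus ('COMMIT', txn) markers is invented for illustration; real logs carry much more (LSNs, page ids, checkpoints). Committed work is redone, uncommitted work is undone:

```python
# Toy write-ahead-log replay. Record formats are invented for illustration.
log = [
    ("T1", "A", 100, 40),
    ("COMMIT", "T1"),
    ("T2", "B", 200, 999),   # T2 never committed before the crash
]

def recover(db, log):
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    for rec in log:                       # REDO pass: reapply committed changes
        if rec[0] != "COMMIT" and rec[0] in committed:
            txn, key, old, new = rec
            db[key] = new
    for rec in reversed(log):             # UNDO pass: roll back uncommitted ones
        if rec[0] != "COMMIT" and rec[0] not in committed:
            txn, key, old, new = rec
            db[key] = old
    return db

print(recover({"A": 100, "B": 999}, log))  # {'A': 40, 'B': 200}
```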

🔃 Recovery Steps (Simplified):

On system restart after a crash: scan the log from the last checkpoint, REDO the changes of all transactions that committed, and UNDO the changes of all transactions that did not.


🧠 Techniques Used:

The two classic log-based approaches, deferred update and immediate update, are compared in the next section.


✅ Summary:

Database recovery is all about maintaining correctness after failure. It uses logs, checkpoints, and rollback/redo mechanisms to bring the system back to a consistent state, ensuring no data is lost or wrongly retained.


Immediate vs Deferred Update in Database Recovery

These two approaches define when the database gets physically updated during a transaction's execution.

🟢 Deferred Update (Postponed Update)

🔹 Concept:

The database is not physically updated until the transaction reaches its commit point; until then, all changes are recorded in the log and kept in buffers.

🔹 Key Points:

  • UNDO is never needed: an uncommitted transaction has changed nothing on disk.
  • REDO may be needed: a committed transaction's changes may not have reached the disk before a crash.


🔹 Example:

Imagine a transaction T1 wants to set A = A + 100. The new value of A lives only in the log and buffers while T1 runs; the stored value changes only after T1 commits. If the system crashes before the commit, the stored A is still the old value and nothing has to be undone.

🔹 Pros:

🔹 Cons:


🔴 Immediate Update

🔹 Concept:

The database may be physically updated while the transaction is still executing, before it commits; the old value is written to the log first so the change can be undone if necessary.


🔹 Key Points:

  • UNDO is needed: an uncommitted transaction may already have changed the disk.
  • REDO is also needed: a committed transaction's changes may not all have reached the disk.


🔹 Example:

For the same T1, A = A + 100: A is updated on disk as soon as T1 performs the write. If T1 aborts or the system crashes before commit, recovery uses the old value saved in the log to UNDO the change; if T1 committed, REDO reapplies it if needed.


🔹 Pros:


🔹 Cons:


⚖️ Summary Table:

| Feature | Deferred Update | Immediate Update |
|---|---|---|
| When DB is updated | After commit | During execution |
| Needs UNDO | ❌ No | ✅ Yes |
| Needs REDO | ✅ Yes | ✅ Yes |
| Recovery complexity | Simple | Complex |
| Speed | Slower (delayed effect) | Faster (instant update) |
| Example usage | Batch systems | Real-time transaction systems |