Distributed Query Optimization -- Distributed Systems -- Module 3


Index

  1. Query Processing
  2. Objectives of Query Processing
  3. Characteristics of a Query Processor
  4. Phases of Distributed Query Processing
  5. Query Decomposition
  6. Query Optimization
  7. Factors affecting Query Optimization
  8. Query Optimization in Centralized Systems
  9. Ordering of Fragment Queries
  10. Taxonomy of transaction models
  11. Distributed Query Optimization Algorithms
  12. βœ… 1. Greedy Heuristic Algorithm
  13. βœ… 2. Dynamic Programming Algorithm (Selinger’s Algorithm)
  14. Transaction Management
  15. Concurrency control in centralized database systems
  16. Distributed Concurrency Control Algorithms
  17. 1. Distributed Two-Phase Locking (D2PL)
  18. 2. Distributed Timestamp Ordering (DTO)
  19. 🧾 Distributed Commit Protocols (Commit Algorithms)
  20. Deadlock Management
  21. Detecting deadlocks using Wait-For graphs -- examples

Continuation of Module 2 topics.

Query Processing

What is Query Processing?

Query processing is the process of taking a user's high-level query (usually written in SQL), interpreting it, and executing it to retrieve the required data from a distributed database system β€” where data is stored across multiple locations (sites).

It involves several steps such as query decomposition, data localization, global query optimization, local optimization, and query execution.

Objectives of Query Processing

1. Correctness


2. Efficiency


3. Optimization


4. Transparency


5. Scalability


Characteristics of a Query Processor

1. Languages


2. Types of Optimization

Multiple strategies can be used for executing a query.

Two main approaches:


3. Statistics

Used for better query optimization:

The processor uses statistical information to make better decisions. This includes:


4. Decision Sites


5. Exploitation of Network Topology


6. Exploitation of Replicated Fragments

When data is replicated at multiple sites, the processor can:


Phases of Distributed Query Processing

https://www.ques10.com/p/21460/explain-the-phases-of-query-processing-in-distribu/

https://www.youtube.com/watch?v=Y7SamqOKx1E

Pasted image 20250522135604.png

Pasted image 20250522135515.png

1. Query Decomposition

The first layer decomposes the calculus query into an algebraic query on global relations. The information needed for this transformation is found in the global conceptual schema describing the global relations.

Query decomposition can be viewed as four successive steps.

Example:

We have a query:

SELECT salary FROM instructor WHERE salary < 75000;

This query can be translated into either of the following relational algebra expressions:

Οƒsalary<75000(Ο€salary(instructor))

or

Ο€salary(Οƒsalary<75000(instructor))


2. Data Localization

The main role of the second layer is to localize the query's data using data distribution information in the fragment schema.

This layer determines which fragments are involved in the query and transforms the distributed query into a query on fragments.

Generating a query on fragments is done in two steps


3. Global Query Optimization

The goal of query optimization is to find an execution strategy for the query which is close to optimal.


4. Local Optimization

βœ… Goal: Make each local operation as efficient as possible.


5. Query Execution


Query Decomposition

Already explained in Query Decomposition


Data Localization

In a distributed database, data is not stored in one place β€” it's fragmented and distributed across multiple sites or servers. After query decomposition (which gives us a logical query), data localization transforms this logical query into subqueries that correspond to actual data fragments located across different sites.

Why Is This Step Needed?

So we need to rewrite the query to target the correct fragments.


Main Tasks in Data Localization

1. Identify Relevant Fragments

Example:

SELECT name FROM Employees WHERE region = 'EU';

If Employees is horizontally fragmented:

The query is rewritten to:

SELECT name FROM Employees_EU WHERE region = 'EU';

2. Generate Subqueries for Each Site


3. Handle Data Replication


Types of Fragmentation and How They Affect Localization

| Fragmentation Type | Description | Localization Behavior |
| --- | --- | --- |
| Horizontal | Each fragment holds a subset of rows. | Query sent only to relevant fragments (based on WHERE clause). |
| Vertical | Each fragment holds a subset of columns. | Requires joining fragments using a common key. |
| Hybrid | Combination of both. | Complex mapping; needs rewriting logic based on both row and column placement. |

Module 3 Topics start here

Query Optimization

Factors affecting Query Optimization

1. Data Distribution


2. Communication Cost

Example: Sending a large table over the network is slow, so it's better to do filtering (WHERE conditions) first at the site where the data is stored.


3. Local Processing Cost

This is the cost of processing data at each individual site.
It includes:


4. Query Structure


5. Replication of Data


6. Differences in Site Capabilities


7. Current System Load


8. Execution Strategy


9. Availability of Statistics

The optimizer uses information (called metadata) like:


10. Optimization Method Used


Query Optimization in Centralized Systems

In a centralized system, query processing is done with the following aim:


Query Parsing and Translation

Initially, the SQL query is scanned. Then it is parsed to check for syntactical errors and the correctness of data types. If the query passes this step, it is decomposed into smaller query blocks, and each block is translated into an equivalent relational algebra expression.


Steps for Query Optimization

Query optimization involves three steps, namely query tree generation, plan generation, and query plan code generation.

Step 1: Query Tree Generation

A query tree is a tree data structure representing a relational algebra expression. The tables of the query are represented as leaf nodes. The relational algebra operations are represented as the internal nodes. The root represents the query as a whole.

During execution, an internal node is executed whenever its operand tables are available. The node is then replaced by the result table. This process continues for all internal nodes until the root node is executed and replaced by the result table.

Example 1

For example, let's say we have the following schemas:

EMPLOYEE

EmpID EName Salary DeptNo DateOfJoining

DEPARTMENT

DNo DName Location

And we have this query:

Ο€EmpID(ΟƒEName="ArunKumar"(EMPLOYEE))

The corresponding query tree will be:

Pasted image 20250522143849.png


Example 2

Let us consider another query involving a join.

Ο€EName,Salary(ΟƒDName="Marketing"(DEPARTMENT) β‹ˆDNo=DeptNo (EMPLOYEE))

So the query tree will be like this:

Pasted image 20250522144124.png

Note how the joining point of the two branches of this tree is β‹ˆDNo=DeptNo, where the β‹ˆ operator (pronounced "bowtie") denotes the natural join of two relations.


Step 2: Query Plan Generation

After the query tree is generated, a query plan is made. A query plan is an extended query tree that includes access paths for all operations in the query tree. Access paths specify how the relational operations in the tree should be performed. For example, a selection operation can have an access path that gives details about the use of B+ tree index for selection.

A query plan also states how intermediate tables should be passed from one operator to the next, how temporary tables should be used, and how operations should be pipelined or combined.


Step 3: Code Generation

Code generation is the final step in query optimization. It produces the executable form of the query, which depends on the type of the underlying operating system. Once the query code is generated, the Execution Manager runs it and produces the results.


Approaches to Query Optimization

Among the approaches for query optimization, exhaustive search and heuristics-based algorithms are mostly used.

Exhaustive Search Optimization

In these techniques, all possible query plans for a query are generated first, and then the best plan is selected. Though these techniques find the best solution, they have exponential time and space complexity owing to the large solution space. Dynamic programming is an example of this approach.

Heuristic Based Optimization

Heuristic based optimization uses rule-based optimization approaches for query optimization. These algorithms have polynomial time and space complexity, which is lower than the exponential complexity of exhaustive search-based algorithms. However, these algorithms do not necessarily produce the best query plan.


Ordering of Fragment Queries

πŸ” What is it?

In a distributed database, data is fragmented and stored across multiple sites. When a user sends a query, the system needs to:

  1. Identify which fragments are relevant.
  2. Determine how to retrieve and combine those fragments efficiently.
    This is known as ordering fragment queries or query decomposition and optimization in DDBS.

πŸ”„ What Happens During Fragment Query Ordering?

In a distributed database, data is typically fragmented to improve performance, availability, and manageability. These fragments can be:

When a user sends a high-level query, the system must:

  1. Decompose the query into sub-queries that target specific fragments.
  2. Optimize how those sub-queries are executed across distributed sites.
  3. Assemble the results efficiently.

🧭 Steps in Fragment Query Ordering

Let’s break this down with an example query and walk through what the system does:

❓ Example Query:

SELECT name 
FROM Employee 
WHERE department = 'Sales' AND salary > 50000;

🧩 Suppose:


🧱 Step-by-Step Breakdown

1. Query Decomposition


2. Query Localization


3. Query Optimization and Ordering


4. Result Assembly


βš™οΈ Factors That Affect the Ordering

| Factor | Description |
| --- | --- |
| Data location | Where each fragment resides affects transfer costs. |
| Size of fragments | Querying larger fragments first may delay the final result. |
| Selectivity of conditions | Fragments likely to return fewer rows may be prioritized. |
| Network bandwidth | Sites with faster connections might be queried first. |
| Parallel capabilities | If sites can be queried in parallel, ordering might be less critical. |

πŸ’‘ Optimization Strategy Examples

Greedy Heuristic:

Choose the fragment with the lowest estimated cost first, then continue optimizing remaining parts based on intermediate results.

Cost-Based Optimization:

Estimate communication + computation costs of all possible query plans, and choose the most efficient one.

Semi-join Strategy (important for DDBS):

Instead of pulling full tables to one site for the join, do a semi-join to reduce data movement.


πŸ” Real-World Analogy

Think of trying to find all salespeople in a company. Your records are split across offices in two cities. You can either:

How you choose depends on:


Distributed Query Optimization Algorithms

In a distributed database, optimizing a query is more complex than in centralized DBMS because of:

So the goal is to find the most cost-effective plan for query execution across sites.


🧠 Static vs Dynamic Optimization

1. Static Optimization (Compile-Time)

βœ”οΈ Pros:

❌ Cons:


2. Dynamic Optimization (Run-Time)


βœ”οΈ Pros:


❌ Cons:

πŸ”Ž In exams, static optimization is usually the default unless specifically stated otherwise.


βš™οΈ Key Optimization Algorithms You Should Know

Let’s cover a core set of algorithms β€” enough for your syllabus without overwhelming you.

βœ… 1. Greedy Heuristic Algorithm

πŸ”Ή Concept:


πŸ”Ή Simple Example:

Suppose you have 3 relations:

And the query is:

SELECT * FROM R, S, T WHERE R.b = S.b AND S.c = T.c;

Let’s assume we know the estimated join sizes:

| Join | Estimated Result Size |
| --- | --- |
| R β‹ˆ S | 800 |
| S β‹ˆ T | 200 |
| R β‹ˆ T | Invalid (no direct join) |
βœ… Greedy Steps:
  1. Pick smallest join cost first β†’ S β‹ˆ T = 200
  2. Next, join result with R β†’ (S β‹ˆ T) β‹ˆ R = ??

Let’s say that yields 900 final tuples.
Done! That’s your plan:

Plan: (S β‹ˆ T) β‹ˆ R
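Below is a minimal Python sketch of this greedy idea. The function name greedy_join_order, the dict of estimated join sizes, and the JOIN-string plan representation are illustrative assumptions, not part of any particular DBMS:

```python
# Greedy join ordering: always pick the cheapest join that is currently possible.
def greedy_join_order(relations, join_sizes):
    """relations: set of names, e.g. {"R", "S", "T"};
    join_sizes: estimated result size for each joinable pair (frozenset)."""
    remaining = set(relations)
    # Start with the cheapest joinable pair (S-T = 200 in the example above).
    first = min((pair for pair in join_sizes if pair <= remaining), key=join_sizes.get)
    a, b = sorted(first)
    plan, joined = f"({a} JOIN {b})", set(first)
    remaining -= joined
    # Greedily attach whichever remaining relation joins most cheaply with the result.
    while remaining:
        candidates = [r for r in remaining
                      if any(frozenset({r, j}) in join_sizes for j in joined)]
        nxt = min(candidates,
                  key=lambda r: min(join_sizes[frozenset({r, j})]
                                    for j in joined if frozenset({r, j}) in join_sizes))
        plan, joined = f"({plan} JOIN {nxt})", joined | {nxt}
        remaining.remove(nxt)
    return plan

sizes = {frozenset({"R", "S"}): 800, frozenset({"S", "T"}): 200}
print(greedy_join_order({"R", "S", "T"}, sizes))   # ((S JOIN T) JOIN R)
```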

βœ”οΈ Suitable For:


βœ… 2. Dynamic Programming Algorithm (Selinger’s Algorithm)

🧩 Idea:

βœ”οΈ Suitable For:


πŸ”Ή Simple Example (Same query as before)

Relations: R, S, T
Goal: Evaluate R β‹ˆ S β‹ˆ T

πŸ”Έ Step 1: Compute best plans for pairs
| Pair | Join Cost |
| --- | --- |
| R β‹ˆ S | 800 |
| S β‹ˆ T | 200 |
| R β‹ˆ T | Not allowed ❌ |

Best plans:

πŸ”Έ Step 2: Combine into 3-way join
| Set | Best Subplan | Cost |
| --- | --- | --- |
| R, S, T | (S β‹ˆ T) β‹ˆ R | 900 |
| R, S, T | (R β‹ˆ S) β‹ˆ T | 1100 |

Pick minimum β†’ βœ… Final plan = (S β‹ˆ T) β‹ˆ R
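A rough Python sketch of the same idea, in the spirit of Selinger-style dynamic programming. The SIZES table and the cost model (simply summing estimated join result sizes) are assumptions made for illustration, so the absolute costs will not match the table above exactly, but the winning plan still joins S and T first, as in the example:

```python
from itertools import combinations

# Selinger-style dynamic programming over subsets of relations (sketch).
SIZES = {frozenset({"R", "S"}): 800,         # estimated join result sizes
         frozenset({"S", "T"}): 200,         # (R-T has no join predicate)
         frozenset({"R", "S", "T"}): 900}

def dp_join_order(relations):
    best = {frozenset([r]): (0, r) for r in relations}   # subset -> (cost, plan)
    for k in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(sorted(relations), k)):
            if subset not in SIZES:
                continue                                  # not joinable as a unit
            for split in range(1, k):
                for left in map(frozenset, combinations(sorted(subset), split)):
                    right = subset - left
                    if left not in best or right not in best:
                        continue                          # one side has no valid plan
                    cost = best[left][0] + best[right][0] + SIZES[subset]
                    plan = f"({best[left][1]} JOIN {best[right][1]})"
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, plan)
    return best[frozenset(relations)]

print(dp_join_order({"R", "S", "T"}))   # the winning plan joins S and T first
```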


πŸ’‘ Why it’s useful:

It can be slow when there are many joins, so real DBMSs limit the search space using heuristics.

πŸ’‘ You can mention Selinger’s approach as a classical cost-based optimizer. Very common in university exams.


βœ… 3. Semi-join-Based Optimization

πŸ”Ή Concept:

In distributed systems, the goal is to reduce data transfer.
Instead of sending whole tables across the network, use semijoins to filter only relevant data.


πŸ”Ή What is a Semi-join?

A semi-join R ⋉ S returns only those tuples of R that match with S, but doesn’t include S’s columns.


πŸ”Ή Simple Distributed Example:

Assume:

SELECT eid, dname FROM Employee E, Department D WHERE E.did = D.did;

If we transfer both tables: high cost (especially if Employee is huge)

βœ… Semi-join Strategy:
  1. Step 1 (Reduce Data Transfer):

    • Project join attribute: Ο€did(Employee) β€” send this from Site A to Site B
  2. Step 2 (Filter Department):

    • At Site B: compute Department ⋉ Ο€did(Employee) β€” keep only the departments whose did appears in the shipped list
  3. Step 3 (Send Back Filtered Departments):

    • Send filtered Department table back to Site A
  4. Step 4 (Join locally at Site A):

    • Final join with original Employee table
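A toy Python sketch of this data flow, assuming Employee lives at Site A and Department at Site B and using small in-memory lists of dicts as stand-ins for the tables. The point is that only the did values and the reduced Department rows cross the network:

```python
# Site A holds Employee, Site B holds Department (toy in-memory stand-ins).
employee_at_A = [{"eid": 1, "did": 10}, {"eid": 2, "did": 10}, {"eid": 3, "did": 30}]
department_at_B = [{"did": 10, "dname": "Sales"},
                   {"did": 20, "dname": "HR"},
                   {"did": 30, "dname": "IT"}]

# Step 1: project the join attribute at Site A and "ship" it to Site B (small payload).
shipped_dids = {e["did"] for e in employee_at_A}

# Step 2: semi-join at Site B -- keep only departments that match shipped dids.
reduced_department = [d for d in department_at_B if d["did"] in shipped_dids]

# Step 3: ship the reduced Department back to Site A (smaller than the full table).
# Step 4: final join locally at Site A.
dept_by_did = {d["did"]: d for d in reduced_department}
result = [{"eid": e["eid"], "dname": dept_by_did[e["did"]]["dname"]}
          for e in employee_at_A if e["did"] in dept_by_did]
print(result)   # eid/dname pairs; HR was never shipped because no employee needs it
```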

πŸ’‘ Why it’s useful:


βœ… 4. Two-Phase Optimization

🧩 Idea:

βœ”οΈ Pros:


βœ… 5. Iterative Improvement / Genetic Algorithms (optional)


πŸ“š Summary for Exam-Level Understanding:

| Algorithm | Type | Notes |
| --- | --- | --- |
| Greedy Heuristic | Static | Simple, fast, suboptimal plans |
| Dynamic Programming (Selinger) | Static | Best plan, expensive for many joins |
| Semijoin Strategy | Static/Dynamic | DDBS-specific, reduces data transfer |
| Two-Phase Optimization | Static | Modular; separates local and global |
| Dynamic Optimization | Dynamic | Adaptive, less common in basic DBMS |
| (Genetic, etc.) | Optional | Mention if asked for advanced types |

Transaction Management

What is a transaction?

https://www.youtube.com/watch?v=t5hsV9lC1rU&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=74

(Content added from DBMS module 4)

1. Definition of a Transaction

This sequence of operations forms one transaction.


2. Core Operations in a Database Transaction


3. Why Transactions Are Important in DBMS


4. Summarizing the Process of a Transaction

  1. Read Data:
    The necessary data is read from the hard disk into RAM.

  2. Perform Operations:
    The transaction performs its logical unit of work (e.g., deducting from one account, adding to another). These operations are processed in RAM for speed.

  3. Write Data:
    The updated values are stored in RAM as the transaction is processed.

  4. Commit:
    Finally, the commit operation saves the changes permanently to the hard disk.

  5. (Optional) Rollback:
    If any error occurs during the process, a rollback can revert the changes to maintain data consistency.
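A minimal sketch of this read, operate, commit/rollback cycle, where an in-memory dict named disk stands in for permanent storage (the names and amounts are made up):

```python
# Toy transfer transaction: read -> operate in "RAM" -> commit or rollback.
disk = {"A": 1000, "B": 500}                 # pretend this dict is the hard disk

def transfer(amount):
    ram = {k: disk[k] for k in ("A", "B")}   # 1. read data from disk into RAM
    try:
        if ram["A"] < amount:
            raise ValueError("insufficient funds")
        ram["A"] -= amount                   # 2-3. perform the logical unit of work in RAM
        ram["B"] += amount
        disk.update(ram)                     # 4. commit: make the changes permanent
        print("committed:", disk)
    except ValueError as err:
        # 5. rollback: discard the RAM copies; the disk was never touched
        print("rolled back:", err, "| disk still:", disk)

transfer(200)      # commits
transfer(10_000)   # rolls back, disk unchanged
```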


Goals of Transaction Management

Transaction management in a database system ensures that the database remains in a consistent, reliable, and recoverable state, even in the presence of system failures or concurrent access by multiple users. The primary goals of transaction management include:

1. Maintaining Database Consistency

2. Ensuring Atomicity

3. Guaranteeing Isolation

4. Providing Durability

5. Supporting Recovery from Failures

6. Handling Concurrent Access Efficiently


Characteristics of transactions

ACID properties of Transaction

https://www.youtube.com/watch?v=-GS0OxFJsYQ&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=75

1. Atomicity


2. Consistency


3. Isolation


4. Durability


Conclusion

The ACID properties ensure that database transactions are processed reliably and safely:

These principles are fundamental in DBMS design and are crucial for systems like banking applications, where accuracy and reliability of transactions are paramount.


States of a transaction

https://www.youtube.com/watch?v=ObwYFVLB_VI&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=76

Pasted image 20250315142406.png

1. Transaction States Overview


2. Active State


3. Partially Committed State


4. Commit State


5. Termination (Deallocation)


6. Failed (Abort) State and Rollback


Conclusion


Taxonomy of transaction models

https://www.geeksforgeeks.org/flat-nested-distributed-transactions/

Here are some terms we should know before understanding this:

And some transaction commands:

Various roles are allocated to running a transaction successfully :

Why a taxonomy?
Traditional DBMS transactions follow strict ACID properties. However, in modern and distributed systems (like cloud services, mobile apps, or collaborative workflows), these properties can sometimes become too rigid. Hence, new transaction models have emerged, each balancing consistency, availability, and performance differently.


1. Flat Transactions (Traditional Model)

Pasted image 20250523000901.png

A client makes requests to multiple servers in a flat transaction. Transaction T, for example, is a flat transaction that performs operations on objects in servers X, Y, and Z.

Before moving on to the next request, a flat client transaction completes the previous one. As a result, the transaction visits the server objects sequentially.
When servers use locking, a transaction can wait for only one object at a time.

Limitations of a flat Transaction :


2. Nested Transactions

Pasted image 20250523001036.png

Consider a distributed transaction (T) in which a customer transfers:

Assuming :

  1. Account A is on server X
  2. Account B is on server Y,and
  3. Accounts C and D are on server Z.

The transaction T involves four requests, two withdrawals and two deposits, which can be treated as sub-transactions (T1, T2, T3, T4) of transaction T.

As shown in the figure below, transaction T is designed as a set of four nested transactions : T1, T2, T3 and T4.

Pasted image 20250523001148.png

// Start the transaction
T = openTransaction

//T1
openSubtransaction
a.withdraw(105);


//T2
openSubtransaction
b.withdraw(205);


//T3
openSubtransaction
c.deposit(105);


//T4
openSubtransaction
d.deposit(205);


// End the transaction
closeTransaction

3. Distributed Transactions


4. Real-Time Transactions


5. Mobile Transactions


6. Temporal Transactions


Summary Table

| Model | Use Case | Key Feature |
| --- | --- | --- |
| Flat | Simple, short operations | All-or-nothing ACID |
| Nested | Modular tasks | Sub-transaction hierarchy |
| Distributed | Across multiple DBs | Two/Three-phase commit |
| Real-Time | Embedded systems | Time-bound constraints |
| Mobile | Offline mobile apps | Deferred and sync-based |
| Temporal | Time-sensitive data | Tracks changes over time |

Concurrency control

(DBMS module 4 recap)

To skip to DS-oriented topics click here: Concurrency control in centralized database systems

1. Shared-Exclusive Locking Protocol

https://www.youtube.com/watch?v=94C0V7f2zm4&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=87

This is one of the basic concurrency control mechanisms in databases. Its main purpose is to:

🧠 Two Types of Locks:

  1. Shared Lock (S-lock):

    • Allows only reading.
    • Multiple transactions can hold a shared lock on the same data at the same time.
    • No changes can be made to the data.
  2. Exclusive Lock (X-lock):

    • Allows both reading and writing.
    • Only one transaction can hold an exclusive lock on a data item.
    • No other transaction can access the data in any form while this lock is held.

When to Use Which?

Examples:


Compatibility Table

This table determines whether a new lock request is allowed based on the currently held lock on the data item.

| Currently Held (Granted) | Requested Lock | Allowed? |
| --- | --- | --- |
| Shared | Shared | βœ… Yes |
| Shared | Exclusive | ❌ No |
| Exclusive | Shared | ❌ No |
| Exclusive | Exclusive | ❌ No |
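A tiny Python sketch of how a lock manager could apply this compatibility matrix; the LockManager class and its request method are illustrative, not a real DBMS API:

```python
# Shared/Exclusive compatibility: only S-S can coexist.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockManager:
    def __init__(self):
        self.held = {}                       # data item -> list of (txn, mode)

    def request(self, txn, item, mode):
        """Grant the lock only if it is compatible with every lock already held."""
        current = self.held.setdefault(item, [])
        if all(COMPATIBLE[(held_mode, mode)] for _, held_mode in current):
            current.append((txn, mode))
            return True                      # granted
        return False                         # caller must wait / be queued

lm = LockManager()
print(lm.request("T1", "A", "S"))   # True  - shared lock granted
print(lm.request("T2", "A", "S"))   # True  - shared locks can coexist
print(lm.request("T3", "A", "X"))   # False - exclusive conflicts with shared
```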

πŸ”Ž Why?


Lock Lifecycle

For each transaction:

  1. Lock the required data item(s) using S or X.
  2. Perform operations (read/write).
  3. Unlock when done.

This is typically implemented with the help of a lock manager inside the database that follows the compatibility rules.


Goal of the protocol


Drawbacks of Shared-Exclusive Locking

https://www.youtube.com/watch?v=UsqtDD1VriY&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=88

πŸ”’ Shared-Exclusive Locking: Quick Recap


1. May Not Guarantee Serializability

"May not be sufficient to produce only serializable schedules."


2. May Not Ensure Recoverability (Irrecoverable Schedules Possible)

"May not be free from irrecoverability."


3. Possibility of Deadlock

"May not be free from deadlock."


4. Possibility of Starvation

"May not be free from starvation."


Summary of All Key Drawbacks

| Drawback | Description |
| --- | --- |
| ❌ Not always serializable | Locking doesn't guarantee serializable schedules |
| ❌ Irrecoverable schedules | Transactions may commit after reading dirty data, breaking consistency |
| ⚠️ Deadlock possible | Circular waiting on locks causes indefinite blocking |
| ⚠️ Starvation possible | Some transactions may never get the lock if others keep acquiring it |

2. 2-Phase Locking (2PL)

https://www.youtube.com/watch?v=1pUaEDNLWi4&list=PLxCzCOWd7aiFAN6I8CuViBuCdJgiOkT2Y&index=89

In simpler locking protocols (like Shared & Exclusive locks), inconsistencies or non-serializable schedules may occur due to poor coordination. 2PL adds structure to how and when locks are acquired or released during a transaction to avoid such problems.


🧩 2 Phases in 2PL

  1. πŸ”Ό Growing Phase

    • You can acquire (take) locks (Shared or Exclusive).
    • You cannot release any locks.
    • Think of this as the β€œgather all resources you need” phase.
  2. πŸ”½ Shrinking Phase

    • You release locks.
    • You are not allowed to acquire any new locks.
    • Once a transaction releases its first lock, it enters this phase, and cannot go back.

πŸ” The transition from growing to shrinking happens the moment a transaction releases any lock.
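A minimal sketch of how this rule could be enforced per transaction, assuming a small Transaction2PL helper that only tracks whether the shrinking phase has started (lock conflicts between transactions are ignored here):

```python
# Enforce the 2PL rule: no new locks once any lock has been released.
class Transaction2PL:
    def __init__(self, name):
        self.name, self.locks, self.shrinking = name, set(), False

    def acquire(self, item, mode):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot acquire {mode}-lock on {item} "
                               "after entering the shrinking phase")
        self.locks.add((item, mode))         # growing phase

    def release(self, item, mode):
        self.locks.discard((item, mode))
        self.shrinking = True                # first release starts the shrinking phase

t1 = Transaction2PL("T1")
t1.acquire("A", "X")
t1.acquire("B", "S")
t1.release("A", "X")
try:
    t1.acquire("C", "S")                     # violates 2PL
except RuntimeError as err:
    print(err)
```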


βœ… How Does 2PL Achieve Serializability?

Let’s take an example:

In 2PL:

This creates a natural serial order β€” since T2 had to wait, it can be considered to have happened after T1.

So:
⏳ Order of lock acquisition determines order of transaction execution, even if they're overlapping in real time.


🧠 Key Concepts You Must Remember

| Term | Explanation |
| --- | --- |
| Growing Phase | Only lock acquisition allowed |
| Shrinking Phase | Only lock release allowed |
| Lock-Point | The point where a transaction acquires its last lock (or equivalently, releases its first lock) β€” helps determine the serialization order |
| Compatibility Table | Dictates whether two locks can coexist, e.g. Shared-Shared = βœ…, Shared-Exclusive = ❌ |

βš™ Example Breakdown

Let’s say:

What happens?


πŸ›‘ Common Misconceptions Cleared


Drawbacks of 2PL

1. Irrecoverable Schedules

πŸ” What it means:

A schedule is recoverable if a transaction T2 that reads data written by T1 commits only after T1 commits.

❌ Problem:

In 2PL, there is no rule that enforces this ordering of commits. So:

You cannot roll back T2 even though it read uncommitted/dirty data. This leads to an irrecoverable schedule, which violates atomicity.


2. Cascading Rollbacks

πŸ” What it means:

A cascading rollback happens when a failure in one transaction forces multiple other transactions to roll back, due to them having read its data.

❌ Problem:

2PL allows this to happen:

This leads to a chain of rollbacks, which hurts performance and consistency.

πŸ’‘ Even though 2PL controls how locks are acquired/released, it doesn’t stop reads from uncommitted writes, which causes the issue.


3. Deadlocks

πŸ” What it means:

A deadlock occurs when two or more transactions wait for each other indefinitely to release locks.

❌ Problem:

2PL doesn’t avoid deadlocks. For example:

Now, both transactions are stuckβ€”circular waitingβ€”and neither can proceed.

Deadlocks are a classic problem in locking systems and need separate detection and recovery mechanisms, like:


4. Starvation

πŸ” What it means:

Starvation happens when a transaction never gets a chance to execute because other transactions keep overtaking it.

❌ Problem:

Under 2PL, a transaction can be perpetually delayed in acquiring a lock if:

This leads to starvation of long-waiting transactions.


πŸ”„ Summary of Drawbacks:

| Issue | Possible in 2PL? | Why? |
| --- | --- | --- |
| Irrecoverability | βœ… Yes | No commit-order enforcement |
| Cascading Rollbacks | βœ… Yes | No control on dirty reads |
| Deadlocks | βœ… Yes | No deadlock avoidance |
| Starvation | βœ… Yes | No fairness policy |

βœ… How to fix them?

To address these problems, stronger protocols can be used, such as:

  1. Strict 2PL:

    • All exclusive locks are held until commit/abort
    • Prevents cascading rollbacks
    • Guarantees recoverable and cascade-less schedules
  2. Rigorous 2PL:

    • All locks (shared and exclusive) are held till commit
    • Ensures both recoverability and serializability
  3. Timestamp Ordering / Optimistic Concurrency:

    • To avoid deadlocks entirely (no locks involved)

So, what shared and exclusive locking gave us was two lock modes:

Enter 2-Phase locking:


Strict 2PL and Rigorous 2PL

Both these protocols build on the basics 2PL protocol and add their own rules on top.

πŸ” Strict 2-Phase Locking (Strict 2PL)

🧱 Base Foundation:

πŸ“Œ Additional Rule in Strict 2PL:


πŸ›  What Problems It Solves:

  1. Cascading Rollback:

    • If another transaction reads a value written by an uncommitted transaction, and the writer aborts β†’ the reader must also roll back (cascading).
    • Strict 2PL prevents this by not allowing reads on uncommitted data.
  2. Irrecoverability:

    • A committed transaction reading from a transaction that later aborts β†’ makes rollback impossible.
    • Prevented because reads can't happen until the writer commits.

🧠 Insight:


🧱 Rigorous 2-Phase Locking (Rigorous 2PL)

πŸ”’ Even More Restrictive:

πŸ’‘ Key Difference from Strict 2PL:


βœ… What It Guarantees:


πŸ“Š Summary of Comparison:

| Property | Basic 2PL | Strict 2PL | Rigorous 2PL |
| --- | --- | --- | --- |
| Follows Growing/Shrinking? | βœ… Yes | βœ… Yes | βœ… Yes |
| Holds exclusive locks till commit? | ❌ No | βœ… Yes | βœ… Yes |
| Holds shared locks till commit? | ❌ No | ❌ No | βœ… Yes |
| Prevents cascading rollback? | ❌ No | βœ… Yes | βœ… Yes |
| Prevents irrecoverability? | ❌ No | βœ… Yes | βœ… Yes |
| How strict? | Least strict | Moderate | Most strict |

🧱 What They Don’t Solve:


Timestamp Ordering Protocol

What is the Timestamp Ordering Protocol?

Timestamp Ordering Protocol is a concurrency control method that ensures serializability by ordering transactions based on their timestamps.


πŸ•’ Three Important Timestamps

For every data item (say A), we track:

  1. TS(Ti): Timestamp of transaction Ti.
  2. RTS(A): Read Timestamp of A β†’ TS of the most recent transaction that successfully read A.
  3. WTS(A): Write Timestamp of A β†’ TS of the most recent transaction that successfully wrote A.

🧠 Basic Principle

"Older transactions should not be affected by younger ones."

If a younger transaction tries to read/write something modified by an older transaction, it might be aborted to preserve the correct order of operations.


πŸ” Rules of Timestamp Ordering Protocol

Let’s say transaction T wants to perform Read(A) or Write(A).

1. Read Rule

Transaction T with timestamp TS(T) wants to read A.

Why?
Because T is older, but A was written by a younger transaction, which violates the order.

Example:


2. Write Rule

Transaction T wants to write A.

Example:

Because someone already read A (T1), and if T2 writes now, it will create a non-serializable schedule.


βš–οΈ Summary Table

| Operation | Condition | Action |
| --- | --- | --- |
| Read(A) by T | TS(T) β‰₯ WTS(A) | βœ… Allow |
| Read(A) by T | TS(T) < WTS(A) | ❌ Abort T |
| Write(A) by T | TS(T) β‰₯ RTS(A) and TS(T) β‰₯ WTS(A) | βœ… Allow |
| Write(A) by T | TS(T) < RTS(A) or TS(T) < WTS(A) | ❌ Abort T |

πŸ” What Happens on Abort?

When a transaction is aborted, it's restarted with a new timestamp (i.e., treated as a newer transaction). This may cause starvation, so modifications like Wait-Die or Wound-Wait are used to handle starvation (just like in 2PL).


βœ… Advantages


❌ Disadvantages


A few examples on Timestamp Ordering to cement the understanding.

πŸ” Quick Recap: Timestamp Ordering Protocol (TO Protocol)


βœ… Rules (Basic Timestamp Ordering):

When a transaction T wants to Read(X) or Write(X):

  1. Read(X) by Ti:

    • If TS(Ti) < write_TS(X): ❌ Abort Ti (you’re trying to read something already overwritten by a younger transaction β€” violates consistency).

    • Else: βœ… Allow the read, and update read_TS(X) = max(read_TS(X), TS(Ti))

  2. Write(X) by Ti:

    • If TS(Ti) < read_TS(X) or TS(Ti) < write_TS(X):
      ❌ Abort Ti (you're trying to write something based on outdated read or overwritten value).

    • Else: βœ… Allow the write, and update write_TS(X) = TS(Ti)
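A compact Python sketch of exactly these two rules, assuming each data item carries a read_TS and a write_TS and that an abort is modelled as an exception (the restart-with-new-timestamp step is omitted):

```python
# Basic timestamp ordering on a single data item (sketch).
class AbortTransaction(Exception):
    pass

class Item:
    def __init__(self):
        self.read_ts, self.write_ts, self.value = 0, 0, None

def read(item, ts):
    if ts < item.write_ts:                   # written by a younger transaction
        raise AbortTransaction(f"read rejected: TS {ts} < write_TS {item.write_ts}")
    item.read_ts = max(item.read_ts, ts)
    return item.value

def write(item, ts, value):
    if ts < item.read_ts or ts < item.write_ts:
        raise AbortTransaction(f"write rejected for TS {ts}")
    item.write_ts, item.value = ts, value

A = Item()
read(A, 5)             # T1 (TS=5) reads A   -> allowed, read_TS(A) = 5
write(A, 10, "x")      # T2 (TS=10) writes A -> allowed, write_TS(A) = 10
try:
    write(A, 5, "y")   # T1 now tries to write A -> aborted (5 < write_TS = 10)
except AbortTransaction as err:
    print(err)
```

The worked examples that follow can be traced directly through this sketch.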


🧠 Examples (Easy to Follow):

Let’s say:

Let’s say current state for data item A:


Example 1: T1 reads A, then T2 writes A

T1: Read(A)
T2: Write(A)
  1. T1 reads A:

    • 5 β‰₯ write_TS(A) (0) β†’ βœ… Allow.
    • Set read_TS(A) = 5.
  2. T2 writes A:

    • 10 β‰₯ read_TS(A) (5) and 10 β‰₯ write_TS(A) (0) β†’ βœ… Allow.
    • Set write_TS(A) = 10.

βœ… Both succeed.


Example 2: T2 reads A, then T1 writes A

T2: Read(A)
T1: Write(A)

Initial state: read_TS(A) = 0, write_TS(A) = 0

  1. T2 reads A:

    • 10 β‰₯ write_TS(A) (0) β†’ βœ… Allow.
    • Update read_TS(A) = 10.
  2. T1 writes A:

    • 5 < read_TS(A) (10) β†’ ❌ Abort T1

❌ T1 is aborted. Because it’s older and tries to write something read by a younger transaction β†’ violates timestamp order.


Example 3: T2 wants to write A but T1 already wrote A

T1: Write(A)
T2: Write(A)

Initial: write_TS(A) = 0

  1. T1 writes A:

    • 5 β‰₯ 0 β†’ βœ… Allow.
    • Update write_TS(A) = 5.
  2. T2 writes A:

    • 10 β‰₯ read_TS(A) (assume still 0), 10 β‰₯ write_TS(A) (5) β†’ βœ… Allow.
    • Update write_TS(A) = 10.

βœ… Both succeed.


Example 4: Younger T2 tries to read A after older T1 wrote A

T1: Write(A)
T2: Read(A)
  1. T1 writes A β†’ βœ… write_TS(A) = 5

  2. T2 reads A:

    • 10 β‰₯ write_TS(A) (5) β†’ βœ… Allowed.
    • read_TS(A) = 10

βœ… Both succeed.


Example 5: A transaction with an older (smaller) timestamp tries to write A after a younger one already wrote it

T1: Write(A)
T2: Write(A)

Assume write_TS(A) = 5 from T1

Now T2 (TS = 4) tries to write A

Since TS(T2) = 4 is less than write_TS(A) = 5, T2 is aborted: a transaction with an older (smaller) timestamp cannot overwrite data already written by a younger one under basic timestamp ordering.


πŸ” Summary Table:

| Case | Outcome |
| --- | --- |
| Old reads, then young writes | βœ… Allowed (Example 1) |
| Young reads, then old writes | ❌ Abort the older writer (Example 2) |
| Old writes, then young reads | βœ… Allowed (Example 4) |
| Young writes, then old reads | ❌ Abort the older reader (its TS < write_TS) |
| Young writes, then old writes | ❌ Abort the older writer (its TS < write_TS; Example 5) |
| Old writes, then young writes | βœ… Allowed (Example 3) |

Concurrency control in centralized database systems

You can study the following part in either DBMS module 4 or here.

Pasted image 20250523132946.png

https://www.geeksforgeeks.org/concurrency-control-in-distributed-transactions/

Pessimistic Concurrency Control (PCC) -- Basic overview

The pessimistic concurrency control mechanisms proceed on the assumption that most transactions will try to access the same resource simultaneously. They prevent concurrent access to a shared resource by requiring a transaction to acquire a lock on a data item before performing any operation on it.

Algorithms that fall under this protocol:

Advantages of Pessimistic Concurrency Control

Disadvantages of Pessimistic Concurrency Control

Pessimistic Concurrency Control Methods (Basic gist)

1. Isolation Level

Isolation levels define the degree to which data residing in the database must be isolated from other transactions while it is being modified. If transaction T1 is operating on some data and another transaction T2 modifies it at the same time, unwanted inconsistency problems can arise.


2. Two-Phase Locking Protocol

We have already covered this before.


3. Distributed Lock Manager

A distributed lock manager is a critical component of a distributed transaction system; it coordinates lock acquisition and release across the participating transactions. It helps synchronize transactions and their operations so that data integrity is maintained.

Pasted image 20250523125817.png


4. Multiple Granularity Lock

A lock can be acquired at various levels of granularity: table level, row/record level, page level, or any other resource level. A transaction can lock a whole table or a specific row while making changes to it. When locks at different granularities are acquired by various transactions simultaneously, the phenomenon is called multiple granularity locking.


Optimistic Concurrency Control (OCC)

The problem with pessimistic concurrency control is that once a transaction acquires a lock on a resource, no other transaction can access it, which reduces the overall concurrency of the system.

Optimistic concurrency control techniques proceed on the assumption that zero or very few transactions will try to access a given resource simultaneously. A system can be described as FULLY OPTIMISTIC if it uses no locks at all and checks for conflicts only at commit time. Operation proceeds in the following phases:

Some OCC methods

1. Timestamp Ordering (OCC) version

πŸ’‘ Core Idea:

Transactions execute without locking during their read/write phase.
Conflicts are only checked during the commit phase using timestamps.


πŸ”„ Phases in OCC:

πŸ”„ 1. Read Phase

✍️ 2. Validation Phase (Key Phase for OCC)

βœ… 3. Write Phase

🧠 Example of OCC with Timestamps

Let’s say:

T1 writes A β†’ W(T1) =

T2 reads A β†’ R(T2) =

Now, apply validation:


πŸ”„ Key Difference vs Basic Timestamp Ordering

| Feature | Basic TO Protocol | OCC with Timestamps |
| --- | --- | --- |
| When conflicts are checked | On every read/write | Only during validation (before commit) |
| Abort timing | During execution | At commit time |
| Overhead | Constant checks | Buffered work + validation logic |
| Best suited for | High conflict workloads | Low conflict workloads |

πŸ“‹ Summary Table:

| Phase | What Happens |
| --- | --- |
| Read | Transaction reads freely, stores changes locally |
| Validation | Timestamp-based check against committed transactions |
| Write | If valid, writes changes to DB; otherwise, aborts |

βœ… Key Points:


2. Multi-Version Concurrency Control (MVCC)

In MVCC, every data item has multiple versions. When a transaction starts, it reads the version that was valid at the start of the transaction, and when it writes, it creates a new version of that data item. That way, transactions can perform their operations concurrently.

Each successful write results in the creation of a new version of the data item. Timestamps are used to label the versions. When a read(X) operation is issued, an appropriate version of X is selected based on the timestamp of the transaction.

Example: In a banking system, two or more users can transfer money simultaneously without blocking each other.
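A small sketch of the read rule, assuming each item keeps its committed versions as (write_timestamp, value) pairs and a reader picks the newest version written at or before its own timestamp (the VersionedItem class is purely illustrative):

```python
# MVCC sketch: each data item keeps every committed version, labelled by
# the timestamp of the writer; readers never block writers and vice versa.
class VersionedItem:
    def __init__(self, initial):
        self.versions = [(0, initial)]               # list of (write_ts, value)

    def write(self, ts, value):
        self.versions.append((ts, value))            # create a new version
        self.versions.sort(key=lambda v: v[0])       # keep ordered by timestamp

    def read(self, ts):
        # Pick the latest version written at or before the reader's timestamp.
        return [val for wts, val in self.versions if wts <= ts][-1]

balance = VersionedItem(100)
balance.write(ts=10, value=150)       # a transfer commits a new version
print(balance.read(ts=5))    # 100 - an older reader still sees its snapshot
print(balance.read(ts=20))   # 150 - a newer reader sees the latest version
```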


Distributed Concurrency Control Algorithms

1. Distributed Two-Phase Locking (D2PL)

πŸ’‘ What is it?

A distributed extension of the basic 2PL protocol, where:


πŸ› οΈ How it works

Just like regular 2PL, each transaction has:

In D2PL, this protocol is enforced across all sites involved in the transaction.


πŸ“Œ Key Differences from centralized 2PL

| Centralized 2PL | Distributed 2PL |
| --- | --- |
| One lock manager | Multiple lock managers (one per site) |
| Lock requests are local | Lock requests go to remote sites |
| Deadlocks are local | Deadlocks can happen across sites (global deadlocks) |
| Easier to track lock phase | Must coordinate globally to detect growing/shrinking phase end |

🧠 Key Challenges


πŸ” Example

Imagine:

Steps:

  1. T1 requests shared-lock on X at Site A β†’ Site A's lock manager grants it.
  2. T1 requests exclusive-lock on Y at Site B β†’ Site B's lock manager grants it.
  3. T1 finishes growing phase and begins to commit (shrinking phase).
  4. T1 releases both locks β€” Site A and B must both record the lock release.

Why it's tricky:

This is a distributed deadlock β€” not visible at any single site!


πŸ›‘οΈ How to Handle Distributed Deadlocks


βœ… Summary


2. Distributed Timestamp Ordering (DTO)

πŸ’‘ What is it?

A distributed version of timestamp ordering (TO):


πŸ“Œ Key Assumptions


πŸ”„ How It Works (at each site)

For a data item X:

Now:

| Operation | Rule (simplified) |
| --- | --- |
| Tα΅’ reads X | Allowed if TS(Tα΅’) β‰₯ write_TS(X); if TS(Tα΅’) < write_TS(X), abort Tα΅’ |
| Tα΅’ writes X | Allowed if TS(Tα΅’) β‰₯ read_TS(X) and TS(Tα΅’) β‰₯ write_TS(X); otherwise, abort Tα΅’ |

πŸ” Example

Assume:

Scenario:

  1. T1 wants to write X at Site A β†’ OK if read_TS(X) ≀ 5, write_TS(X) ≀ 5

  2. T2 later wants to read X at Site A:

    • If T1's write happened, now write_TS(X) = 5
    • Since TS(T2) = 10 β‰₯ 5, read is allowed

Now suppose:

  1. T2 writes Y at Site B β†’ succeeds

  2. Later, T1 wants to read Y at Site B:

    • But write_TS(Y) = 10 (from T2)
    • TS(T1) = 5 < 10 β†’ T1 is aborted!

So, DTO prevents T1 from reading a value written by a "future" transaction, ensuring serial order matches timestamps.


🚨 Key Issues in Distributed Setup

| Challenge | Description |
| --- | --- |
| Clock Sync | Sites must agree on a global timestamp order (e.g., Lamport or vector clocks) |
| Abort Propagation | A transaction must be aborted at all sites if it is violated at any one site |
| Communication Delay | Timestamps must be sent along with each read/write request |
| Cascading Aborts | If one transaction is aborted, dependent ones may need to be aborted too |

πŸ› οΈ How It Maintains Serializability

DTO guarantees that the conflict-serializable schedule is equivalent to a serial schedule ordered by transaction timestamps.

This avoids the need for locking β€” great for read-heavy systems β€” but sacrifices some throughput due to more aborts.


βœ… Summary


🧾 Distributed Commit Protocols (Commit Algorithms)

❓Why do we need them?


πŸ”„ 1. Two-Phase Commit Protocol (2PC)

πŸ”‘ The most widely used commit algorithm in DDBS.

πŸ”ΉPhase 1 – Prepare Phase:

πŸ”ΉPhase 2 – Commit/Abort Phase:

βœ” Ensures:

❗Limitations:


πŸ”„ 2. Three-Phase Commit Protocol (3PC)

βœ… Designed to fix the blocking issue in 2PC
πŸ’‘ Adds a middle phase to ensure recovery is possible even if coordinator crashes

πŸ”ΉPhase 1 – CanCommit:

πŸ”ΉPhase 2 – PreCommit:

πŸ”ΉPhase 3 – Commit:

βœ” Advantage:


🧠 Real-World Analogy:

Imagine a group project submission:


βœ… Summary Table

| Protocol | Phases | Blocking? | Common Use |
| --- | --- | --- | --- |
| 2PC | 2 | ❌ Yes | βœ… Widely used |
| 3PC | 3 | βœ… No | πŸ§ͺ Rare (more complex) |
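A message-level Python sketch of the 2PC idea from the coordinator's point of view; the Participant class with prepare/commit/abort methods is an assumption made for illustration, not any specific middleware API:

```python
# Two-phase commit (coordinator side), with in-memory participants as stand-ins.
class Participant:
    def __init__(self, name, will_vote_yes=True):
        self.name, self.will_vote_yes = name, will_vote_yes

    def prepare(self):                       # Phase 1: vote request
        print(f"{self.name}: vote {'YES' if self.will_vote_yes else 'NO'}")
        return self.will_vote_yes

    def commit(self):                        # Phase 2: global commit
        print(f"{self.name}: committed")

    def abort(self):                         # Phase 2: global abort
        print(f"{self.name}: aborted")

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]   # Phase 1: collect every vote
    if all(votes):
        for p in participants:                    # unanimous YES -> commit everywhere
            p.commit()
        return "COMMIT"
    for p in participants:                        # any NO (or timeout) -> abort everywhere
        p.abort()
    return "ABORT"

print(two_phase_commit([Participant("SiteA"), Participant("SiteB")]))
print(two_phase_commit([Participant("SiteA"), Participant("SiteB", will_vote_yes=False)]))
```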

πŸ“Œ How it connects to concurrency control:


Deadlock Management

❓ What is a Deadlock?

A deadlock occurs when a set of transactions wait indefinitely for each other to release locks on resources, creating a cycle of dependencies with no resolution.

In distributed systems, deadlocks are more complex due to:


πŸ’₯ Four Necessary Conditions for Deadlock

  1. Mutual Exclusion – Resources are held by one transaction at a time.
  2. Hold and Wait – Transactions hold resources while waiting for others.
  3. No Preemption – Resources cannot be forcibly taken.
  4. Circular Wait – A closed chain of transactions waiting on each other.

These must all be present for a deadlock to occur.


🧠 General Strategies for Deadlock Management

There are three approaches to managing deadlocks:

| Approach | Description |
| --- | --- |
| Prevention | Ensure at least one of the four conditions never occurs. |
| Avoidance | Use extra information to avoid unsafe states. |
| Detection & Recovery | Allow deadlocks to occur, but detect and recover from them. |

Let’s understand each in a distributed context.


πŸ” 1. Deadlock Prevention (in Distributed Systems)

Idea: Prevent deadlocks by removing one of the four conditions.

πŸ”„ Methods:

🧊 Wait-Die vs. Wound-Wait (using timestamps):

| Protocol | If Older T wants a lock held by Younger T | If Younger T wants a lock held by Older T |
| --- | --- | --- |
| Wait-Die | Wait | Abort (die) |
| Wound-Wait | Preempt (force Younger T to abort) | Wait |

These protocols prevent circular wait.
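Both rules boil down to a one-line decision; a tiny sketch, assuming smaller timestamp = older transaction and that the requester wants a lock currently held by the holder:

```python
# Decision when `requester` wants a lock held by `holder` (smaller ts = older).
def wait_die(requester_ts, holder_ts):
    return "WAIT" if requester_ts < holder_ts else "ABORT (requester dies)"

def wound_wait(requester_ts, holder_ts):
    return "WOUND (holder aborts)" if requester_ts < holder_ts else "WAIT"

print(wait_die(5, 10), "|", wound_wait(5, 10))    # older requester:   WAIT | WOUND (holder aborts)
print(wait_die(10, 5), "|", wound_wait(10, 5))    # younger requester: ABORT (requester dies) | WAIT
```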


🚦 2. Deadlock Avoidance

Idea: Dynamically assess if a transaction request could cause deadlock and deny risky requests.

βš™οΈ Implementation:

Challenge: Building and maintaining global WFGs has communication overhead.


πŸ” 3. Deadlock Detection & Recovery

Idea: Let deadlocks occur, then detect and recover.

πŸ“Š Detection:

πŸ› οΈ Recovery:

Once deadlock is detected:


βš–οΈ Comparison Table

| Strategy | Pros | Cons |
| --- | --- | --- |
| Prevention | Simple logic, no deadlocks | May abort unnecessarily |
| Avoidance | Avoids deadlocks dynamically | Needs global info, slow |
| Detection & Recovery | No need to predict; deadlocks handled when they occur | Overhead in detection + rollback |

πŸ”„ Centralized vs Distributed Deadlock Handling

| Type | Description |
| --- | --- |
| Centralized | One site is responsible for collecting WFGs and detecting cycles. Simple, but a single point of failure. |
| Distributed | All sites participate in detecting deadlocks; better fault tolerance but more communication. |

Detecting deadlocks using Wait-For graphs -- examples

πŸ” Wait-For Graph (WFG) – Refresher

πŸ” A cycle in the WFG = Deadlock!


Quick recap from graph theory, what is a Cycle in a graph and how to spot them?

A cycle in a graph is a path that starts and ends at the same vertex, and contains at least one other vertex. In simpler terms, it's a closed loop within the graph.

Pasted image 20250519224855.png


🧠 Example 1: Simple Deadlock

Let's say we have 3 transactions:
T1, T2, T3

βœ… Scenario:

Let's create a Wait-For graph to detect if there are any deadlocks

flowchart TD;
	T1-->T2
	T2-->T3
	T3-->T1

Cycle detected: T1 β†’ T2 β†’ T3 β†’ T1

βœ… This is a deadlock.
πŸ› οΈ Solution: Abort one of the transactions to break the cycle.


🧠 Example 2: No Deadlock

T1 β†’ T2
T2 β†’ T3
(no edge from T3 to any transaction)

flowchart TD;
	T1-->T2
	T2-->T3

No transaction is waiting on a dependent cycle β†’ βœ… No deadlock.


🧠 Example 3: Deadlock in Distributed Systems

Let’s consider 2 sites (S1 and S2) with local WFGs.

Site S1:

Site S2:


Independent WFGs

For S1:

flowchart TD;
	T1-->T2
	T2-->T3

For S2:

flowchart TD;
	T3-->T4
	T4-->T1

🌐 Merging them for Global WFG:

flowchart TD;
	T1-->T2
	T2-->T3
	T3-->T4
	T4-->T1

πŸ” Cycle:
βœ… Deadlock exists across sites β€” this is a distributed deadlock.
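A short Python sketch of cycle detection on the merged global WFG using a depth-first search; the edge list below is the one from Example 3:

```python
# Detect a cycle in a wait-for graph: {transaction: [transactions it waits for]}.
def find_cycle(wfg):
    visiting, visited = set(), set()

    def dfs(node, path):
        visiting.add(node)
        path.append(node)
        for nxt in wfg.get(node, []):
            if nxt in visiting:                       # back edge -> cycle found
                return path[path.index(nxt):] + [nxt]
            if nxt not in visited:
                cycle = dfs(nxt, path)
                if cycle:
                    return cycle
        visiting.discard(node)
        visited.add(node)
        path.pop()
        return None

    for start in list(wfg):
        if start not in visited:
            cycle = dfs(start, [])
            if cycle:
                return cycle
    return None

global_wfg = {"T1": ["T2"], "T2": ["T3"], "T3": ["T4"], "T4": ["T1"]}
print(find_cycle(global_wfg))   # ['T1', 'T2', 'T3', 'T4', 'T1'] -> distributed deadlock
```

Each site could run the same check on its local WFG, but only the merged global graph reveals this particular cycle.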