Unit 5 Distributed Systems


1.A) Explain Suzuki-Kasami’s Broadcast Algorithm in detail.

Suzuki-Kasami’s Broadcast Algorithm:

Suzuki-Kasami’s algorithm is a token-based distributed mutual exclusion algorithm
designed for systems where processes need to access a shared resource in a mutually
exclusive manner. It operates efficiently in environments with low contention for the
resource by reducing the number of messages exchanged. The key idea is that only
the process holding the token has the right to enter the critical section (CS).

Data Structures and Notations:

1. Array of Integers: RN[1…N]

o Each site Si maintains its own array RNi[1…N].

o RNi[j]: The largest sequence number received by site Si from site Sj via
a REQUEST message. This reflects the most recent request made by Sj.

2. Array of Integers: LN[1…N]

o This array is maintained by the token.

o LN[j]: The sequence number of the most recently executed request for
the critical section from site Sj . This is updated when a site releases the
critical section.

3. Queue Q

o The token maintains a queue Q.

o The queue holds the IDs of sites that are waiting for the token. These
are processes that have outstanding requests but are yet to receive the
token.

Algorithm Steps:

1. To Enter the Critical Section (Requesting the Token)

If a site Si wants to enter the critical section but does not hold the token, it follows
these steps:

1. Increment Request Number:

o Site Si increments its own sequence number in RNi[i]:

RNi[i] = RNi[i] + 1

o This indicates that Si is making a new request for the critical section.

2. Broadcast Request Message:


o Site Si sends a REQUEST(i, sn) message to all other sites, where:
sn = RNi[i]

o The request message includes the updated sequence number of the
request for the critical section.

3. When a Site Sj Receives a Request Message:

o When site Sj receives a REQUEST(i, sn) message from site Si, it sets:

RNj[i] = max(RNj[i], sn)

o This ensures that Sj tracks the most recent request from Si.

o After updating RNj[i], if Sj holds the idle token (it holds the token and is not
executing the critical section) and RNj[i] = LN[i] + 1 (i.e., there is an outstanding
request from Si that hasn't been satisfied), then Sj sends the token to Si.

2. To Execute the Critical Section

Once a site Si receives the token, it can enter the critical section immediately and
execute its critical section code.

3. To Release the Critical Section

After finishing execution in the critical section, the site Si releases the critical section
by performing the following:

1. Update LN[i]:

o Site Si sets: LN[i] = RNi[i]

o This update indicates that the request associated with RNi[i] has now
been executed.

2. Check for Outstanding Requests:

o For every site Sj whose ID is not in the token queue Q, Si appends
Sj's ID to Q if: RNi[j] = LN[j] + 1

o This condition indicates that Sj has made an outstanding request for the
critical section.

3. Pass the Token:

o If the queue Q is non-empty, Si pops a site ID from the queue and
sends the token to the site indicated by the popped ID.

o If the queue Q is empty, Si keeps the token for future use.
Message Complexity

The message complexity of Suzuki-Kasami’s algorithm depends on whether the
token is already held by the requesting site:

1. Zero messages:

o If the site already holds the idle token when it requests entry into the
critical section, no messages need to be exchanged. The site can directly
enter the CS.

2. Maximum N messages:

o If the requesting site does not have the token, a total of N messages
are exchanged:

▪ N-1 request messages: To broadcast the request to all other sites.

▪ 1 token message: To transfer the token to the requesting site.

Thus, in the worst-case scenario, it takes N messages per critical section execution.
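
The request, execute and release steps above can be sketched as a small single-threaded simulation. This is only an illustration under simplifying assumptions: messages are delivered instantly by direct method calls, sites are numbered from 0, and the class and method names (Token, Site, request, on_request, release) are hypothetical and not part of the original algorithm description. The value returned by request() is the number of messages that the call caused.

```python
from collections import deque

class Token:
    def __init__(self, n):
        self.LN = [0] * n   # LN[j]: sequence number of Sj's most recently executed request
        self.Q = deque()    # IDs of sites waiting for the token

class Site:
    def __init__(self, sid, n):
        self.sid = sid
        self.RN = [0] * n   # RN[j]: largest sequence number received from Sj
        self.token = None   # Token object while this site holds the token
        self.in_cs = False

    def request(self, sites):
        """Requesting step: bump own sequence number and broadcast REQUEST(i, sn)."""
        if self.token is not None:      # idle token already here: no messages needed
            self.in_cs = True
            return 0
        self.RN[self.sid] += 1
        sn = self.RN[self.sid]
        sent = 0
        for other in sites:
            if other is not self:
                sent += 1                               # one REQUEST message
                sent += other.on_request(self.sid, sn, sites)
        return sent

    def on_request(self, i, sn, sites):
        """Receiving step: record the request, hand over an idle token if it is outstanding."""
        self.RN[i] = max(self.RN[i], sn)
        idle = self.token is not None and not self.in_cs
        if idle and self.RN[i] == self.token.LN[i] + 1:
            sites[i].token, self.token = self.token, None
            sites[i].in_cs = True
            return 1                                    # one token-transfer message
        return 0

    def release(self, sites):
        """Release step: update LN, enqueue outstanding requesters, pass the token on."""
        self.in_cs = False
        t = self.token
        t.LN[self.sid] = self.RN[self.sid]
        for j in range(len(sites)):
            if j not in t.Q and self.RN[j] == t.LN[j] + 1:
                t.Q.append(j)
        if t.Q:
            nxt = t.Q.popleft()
            self.token = None
            sites[nxt].token = t
            sites[nxt].in_cs = True
```

With four sites and the token initially idle at site 0, a request from site 2 costs 3 REQUEST messages plus 1 token transfer, which matches the worst case of N messages stated above.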

1.B) Explain the models of DEADLOCKS

Models of Deadlocks

Deadlocks in distributed systems can be explained through different models, each
representing varying conditions and requirements for resource allocation and
deadlock detection. Below are the main points of each model:

1. Single Resource Model

• In this model, a process can have at most one outstanding request for a
single unit of a resource.

• The maximum out-degree of a node in the Wait-for Graph (WFG) is 1.

• The presence of a cycle in the WFG indicates a deadlock.

2. AND Model
• A process becomes active only after receiving a message from each process in
its dependent set.

• A process can request multiple resources simultaneously, and the request is
only fulfilled when all requested resources are granted.

• These requested resources might exist at different locations.

• The out-degree of a node in the WFG for this model can be more than 1.

• A cycle in the WFG indicates a deadlock, but a process may still be deadlocked
even if it's not part of a cycle.

3. OR Model

• A process can make requests for multiple resources, and the request is fulfilled
if any one of the resources is granted.

• In this model, the presence of a cycle in the WFG does not necessarily imply a
deadlock.

• A knot in the WFG indicates a deadlock.

• A process is deadlocked if:

1. It is blocked.

2. Its dependent set is a subset of blocked processes.

3. No grant messages are in transit between the blocked processes.

4. p-out-of-q Model

• This is a variation of the AND-OR model, where a request can be fulfilled by
obtaining any p resources from a pool of q resources.

• It has the same expressive power as the AND-OR model but is more compact.

5. Unrestricted Model

• No specific assumptions are made about the structure of resource requests.

• The only assumption is that the deadlock condition is stable.

• This model is the most general and allows a separation of concerns between
the properties of the problem (e.g., stability and deadlock) and the actual
computations of the distributed system.
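
The difference between a cycle (sufficient for deadlock in the single resource and AND models) and a knot (the condition used in the OR model) can be checked on a small graph. The sketch below is illustrative only: the function names and the example graphs are assumptions, and the WFG is given as a dictionary mapping each process to the processes it waits for.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start by following one or more edges."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen

def on_cycle(graph, node):
    """A node is on a cycle if it can reach itself."""
    return node in reachable(graph, node)

def in_knot(graph, node):
    """A node is in a knot if every node it can reach can also reach it back."""
    reach = reachable(graph, node)
    return bool(reach) and all(node in reachable(graph, other) for other in reach)

# P1 -> P2 -> P3 -> P1 is a cycle, but P2 also waits for P4, which is not blocked.
wfg_cycle_only = {"P1": ["P2"], "P2": ["P3", "P4"], "P3": ["P1"], "P4": []}

# Same graph, except that P4 is now blocked on P2, so there is no way out.
wfg_knot = {"P1": ["P2"], "P2": ["P3", "P4"], "P3": ["P1"], "P4": ["P2"]}
```

In the first graph P1 is on a cycle yet not in a knot, so under the OR model it is not deadlocked: P4 may still grant P2's request. In the second graph P1 is in a knot.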

2.A) Explain system model for Deadlock detection in distributed systems.

A deadlock occurs when a set of processes is stalled because each process is holding
a resource and waiting for a resource held by another process in the set. For example,
Process 1 holds Resource 1 and waits for Resource 2, while Process 2 holds Resource
2 and waits for Resource 1.

The system model for deadlock detection in distributed systems provides a framework
for understanding how processes, resources, and communication interact in order to
detect and resolve deadlocks. In distributed systems, resources are distributed across
multiple nodes, and processes may request resources from remote nodes, making
deadlock detection more complex than in centralized systems. The system model is
defined by the following key components:

Processes and Resources:

• The system consists of multiple processes running on different nodes of a
distributed system. Each process can request, hold, or release resources.

• Resources are distributed across nodes, and these resources can be shared by
processes running on different nodes. Resources may include hardware devices,
files, or memory, among others.

• A process can request one or more resources, and it can proceed only when all
the requested resources are granted. If any of the requested resources are
unavailable, the process waits.

Resource Allocation and Requests:

• Each node has its own local resource manager that handles resource allocation
for processes on that node. The local resource manager tracks which resources
are held by which processes and manages resource requests and releases.

• When a process requests a resource that is not available (i.e., the resource is
already allocated to another process), the requesting process is placed in a
waiting state until the resource becomes available.

• Remote resource requests are managed by sending messages between nodes
when a process on one node requests a resource held by a process on another
node.

Resource Allocation Graph (RAG):

• Each node maintains a local Resource Allocation Graph (RAG) to track the
state of resource allocations and waiting processes.

o Nodes in the RAG represent processes or resources.

o Edges represent relationships between processes and resources:

▪ A directed edge from a process to a resource indicates that the
process is requesting the resource.

▪ A directed edge from a resource to a process indicates that the
process is holding the resource.

Wait-For Graph (WFG):

• The local RAG can be simplified into a Wait-For Graph (WFG), where:

o Nodes represent processes.

o Edges represent waiting relationships: an edge from Pi to Pj means that Pi is
waiting for a resource held by Pj.

• The WFG simplifies deadlock detection since deadlocks can be identified by
cycles in this graph.
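
The simplification from a RAG to a WFG amounts to removing the resource nodes and connecting each waiting process directly to the holder of the resource it wants. A minimal sketch follows, assuming single-unit resources with at most one holder each; the function name and the dictionary layout are hypothetical.

```python
def rag_to_wfg(requests, holders):
    """Collapse a resource allocation graph into a wait-for graph.

    requests: process -> list of resources it is requesting (process -> resource edges)
    holders:  resource -> process holding it (resource -> process edges)
    Returns:  process -> set of processes it is waiting for
    """
    wfg = {}
    for proc, resources in requests.items():
        for res in resources:
            holder = holders.get(res)
            if holder is not None and holder != proc:
                wfg.setdefault(proc, set()).add(holder)
    return wfg

# P1 holds R1 and requests R2; P2 holds R2 and requests R1.
requests = {"P1": ["R2"], "P2": ["R1"]}
holders = {"R1": "P1", "R2": "P2"}
wfg = rag_to_wfg(requests, holders)
```

Here the resulting graph has the two edges P1 to P2 and P2 to P1, which is the cycle that a detector would look for.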

Communication and Synchronization:

• Message passing is used to handle resource requests and releases between
nodes. When a process on one node needs a resource from another node, the
local resource manager sends a request message to the node holding the
resource.

• Inter-node communication is essential for constructing a global view of the
system. Each node periodically communicates information about its local WFG
to a central or distributed deadlock detector.

Distributed Deadlock Detection:

• Local Deadlock Detection: Each node has a local deadlock detector that
checks its local WFG for cycles, which indicate potential deadlocks. This
detection mechanism runs periodically to ensure timely detection.

• Global Deadlock Detection: Since deadlocks may span multiple nodes, the
system must coordinate to detect global deadlocks. This is done by collecting
information from each node’s local WFG and combining it into a global Wait-
For Graph (global WFG).

o The global deadlock detector (which could be a specific node or a
distributed algorithm) analyzes the global WFG for cycles that span
multiple nodes, indicating a deadlock involving processes on different
nodes.

Deadlock Resolution:

• Once a deadlock is detected, the system must resolve it. There are several
approaches for deadlock resolution in distributed systems:

o Process Termination: One or more processes involved in the deadlock
are terminated to release their resources and break the deadlock cycle.

o Resource Preemption: Resources are forcibly taken away from some
processes and allocated to others to allow the system to make progress.

o Rollback: Some processes are rolled back to a safe state before they
made their request, freeing up resources for other processes.
Continuous Monitoring and Verification:

• The system continuously monitors for deadlocks by regularly analyzing both
the local and global WFGs. Deadlock detection can be triggered either
periodically or in response to specific events, such as a resource request or
release.

• False Positives: Due to the nature of distributed systems and communication
delays, false positives may occur where a cycle is detected but resolves naturally
over time. Therefore, some systems perform a verification step before
initiating the resolution process to confirm the deadlock is still present.

2.B) Write short notes on Deadlock Handling Strategies and Issues in Deadlock
Detection.

Deadlock Handling Strategies

Handling deadlocks in distributed systems involves several strategies to detect,
prevent, or resolve deadlocks. The following are key strategies used for deadlock
handling:

1.Deadlock Prevention:

This approach ensures that the system is structured in such a way that deadlocks
cannot occur by avoiding at least one of the four necessary conditions (mutual
exclusion, hold and wait, no preemption, circular wait).

Techniques include:

• Avoid Hold-and-Wait: Processes must request all required resources at once
or release held resources if they need more.
• Preemption: Resources held by a process may be forcibly taken away and
allocated to other processes.
• Avoid Circular Wait: Impose a total ordering on resources and require
processes to request resources in increasing order of that ordering.
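
The circular-wait technique can be illustrated on a single machine with ordinary locks. In this sketch the resource names, the RANK table and the helper functions are all hypothetical. Two threads name the same two resources in opposite orders, but both end up acquiring them in rank order, so neither can hold one lock while waiting for a lower-ranked one.

```python
import threading

# Hypothetical resources, each with a fixed position in a global total order.
locks = {"disk": threading.Lock(), "net": threading.Lock(), "printer": threading.Lock()}
RANK = {"disk": 0, "net": 1, "printer": 2}

def acquire_in_order(names):
    """Acquire the named locks in increasing rank, whatever order the caller listed them in."""
    ordered = sorted(names, key=RANK.__getitem__)
    for name in ordered:
        locks[name].acquire()
    return ordered

def release_all(ordered):
    """Release the locks in the reverse of the acquisition order."""
    for name in reversed(ordered):
        locks[name].release()

def worker(names, log):
    held = acquire_in_order(names)
    log.append(tuple(held))     # stands in for work done while holding the resources
    release_all(held)

log = []
t1 = threading.Thread(target=worker, args=(["printer", "disk"], log))
t2 = threading.Thread(target=worker, args=(["disk", "printer"], log))
t1.start(); t2.start()
t1.join(); t2.join()
```

Without the sorting step, the two threads could each take their first lock and then wait forever for the other's.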

2.Deadlock Avoidance:

This strategy involves making runtime decisions about resource allocation to avoid
deadlock. A well-known algorithm is Banker’s Algorithm, which assesses if resource
allocation leads to a safe state.

The system examines each request to ensure it will not result in a deadlock before
granting resources.
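
The safe-state test at the heart of Banker’s Algorithm can be sketched as follows. The function name and the matrix layout (one row per process, one column per resource type) are assumptions made for the example, and the numbers are a commonly used textbook instance rather than data from these notes.

```python
def is_safe(available, allocation, need):
    """Return a safe completion order of process indices, or None if the state is unsafe."""
    work = list(available)
    finished = [False] * len(allocation)
    order = []
    progressed = True
    while progressed:
        progressed = False
        for p in range(len(allocation)):
            if not finished[p] and all(n <= w for n, w in zip(need[p], work)):
                # p can run to completion, after which it returns everything it holds
                work = [w + a for w, a in zip(work, allocation[p])]
                finished[p] = True
                order.append(p)
                progressed = True
    return order if all(finished) else None

available = [3, 3, 2]
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
need = [[7, 4, 3], [1, 2, 2], [6, 0, 0], [0, 1, 1], [4, 3, 1]]
safe_order = is_safe(available, allocation, need)   # [1, 3, 4, 0, 2]
```

To decide on a new request, an avoidance scheme would tentatively apply it to available, allocation and need, and grant it only if is_safe still returns an order.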
3.Deadlock Detection and Recovery:

The system allows deadlocks to occur but periodically checks for them using
algorithms that detect cycles in Wait-For Graphs (WFGs).

Once a deadlock is detected, recovery mechanisms are triggered:

• Process Termination: One or more processes involved in the deadlock are killed.

• Resource Preemption: Resources held by some processes are forcibly taken away and
allocated to others to resolve the deadlock.

4.Deadlock Ignorance:

Some systems, including most general-purpose operating systems, simply ignore
deadlocks: they are assumed to be rare and are left unchecked. This is known
as the ostrich algorithm, where the system ignores deadlocks and may require
manual intervention if they occur.

Issues in Deadlock Detection

Deadlock detection in distributed systems introduces several challenges due to the
decentralized and asynchronous nature of these systems. Key issues include:

1.False Positives:

Due to communication delays and asynchrony, deadlock detectors may incorrectly
detect a cycle that is not a true deadlock. This happens because processes may resolve
their dependencies before the global system state is constructed, leading to false
positives.

2.Communication Overhead:

Collecting local wait-for graphs (WFGs) from different nodes and constructing a global
WFG incurs communication overhead. Frequent communication between nodes
increases network traffic and may degrade system performance.

3.Synchronization Issues:

Distributed systems are inherently asynchronous, and maintaining a consistent global
state across multiple nodes is difficult. Each node operates independently, and
synchronization delays can lead to inaccurate global states, where transient states are
mistakenly interpreted as deadlocks.
4.Scalability:

As the number of processes and nodes increases, the size and complexity of the global
WFG also increase. This can make deadlock detection more computationally
expensive, reducing scalability in large systems.

5.Recovery Complexity:

Once a deadlock is detected, determining the most efficient recovery strategy (e.g.,
choosing which process to terminate or which resources to preempt) can be
complicated. Choosing an optimal recovery strategy without affecting other
processes significantly is challenging.

6.Detection Latency:

Deadlocks may not be detected immediately due to periodic checks or delayed
communication, leading to latency in detecting and resolving deadlocks. During this
time, affected processes remain blocked, degrading system performance.

These issues highlight the complexities involved in managing deadlocks in distributed
systems, requiring careful consideration of detection and resolution mechanisms to
minimize system disruption.

3.A) Examine Lamport’s snapshot recording algorithm for determining the global
states of distributed systems?

A distributed system has a number of processes running on a number of different
physical servers. These processes communicate with each other over communication
channels by message passing. The processes have neither shared memory nor a
common physical clock, which makes determining the instantaneous
global state difficult.

A process could record its own local state at a given time, but the messages that are in
transit (on their way to be delivered) would not be included in the recorded state, and
hence the recorded state would no longer match the actual state of the system once an
in-transit message is delivered.

The main idea behind the algorithm (commonly known as the Chandy-Lamport
algorithm) is that if we know that all messages that have been sent by one process
have been received by another, then we can record the global state of the system.

Any process in the distributed system can initiate this global state recording algorithm
using a special message called MARKER. The marker traverses the distributed system
across all communication channels, which are assumed to be FIFO, and causes each
process to record its own state. In the end, the state of the entire system (global state)
is recorded. This algorithm does not interfere with the normal execution of processes.

Algorithm:

• Marker sending rule for a process P:

o Process P records its own local state.

o For each outgoing channel C from process P, P sends a marker
along C before sending any other messages along C. (Note: Process Q
will receive this marker on its incoming channel C1.)

• Marker receiving rule for a process Q (on receiving a marker along incoming channel C1):

o If process Q has not yet recorded its own local state, then:

▪ Record the state of incoming channel C1 as an empty sequence
(null).

▪ After recording the state of incoming channel C1,
process Q follows the marker sending rule.

o If process Q has already recorded its state:

▪ Record the state of incoming channel C1 as the sequence of
messages received along channel C1 after the state of Q was
recorded and before Q received the marker along C1 from
process P.
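
The two marker rules can be traced on a two-process example. The following is a simplified, single-threaded sketch rather than a full implementation: channels are FIFO queues, the local state of a process is just an integer balance, and the names Process, connect and record_and_send_markers are assumptions for the illustration.

```python
from collections import deque

MARKER = "MARKER"

class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.out = {}           # peer name -> outgoing FIFO channel
        self.inc = {}           # peer name -> incoming FIFO channel
        self.recorded = None    # recorded local state, once taken
        self.chan_state = {}    # peer name -> messages recorded for that incoming channel
        self.open = set()       # incoming channels that are still being recorded

    def send(self, peer, amount):
        self.balance -= amount
        self.out[peer].append(amount)

    def record_and_send_markers(self):
        """Marker sending rule: record local state, then send a marker on every outgoing channel."""
        self.recorded = self.balance
        self.open = set(self.inc)
        self.chan_state = {peer: [] for peer in self.inc}
        for channel in self.out.values():
            channel.append(MARKER)

    def receive(self, peer):
        """Deliver the next message from peer, applying the marker receiving rule."""
        msg = self.inc[peer].popleft()
        if msg == MARKER:
            if self.recorded is None:
                self.record_and_send_markers()  # the channel from peer stays recorded as empty
            self.open.discard(peer)             # stop recording this channel
        else:
            self.balance += msg
            if self.recorded is not None and peer in self.open:
                self.chan_state[peer].append(msg)  # it was in transit when the snapshot began

def connect(p, q):
    """Create one FIFO channel in each direction between p and q."""
    for a, b in ((p, q), (q, p)):
        channel = deque()
        a.out[b.name] = channel
        b.inc[a.name] = channel

p, q = Process("P", 100), Process("Q", 50)
connect(p, q)
q.send("P", 10)                # in transit when the snapshot starts
p.record_and_send_markers()    # P initiates the snapshot
p.send("Q", 20)                # sent after the marker, so outside the snapshot
q.receive("P")                 # Q receives the marker and records its state
q.receive("P")                 # Q receives the 20
p.receive("Q")                 # P receives the 10 and records it as channel state
p.receive("Q")                 # P receives Q's marker
snapshot_total = p.recorded + q.recorded + sum(p.chan_state["Q"]) + sum(q.chan_state["P"])
```

The recorded states (100 at P, 40 at Q, and 10 in transit from Q to P) add up to the initial total of 150, even though no single instant of the run had exactly these balances.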

3.B) Explain Maekawa’s voting algorithm in detail.

Maekawa’s Algorithm is a quorum-based approach to ensure mutual exclusion in
distributed systems. In permission-based algorithms like Lamport’s
Algorithm and the Ricart-Agrawala Algorithm, a site requests permission from every
other site. In a quorum-based approach, a site does not request permission from every
other site but only from a subset of sites, which is called a quorum. In this algorithm:

• Three types of messages (REQUEST, REPLY and RELEASE) are used.

• A site sends a REQUEST message to all other sites in its request set or quorum to
get their permission to enter the critical section.

• A site sends a REPLY message to a requesting site to give its permission to enter
the critical section.

• A site sends a RELEASE message to all other sites in its request set or quorum
upon exiting the critical section.

The construction of request set or Quorum: The request sets (quorums) in Maekawa’s
algorithm must satisfy the following properties:

1. ∀i ∀j : i ≠ j, 1 ≤ i, j ≤ N :: Ri ∩ Rj ≠ ∅
i.e., there is at least one common site between the request sets of
any two sites.

2. ∀i : 1 ≤ i ≤ N :: Si ∈ Ri
i.e., every site belongs to its own request set.

3. ∀i : 1 ≤ i ≤ N :: |Ri| = K
i.e., all request sets have the same size K.

4. Any site Sj is contained in exactly K of the request sets Ri.

Maekawa showed that these conditions are met with N = K(K - 1) + 1, which gives |Ri| = K ≈ √N.
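
The properties above can be checked mechanically for a concrete assignment. The example below uses N = 7 and K = 3, with request sets taken from the lines of the 7-point finite projective plane; this particular assignment and the function name check_quorums are illustrative choices, not taken from the notes.

```python
from itertools import combinations

# Request sets for N = 7 sites with K = 3 (site i has request set R[i]).
R = {
    1: {1, 2, 3}, 2: {2, 4, 6}, 3: {3, 5, 6}, 4: {1, 4, 5},
    5: {2, 5, 7}, 6: {1, 6, 7}, 7: {3, 4, 7},
}

def check_quorums(R):
    """Evaluate each quorum property and return the results by name."""
    sites = sorted(R)
    K = len(R[sites[0]])
    return {
        "pairwise_intersection": all(R[i] & R[j] for i, j in combinations(sites, 2)),
        "self_membership": all(i in R[i] for i in sites),
        "equal_size": all(len(R[i]) == K for i in sites),
        "equal_load": all(sum(s in R[i] for i in sites) == K for s in sites),
        "n_matches_k": len(sites) == K * (K - 1) + 1,
    }

result = check_quorums(R)
```

Every entry of result is True for this assignment. The intersection property is what enforces mutual exclusion: any two requesters share at least one site, and that site gives its permission to only one of them at a time.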

Algorithm:

• To enter Critical section:

o When a site Si wants to enter the critical section, it sends a request
message REQUEST(i) to all other sites in the request set Ri.

o When a site Sj receives the request message REQUEST(i) from site Si, it
returns a REPLY message to site Si if it has not sent a REPLY message to
any site since it received the last RELEASE message. Otherwise,
it queues up the request.

• To execute the critical section:

o A site Si can enter the critical section once it has received
a REPLY message from all the sites in its request set Ri.

• To release the critical section:

o When a site Si exits the critical section, it sends a RELEASE(i) message to
all other sites in its request set Ri.

o When a site Sj receives the RELEASE(i) message from site Si, it
sends a REPLY message to the next site waiting in the queue and deletes
that entry from the queue.

o If the queue is empty, site Sj updates its status to show that it has not
sent any REPLY message since the receipt of the last RELEASE message.
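
The behaviour of a single site Sj when it receives REQUEST and RELEASE messages can be sketched as a small state machine. The class name Voter and its methods are hypothetical, queued requests are served in arrival order as in the description above, and the sketch does not include any deadlock handling.

```python
from collections import deque

class Voter:
    """Voting state kept by one site Sj for the requests it receives."""

    def __init__(self):
        self.voted_for = None    # site that currently holds this site's REPLY
        self.waiting = deque()   # queued REQUESTs, in arrival order

    def on_request(self, i):
        """Return the site to send REPLY to, or None if the request was queued."""
        if self.voted_for is None:
            self.voted_for = i
            return i
        self.waiting.append(i)
        return None

    def on_release(self, i):
        """Return the next site to send REPLY to, or None if nobody is waiting."""
        if self.voted_for != i:
            raise ValueError("RELEASE from a site that does not hold this vote")
        self.voted_for = self.waiting.popleft() if self.waiting else None
        return self.voted_for

voter = Voter()
first = voter.on_request(1)      # REPLY goes to site 1
second = voter.on_request(4)     # site 4 is queued
after = voter.on_release(1)      # REPLY now goes to site 4
```

A requesting site would enter the critical section only after every Voter in its request set has returned its ID from on_request or on_release.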

Detailed Example:

Let’s consider a distributed system with 9 processes P1, P2, …, P9. Quorums (request
sets) for three of the processes could be structured like this:

• RS1={P1,P2,P3}

• RS2={P2,P4,P5}

• RS3={P3,P6,P7}

These sets are only a partial illustration: RS2 and RS3 as listed share no process, so a
complete assignment would have to adjust them until every pair of request sets intersects.

1. Request Phase:

Suppose P1 wants to enter its critical section. It sends a request to
P1, P2, P3 (its quorum). These processes check their queues
and grant their vote to P1 if they are not voting for another process.

2. Enter Critical Section:

Once P1 receives votes from all members of its request set, it enters its
critical section.

3. Release Phase:

After completing the critical section, P1 sends release messages to
P1, P2, P3. They can now vote for other processes waiting
in their queues.

Message Complexity: Maekawa’s Algorithm requires 3√N messages per
critical section execution, as the size of a request set is √N. These 3√N messages
consist of:

• √N request messages

• √N reply messages

• √N release messages

4.A) Explain Ricart-Agrawala Algorithm in detail.

• The Ricart–Agrawala algorithm is an algorithm for mutual exclusion in a distributed
system, proposed by Glenn Ricart and Ashok Agrawala.

• This algorithm is an extension and optimization of Lamport’s Distributed Mutual
Exclusion Algorithm.

• It follows a permission-based approach to ensure mutual exclusion.

• Two types of messages (REQUEST and REPLY) are used, and communication channels
are assumed to follow FIFO order.

• A site sends a REQUEST message to all other sites to get their permission to enter
the critical section.

• A site sends a REPLY message to another site to give its permission to enter the critical
section.

• A timestamp is given to each critical section request using Lamport’s logical clock.

• Timestamp is used to determine priority of critical section requests.

• A smaller timestamp gets higher priority than a larger timestamp.

• Critical section requests are always executed in the order of their timestamps.
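
The timestamp rule decides whether a site answers an incoming REQUEST at once or defers its REPLY. The sketch below shows only that decision. It assumes that ties between equal timestamps are broken by site ID, and that a site defers its REPLY while it is in the critical section or has a higher-priority request of its own; the class and method names are hypothetical.

```python
class RASite:
    def __init__(self, sid):
        self.sid = sid
        self.clock = 0            # Lamport logical clock
        self.requesting = False
        self.in_cs = False
        self.my_request = None    # (timestamp, site id) of this site's pending request
        self.deferred = []        # sites whose REPLY has been postponed

    def make_request(self):
        """Timestamp a new request; it would be sent as REQUEST to the other N - 1 sites."""
        self.clock += 1
        self.requesting = True
        self.my_request = (self.clock, self.sid)
        return self.my_request

    def on_request(self, ts, sender):
        """Return True to send REPLY immediately, False if the REPLY is deferred."""
        self.clock = max(self.clock, ts) + 1
        mine_first = self.requesting and self.my_request < (ts, sender)
        if self.in_cs or mine_first:
            self.deferred.append(sender)
            return False
        return True

    def release(self):
        """Leave the critical section; return the sites that now receive their deferred REPLY."""
        self.in_cs = False
        self.requesting = False
        pending, self.deferred = self.deferred, []
        return pending

a, b = RASite(1), RASite(2)
req_a = a.make_request()           # (1, 1)
req_b = b.make_request()           # (1, 2)
a_replies = a.on_request(*req_b)   # False: a's request has priority, so the REPLY is deferred
b_replies = b.on_request(*req_a)   # True: b yields to the smaller (timestamp, id) pair
```

Site a would enter the critical section once it holds a REPLY from every other site, and its release() then returns [2], the site whose REPLY was deferred.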

Message Complexity:

The Ricart–Agrawala algorithm requires 2(N – 1) messages per critical section
execution. These 2(N – 1) messages involve:

• (N – 1) request messages

• (N – 1) reply messages

Drawbacks of Ricart–Agrawala algorithm:

• Unreliable approach: the failure of any one node in the system can halt the progress
of the system, because a requesting site waits for a REPLY from every other site and
would otherwise starve forever. The problem of node failure can be addressed by
detecting the failure after some timeout.

Performance: Synchronization delay is equal to the maximum message transmission time.
It requires 2(N – 1) messages per critical section execution.
4.B) Discuss KNAPP’S classification of distributed deadlock detection algorithms in detail.

Knapp's classification of distributed deadlock detection algorithms categorizes
distributed deadlock detection into four primary types, each using a distinct approach
to detect deadlocks in distributed systems.

1. Path-Pushing Algorithms: The core idea of path-pushing algorithms is to
maintain and propagate an explicit global Wait-For Graph (WFG) throughout
the system. In these algorithms, each site in the distributed system maintains its
local WFG. During deadlock detection, a site sends its local WFG to neighboring
sites. These neighboring sites update their graphs by incorporating the received
information and pass it on to other sites. This process repeats until one or more
sites can collectively determine whether a deadlock exists. The key feature of
path-pushing algorithms is the sending of paths of the global WFG across the
system. Prominent examples of path-pushing algorithms include the Menasce-
Muntz, Gligor and Shattuck, Ho and Ramamoorthy, and Obermarck
algorithms.

2. Edge-Chasing Algorithms: The edge-chasing approach is based on the idea
of detecting cycles in the distributed graph using special probe messages. In
this model, a probe is sent along the edges of the graph. If a site receives a
probe that it had previously sent, it indicates the presence of a cycle, thus
confirming a deadlock. Only blocked processes propagate probes along their
outgoing edges, while processes that are executing normally discard the
probes. The advantage of edge-chasing algorithms is that the probe messages
are fixed-size and generally short. Some well-known algorithms based on this
method include the algorithms proposed by Chandy et al., Choudhary et al.,
Kshemkalyani–Singhal, and Sinha–Natarajan.

3. Diffusing Computation-Based Algorithms: In diffusing computation-based
algorithms, deadlock detection is spread through the system's WFG using
diffusion techniques. These algorithms are built upon echo algorithms. A
process initiates deadlock detection by sending query messages along all
outgoing WFG edges. These query messages propagate through the graph, and
each blocked process waits for replies to the queries before responding to the
initiator. For subsequent queries, processes send immediate replies. The
initiator detects the deadlock upon receiving replies from all processes
involved in the computation. Chandy–Misra–Haas (for the OR model) and
Chandy–Herman algorithms are examples of diffusing computation-based
algorithms.
4. Global State Detection-Based Algorithms: Global state detection algorithms
focus on detecting deadlocks by capturing and analyzing a consistent
snapshot of the system. The key principle of this approach is that consistent
snapshots can be obtained without halting computations and that stable
properties (such as deadlocks) that existed before the snapshot was taken will
still be present in the snapshot. To detect deadlocks, the algorithm takes a
snapshot of the system and then examines it for any signs of deadlock. The
main advantage of this approach is that it allows the system to detect
deadlocks without interfering with ongoing computations. Examples include
algorithms based on system snapshot analysis.
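
The edge-chasing idea from item 2 can be sketched as a small simulation in which probes are passed along WFG edges. It is a single-process illustration under simplifying assumptions: the whole WFG is available as one dictionary, probes are kept in a local queue instead of being sent between sites, and the function name is hypothetical.

```python
from collections import deque

def probe_detects_deadlock(wfg, initiator):
    """Simulate probe propagation started by a blocked initiator.

    wfg maps each blocked process to the processes it waits for. A process with
    no outgoing edges is treated as running normally and discards probes.
    """
    # Each probe is the fixed-size triple (initiator, sender, receiver).
    probes = deque((initiator, initiator, nxt) for nxt in wfg.get(initiator, ()))
    forwarded = set()   # each blocked process forwards this initiator's probe only once
    while probes:
        init, sender, receiver = probes.popleft()
        if receiver == init:
            return True     # the probe returned to its initiator: a cycle exists
        if receiver in forwarded or not wfg.get(receiver):
            continue        # already forwarded, or an active process discards the probe
        forwarded.add(receiver)
        probes.extend((init, receiver, nxt) for nxt in wfg[receiver])
    return False

cycle = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}
chain = {"P1": ["P2"], "P2": ["P3"], "P3": []}
```

For the first graph a probe started by P1 returns to P1, while for the second it is discarded at P3. A process that waits on a cycle without being part of it never gets its own probe back, so each blocked process has to initiate detection for itself.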

5.A) Write about the Local and Global Wait-for graphs.

Local and Global Wait-For Graphs (WFGs)

In a distributed system, processes may compete for resources located on different
nodes. To manage and detect deadlocks, Wait-For Graphs (WFGs) are employed to
visualize which processes are waiting for resources held by other processes. There are
two types of WFGs: Local Wait-For Graphs and Global Wait-For Graphs, each serving
different roles in the system.

1. Local Wait-For Graph (Local WFG):

A Local Wait-For Graph is a representation of the resource dependency relationships
that exist at a single site or node in a distributed system. Each node maintains its own
local WFG to track dependencies between local processes, including:

• Which processes at the node are waiting for resources.

• Which resources are held by which processes at the node.

Structure of a Local WFG:

• Nodes: Each node in the graph represents a process running at that particular
site.

• Edges: Directed edges represent the waiting relationships between processes.
An edge from process Pi to Pj indicates that process Pi is waiting for a resource
held by process Pj.

Functionality:

• The local WFG simplifies deadlock detection within the node, allowing the
system to detect local deadlocks. If a cycle is detected in the local WFG, it
indicates that a deadlock has occurred among the processes at that node.
• Local WFGs are maintained and periodically checked by local deadlock
detectors at each node. However, they provide only a partial view of the
system's overall state because they do not account for inter-node
dependencies.

Limitations:

• Local WFGs cannot detect global deadlocks (deadlocks involving processes
across multiple nodes) because they only capture the waiting relationships
within a single node.

2. Global Wait-For Graph (Global WFG):

A Global Wait-For Graph is a comprehensive representation of the dependency
relationships across the entire distributed system. It is constructed by combining the
local WFGs from each node, along with the dependencies between processes on
different nodes.

Structure of a Global WFG:

• Nodes: Each node in the global WFG represents a process, regardless of which
node it is running on.

• Edges: Directed edges represent waiting relationships both within and across
nodes. An edge from process Pi on node A to process Pj on node B indicates
that Pi is waiting for a resource held by Pj on a different node.

Functionality:

• The global WFG is crucial for detecting global deadlocks, which occur when
processes from different nodes form a circular dependency.

• To construct a global WFG, the local WFGs are combined with information
about inter-node dependencies. This is often done by exchanging messages
between nodes, where each node sends its local WFG to a global deadlock
detector or to other nodes.

Global Deadlock Detection:

• The global WFG is analyzed for cycles, which indicate the presence of
deadlocks that span multiple nodes. If a cycle is found in the global WFG, it
signals that processes across different nodes are involved in a deadlock.

• The detection of global deadlocks can be more complex and costly because it
requires inter-node communication and coordination between local and
global detectors.
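
How a deadlock can be invisible in every local WFG and still appear in the global WFG is shown by the sketch below. It assumes that each node reports its graph as a dictionary from a process to the set of processes it waits for, including edges that point to remote processes. The names merge_wfgs, find_cycle, site_a and site_b are hypothetical.

```python
def merge_wfgs(local_wfgs):
    """Union the edges reported by each node into one global wait-for graph."""
    merged = {}
    for wfg in local_wfgs:
        for proc, waits_for in wfg.items():
            merged.setdefault(proc, set()).update(waits_for)
    return merged

def find_cycle(wfg):
    """Return one cycle as a list of processes, or None, using depth-first search."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node, path):
        colour[node] = GREY             # node is on the current search path
        path.append(node)
        for nxt in wfg.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:           # back edge: the path from nxt onward is a cycle
                return path[path.index(nxt):]
            if state == WHITE:
                found = visit(nxt, path)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK            # fully explored, not on any cycle found so far
        return None

    for start in list(wfg):
        if colour.get(start, WHITE) == WHITE:
            cycle = visit(start, [])
            if cycle:
                return cycle
    return None

# Node A hosts P1 and P2; node B hosts P3 and P4. P2 -> P3 and P4 -> P1 cross nodes.
site_a = {"P1": {"P2"}, "P2": {"P3"}}
site_b = {"P3": {"P4"}, "P4": {"P1"}}
global_cycle = find_cycle(merge_wfgs([site_a, site_b]))
```

Neither site_a nor site_b contains a cycle on its own, but the merged graph contains the cycle through P1, P2, P3 and P4.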
Comparison of Local and Global WFGs:

Aspect | Local WFG | Global WFG
Scope | Tracks dependencies within a single node | Tracks dependencies across all nodes
Node Representation | Represents processes within one node | Represents processes across the entire system
Edge Representation | Edges represent local waiting relationships | Edges represent local and inter-node dependencies
Deadlock Detection | Detects local deadlocks only | Detects global deadlocks
Communication Overhead | Minimal, as no inter-node communication is required | High, as nodes need to exchange information to form the global WFG
Cycle Detection | Cycles indicate local deadlocks | Cycles indicate global deadlocks
Updates | Updated locally by each node | Constructed through inter-node communication

5.B) What are the requirements for mutual exclusion in Distributed systems?
Explain about various metrics used for evaluating the performance of mutual
exclusion algorithms in Distributed systems.

Requirements for Mutual Exclusion in Distributed Systems

In distributed systems, mutual exclusion ensures that only one process at a time can
access a shared resource or critical section (CS). The key requirements for mutual
exclusion in distributed systems are:

Mutual Exclusion:

At any given time, only one process is allowed to access the critical section. No two
processes can be in the critical section simultaneously.

Freedom from Deadlock:

The system must be free from deadlock, meaning no set of processes should block
each other indefinitely while waiting to access the critical section.
Freedom from Starvation:

Every process that requests access to the critical section must eventually be granted
access, ensuring fairness. No process should be indefinitely postponed or denied
entry to the critical section.

Fault Tolerance:

The algorithm must be able to handle failures in the system, such as process crashes
or communication failures, without violating the mutual exclusion property.

Fairness:

The algorithm should ensure that all processes get a fair chance to enter the critical
section, with requests being granted based on some priority (e.g., timestamp or
request order). There should be no undue advantage for any process.

No Assumptions of Global Clock:

Distributed systems typically lack a global clock, so algorithms must rely on local
timestamps or logical clocks to enforce ordering.

Low Overhead:

The algorithm should incur minimal communication and computation overhead to
reduce the cost of enforcing mutual exclusion in a distributed environment.

Metrics for Evaluating the Performance of Mutual Exclusion Algorithms

Various metrics are used to evaluate the performance of mutual exclusion algorithms
in distributed systems. These metrics help assess the efficiency and scalability of the
algorithms in terms of communication, time complexity, and system load.

1. Message Complexity

Message complexity refers to the total number of messages exchanged between
processes to request and receive permission for entering and exiting the critical
section. In distributed systems, minimizing message complexity is crucial because
network communication can be slow, unreliable, and resource-intensive. Algorithms
with lower message complexity reduce network traffic and enhance system efficiency.
Ideally, an algorithm should require O(1) messages, or a small constant number of
messages, for each critical section entry. For example, the Ricart–Agrawala algorithm
requires 2(N−1) messages per critical section entry (N being the number of
processes), while Maekawa’s algorithm reduces message complexity by requiring a
number of messages proportional to the square root of N. Algorithms that minimize
message exchanges perform better in large-scale distributed systems.
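
As a small worked example of message complexity, the sketch below computes the
messages needed per critical section entry for two algorithms mentioned in these
notes. The function names are hypothetical. The Suzuki-Kasami count assumes N−1
REQUEST messages plus one token message, or zero messages when the requester
already holds an idle token.

```python
def ricart_agrawala_messages(n):
    # (N-1) REQUEST messages plus (N-1) REPLY messages.
    return 2 * (n - 1)


def suzuki_kasami_messages(n, holds_idle_token=False):
    # No messages if the requester already holds the idle token;
    # otherwise (N-1) REQUEST broadcasts plus 1 token transfer.
    return 0 if holds_idle_token else n


for n in (5, 10, 100):
    print(n, ricart_agrawala_messages(n), suzuki_kasami_messages(n))
```

For N = 10 this gives 18 messages for Ricart–Agrawala and 10 for Suzuki-Kasami, so
both grow linearly with N but with different constants.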

2. Synchronization Delay (Time Complexity)

Synchronization delay is the time between one process leaving the critical section
and the next process entering it. It reflects how quickly the critical section is handed
from one process to the next, and a shorter delay allows more critical section
executions per unit of time. Algorithms that need several rounds of communication
or global coordination before the next process may enter have a high synchronization
delay, which slows down the system. In contrast, token-based algorithms typically
have a low synchronization delay because a process can enter the critical section as
soon as it receives the token. An ideal algorithm minimizes this delay so that the
shared resource does not sit idle while processes are waiting for it.

3. Response Time

Response time refers to the total time elapsed between when a process requests
access to the critical section and when it actually gains access. A shorter response time
indicates that the algorithm efficiently handles requests and provides quicker access
to shared resources. Minimizing response time is vital in systems where processes
frequently compete for the critical section, as long delays can lead to reduced system
performance and high contention. Algorithms that require multiple message
exchanges or that suffer from resource contention will generally exhibit higher
response times. To improve performance, an ideal algorithm should keep the response
time as low as possible, enabling prompt access to the critical section without
unnecessary delays.
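
As a small worked example of the response time metric as defined above (entry time
minus request time), the sketch below computes per-process and average response
times. The trace values and the function name response_times are made up for
illustration.

```python
from statistics import mean

# Hypothetical trace: (process id, request time, entry time), in milliseconds.
trace = [
    ("P1", 0, 4),
    ("P2", 1, 9),
    ("P3", 3, 15),
]


def response_times(events):
    # Response time = time of entry into the CS minus time of the request.
    return {pid: entered - requested for pid, requested, entered in events}


times = response_times(trace)
average = mean(times.values())
```

Here P3 waits the longest (12 ms) because it requests last while the critical section
is still in demand, which shows how contention raises response time.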

4. Bandwidth Utilization

Bandwidth utilization measures the amount of network bandwidth consumed by an
algorithm while exchanging messages between processes. High bandwidth usage can
lead to network congestion and degrade system performance, especially in distributed
systems with limited network capacity. Therefore, an ideal algorithm minimizes
bandwidth consumption by reducing the number of messages or limiting the size of
messages exchanged. This ensures efficient use of network resources, allowing the
system to handle other tasks without being overwhelmed by mutual exclusion-related
communication. Algorithms that rely heavily on message exchanges or large message
sizes can overload the network, affecting not only mutual exclusion but other system
processes as well.

5. Scalability

Scalability evaluates how well the algorithm performs as the number of processes in
the system increases. In distributed systems, the algorithm should be able to handle a
growing number of processes without a significant increase in overhead or
performance degradation. Algorithms that rely on centralized coordination often
struggle to scale because a single coordinator becomes a bottleneck as the system
grows. In contrast, token-based algorithms can scale better because entry is granted
by passing a single token rather than by collecting permission from every process,
although the cost of requesting the token may still grow with N (Suzuki-Kasami, for
example, broadcasts a REQUEST to the other N−1 sites). An ideal mutual exclusion
algorithm should have a per-entry cost that grows at most linearly, and preferably
sub-linearly, with the number of processes, ensuring that the system remains efficient
even as it expands.

6. Fault Tolerance

Fault tolerance refers to the algorithm's ability to continue functioning correctly in
the presence of failures, such as process crashes or communication breakdowns.
Distributed systems are prone to such failures, and mutual exclusion algorithms must
be designed to handle them without compromising the mutual exclusion property. An
algorithm is considered fault-tolerant if it can recover from failures without significant
overhead or system disruption. For example, in token-based algorithms, the loss of
the token must be addressed through efficient token recovery mechanisms. Fault-
tolerant algorithms ensure that failures do not lead to deadlocks, excessive delays, or
system failure, making them essential in distributed environments.

7. Fairness and Starvation Freedom

Fairness refers to the algorithm’s ability to ensure that all processes have an equal
chance to access the critical section, and starvation freedom guarantees that no
process is indefinitely delayed or denied access. This is critical in maintaining the
integrity and performance of distributed systems, especially when multiple processes
are contending for the same resource. An ideal algorithm should provide fair access
to the critical section, typically by granting access in the order of requests or based on
timestamps. For example, algorithms using Lamport’s logical clock ensure fairness by
prioritizing processes with smaller timestamps. Without fairness and starvation
freedom, certain processes could monopolize resources, leading to performance
degradation and system inefficiency.
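
Timestamp-based fairness can be sketched with a priority queue that always serves
the request with the smallest timestamp. This is only an illustration: the class
RequestQueue and its method names are hypothetical, and ties are broken by process
ID as an assumption.

```python
import heapq


class RequestQueue:
    """Pending critical section requests, ordered by (timestamp, pid)."""

    def __init__(self):
        self._heap = []

    def add(self, timestamp, pid):
        heapq.heappush(self._heap, (timestamp, pid))

    def next_to_enter(self):
        # The smallest timestamp wins; the pid breaks ties deterministically.
        return heapq.heappop(self._heap)[1] if self._heap else None


q = RequestQueue()
q.add(5, "P3")
q.add(2, "P1")
q.add(2, "P0")
q.add(4, "P2")
order = [q.next_to_enter() for _ in range(4)]
```

Because every request is eventually the oldest one in the queue, no process can be
bypassed indefinitely, which is the starvation-freedom property described above.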

8. Token Loss Handling

Token loss handling is a metric specific to token-based algorithms, where mutual
exclusion is enforced by passing a token among processes. If the token is lost (due to
process failure or network issues), the system could become deadlocked, with no
process able to access the critical section. Thus, the algorithm must include
mechanisms for regenerating or recovering the lost token to ensure continued
operation. Token loss handling mechanisms should be efficient, avoiding excessive
delays or additional message overhead during recovery. An ideal token-based
algorithm will have robust strategies for detecting token loss and restoring system
operation quickly, ensuring that token loss does not result in system downtime or
deadlock.

6.A) Explain the algorithm for the single resource model.

In the Single Resource Model, each process can make only one outstanding request
for a single unit of a resource at a time. The goal of the algorithm is to detect deadlocks
where processes are waiting for resources held by other processes in a system where
each process is only allowed to request one resource at a time. This model can be
represented using a Wait-For Graph (WFG), which is a directed graph used to track
dependencies between processes and the resources they are waiting for.
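
Because each process waits for at most one other process in this model, the WFG has
at most one outgoing edge per node, and a deadlock shows up as a cycle. The sketch
below follows those edges to find a cycle. It is a centralized illustration of the idea
rather than a distributed algorithm, and the function name find_deadlock and the
dictionary representation of the WFG are assumptions made for the example.

```python
def find_deadlock(waits_for):
    """Return the processes on a cycle of the WFG, or [] if there is none.

    waits_for maps each blocked process to the single process it waits on.
    """
    for start in waits_for:
        seen = []
        current = start
        # Follow the unique outgoing edge until we stop or revisit a node.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            # The path came back to a node on it: that suffix is the cycle.
            return seen[seen.index(current):]
    return []


wfg = {"P1": "P2", "P2": "P3", "P3": "P1", "P4": "P1"}
cycle = find_deadlock(wfg)
```

In this example P1, P2 and P3 wait on each other in a cycle and are deadlocked, while
P4 is blocked behind the cycle but is not part of it.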

Mitchell and Merritt’s Algorithm for the Single Resource Model

Mitchell and Merritt's Algorithm is a distributed, edge-chasing algorithm for
detecting deadlocks in systems that follow the single resource model. Because each
process waits for at most one other process, every node in the WFG has at most one
outgoing edge, and a deadlock corresponds to a cycle. The algorithm detects such a
cycle by propagating labels backward along the edges of the WFG, from the process
being waited on to the waiting process, and exactly one process in each cycle detects
the deadlock.

Steps of Mitchell and Merritt's Algorithm:

1. Initial State:

• Each process maintains two labels: a private label, which is unique to the process
(though not constant), and a public label, which other processes can read and which
need not be unique.

• Initially, the private and public labels of a process are equal.

2. Block:

When a process Pi requests a resource held by Pj and becomes blocked:

• An edge from Pi to Pj is created in the WFG.

• Pi sets both its private and its public label to a new value that is greater than its
own previous public label and the public label of Pj.

3. Transmit:

• A blocked process reads the public label of the process it is waiting for.

• If that label is greater than its own public label, the blocked process replaces its
public label with the larger value. In this way, larger labels travel in the direction
opposite to the WFG edges.

4. Detect:

• If a blocked process finds that the public label of the process it is waiting for
equals its own public label, and its own public and private labels are equal, its label
has travelled around a cycle and returned to it.

• The process declares a deadlock. Only the process with the largest label in the
cycle reaches this state, so each deadlock is detected by exactly one process.

5. Activate:

• When a blocked process is granted the resource it was waiting for, it becomes
active again and the corresponding edge is removed from the WFG.

Mitchell and Merritt’s Algorithm provides an efficient solution for detecting
deadlocks in distributed systems that follow the single resource model. Because
exactly one process in a cycle detects the deadlock, resolution is simple: that process
can abort itself to break the cycle.

6.B) Distinguish between the AND model and the OR model.

Aspect | AND Model in Distributed Systems | OR Model in Distributed Systems
Consistency | Strong consistency required (all nodes must agree) | Eventual consistency (any node can succeed)
Fault Tolerance | Lower fault tolerance; must meet all conditions to succeed | Higher fault tolerance; success if any condition is met
Availability | Can lower availability due to strict conditions | Higher availability due to leniency in conditions
Example System | Paxos, Raft, Quorum-based replication systems | Cassandra, DynamoDB, Eventual consistency systems
Network Partitioning | Often fails in the presence of partitioning (CAP theorem) | Can tolerate network partitions
Concurrency Control | Requires synchronization between all participants | Can continue with partial success in concurrent environments
Data Replication | All replicas must be updated or acknowledged to succeed | Success if at least one replica acknowledges the operation
Coordination Complexity | Higher coordination overhead as all participants must agree | Lower coordination overhead, but potential for temporary inconsistency
Deadlock Prevention | Can lead to deadlocks if all resources are needed simultaneously (strict locking) | Less prone to deadlocks due to the flexibility in resource allocation
Transaction Consistency | Requires a majority or all nodes to agree for a transaction to be committed | Transactions can be committed if any one node succeeds
Consistency Guarantees | Strong consistency, but can result in more frequent blockages | Weaker consistency (eventual consistency) but more fault-tolerant
Algorithm Example for Consensus | Paxos, Raft – all nodes must agree on the same value | DynamoDB, Cassandra – only one node must acknowledge the operation
Resource Locking in Deadlock | High risk of deadlocks if resources are locked by all nodes | Reduced risk of deadlocks, as resources are less strictly locked
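
The "all conditions" versus "any condition" distinction in the table can be made
concrete with a small sketch. It assumes the usual resource-request reading of the two
models: an AND request is satisfied only when every requested resource has been
granted, while an OR request is satisfied as soon as any one of them has been granted.
The function name can_proceed and the resource names are hypothetical.

```python
def can_proceed(model, requested, granted):
    """Return True if a blocked process may resume under the given model."""
    requested, granted = set(requested), set(granted)
    if model == "AND":
        # Every requested resource must have been granted.
        return requested <= granted
    if model == "OR":
        # Any one of the requested resources is enough.
        return bool(requested & granted)
    raise ValueError("model must be 'AND' or 'OR'")


wanted = {"R1", "R2"}
and_partial = can_proceed("AND", wanted, {"R1"})
or_partial = can_proceed("OR", wanted, {"R1"})
and_full = can_proceed("AND", wanted, {"R1", "R2"})
```

With only R1 granted, the OR request can proceed but the AND request stays blocked,
which is why AND requests are more prone to waiting on each other.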
