CST 402 - DISTRIBUTED COMPUTING
Question – Answer Bank
Part A
1. What do you mean by a distributed system?
A distributed system is a collection of independent entities that cooperate to
solve a problem that cannot be solved individually.
Equivalently, it is a collection of independent systems that work together to solve a
problem or to accomplish a task.
2. List the characteristics of a distributed system
1. The crash of a single machine does not prevent users from getting their work done.
2. A collection of computers that do not share common memory or a common
physical clock, that communicate by message passing over a communication
network, and where each computer has its own memory and runs its own
operating system.
3. A collection of independent computers that appears to the users of the system
as a single coherent computer.
4. A term that describes a wide range of computers, from weakly coupled systems
such as wide-area networks to strongly coupled systems such as local area
networks.
3. What are the various features of a distributed system?
No common physical clock
No shared memory
Geographical separation
Autonomy and heterogeneity: the processors are “loosely coupled” in that they
have different speeds and each can be running a different operating system,
yet they cooperate with one another by offering services or by solving a problem jointly.
4. What are the motivations / advantages of a distributed system?
Inherently distributed computations: in many applications, such as money
transfer in banking, or reaching consensus among parties that are
geographically distant, the computation is inherently distributed.
Resource sharing
Access to geographically remote data and resources
Enhanced reliability
Reliability entails several aspects: availability, integrity, and fault tolerance.
Increased performance/cost ratio: by resource sharing and accessing
geographically remote data and resources, the performance/cost ratio is
increased.
Scalability
Modularity and incremental expandability
5. Discuss about the transparency requirements of distributed system.
Transparency deals with hiding the implementation policies from the user, and can be
classified as follows
Access transparency hides differences in data representation on different
systems and provides uniform operations to access system resources.
Location transparency makes the locations of resources transparent to the users.
Migration transparency allows relocating resources without changing names.
Relocation transparency allows resources to be relocated while they are being
accessed, without the user noticing.
Replication transparency does not let the user become aware of any replication.
Concurrency transparency deals with masking the concurrent use of shared
resources for the user.
Failure transparency refers to the system being reliable and fault-tolerant.
6. List the algorithmic challenges in designing a distributed system.
Designing useful execution models and frameworks
Dynamic distributed graph algorithms and distributed routing algorithms
Time and global state in a distributed system
Synchronization/coordination mechanisms
Group communication, multicast, and ordered message delivery
Monitoring distributed events and predicates
Distributed program design and verification tools
Debugging distributed programs
Data replication, consistency models, and caching
7. List out few applications of distributed computing.
1. Mobile systems
2. Sensor networks
3. Ubiquitous or pervasive computing
4. Peer-to-peer computing
5. Publish-subscribe, content distribution, and multimedia
6. Distributed agents
7. Distributed data mining
8. What do you mean by a distributed program?
A distributed system consists of a set of processors that are connected by a
communication network.
The communication network provides the facility of information exchange among
processors.
The processors do not share a common global memory and communicate solely by
passing messages over the communication network.
A distributed program:
A distributed program is composed of a set of n asynchronous processes p1, p2,
..., pi, ..., pn that communicate by message passing over the communication
network.
Without loss of generality, we assume that each process is running on a
different processor.
The processes do not share a global memory and communicate solely by
passing messages.
The global state of a distributed computation is composed of the states of the
processes and the communication channels
Process execution and message transfer are asynchronous.
The message transmission delay is finite and unpredictable.
9. Define the causal precedence relation.
10. Compare and contrast physical & logical concurrency.
11. What do you mean by load balancing in a distributed environment?
The goal of load balancing is to gain higher throughput, and reduce the user perceived
latency.
Load balancing may be necessary because of a variety of factors, such as high network
traffic or a high request rate causing the network connection to become a bottleneck, or a
high computational load.
The objective is to service incoming client requests with the least turnaround time.
The following are some forms of load balancing:
• Data migration The ability to move data (which may be replicated) around in the
system, based on the access pattern of the users.
• Computation migration The ability to relocate processes in order to perform a
redistribution of the workload.
• Distributed scheduling This achieves a better turnaround time for the users by using
idle processing power in the system more efficiently (a minimal dispatch sketch follows this list).
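A minimal sketch of one load-balancing policy, dispatching each incoming request to the
currently least-loaded server so that turnaround time stays low. The server names and
load counts are illustrative assumptions, not part of the syllabus.

servers = {"s1": 0, "s2": 0, "s3": 0}        # server -> number of active requests

def dispatch(request_id):
    target = min(servers, key=servers.get)   # pick the least-loaded server
    servers[target] += 1                     # account for the new request
    print(f"request {request_id} -> {target}")
    return target

def complete(server):
    servers[server] -= 1                     # request finished, release the load

for r in range(5):
    dispatch(r)
complete("s1")                               # s1 frees capacity ...
dispatch(5)                                  # ... and becomes attractive again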
Part B
1. Relate a computer system with a distributed system with the aid of neat
sketches.
Each computer has a memory-processing unit and the computers are connected by a
communication network.
The distributed software is also termed middleware. A distributed execution is the
execution of processes across the distributed system to collaboratively achieve a
common goal.
The distributed system uses a layered architecture to break down the complexity of
system design.
The middleware is the distributed software that drives the distributed system, while
providing transparency of heterogeneity
2. Discuss about various Primitives for distributed communication.
Send() and Receive() primitives are used to send and receive messages.
A Send primitive has at least two parameters: the destination, and the buffer in
the user space containing the data to be sent.
A Receive primitive has at least two parameters: the source from which the data is to
be received, and the user buffer into which the data is to be received.
There are two ways of sending data when the Send primitive is invoked –
1. the buffered option
2. the unbuffered option.
Synchronous primitives
A Send or a Receive primitive is synchronous if both the Send() and Receive()
handshake with each other. The processing for the Send primitive completes only after
the invoking processor learns that the other corresponding Receive primitive has also
been invoked and that the receive operation has been completed
Asynchronous primitives
A Send primitive is said to be asynchronous if control returns back to the invoking
process after the data item to be sent has been copied out of the user-specified buffer
Blocking primitives: a primitive is blocking if control returns to the invoking
process after the processing for the primitive (whether in synchronous or
asynchronous mode) completes.
Blocking Send: the sender waits after sending the message until it is received by the
receiver and acknowledged.
Blocking Receive: the receiver is blocked until a message is received.
Non-blocking primitives
A primitive is non-blocking if control returns to the invoking process
immediately after invocation, even though the operation has not completed.
Non-blocking Send: the process continues after initiating the send.
Non-blocking Receive: the process continues whether or not a message has been
received (both behaviours are illustrated in the sketch below).
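A minimal sketch of blocking versus non-blocking Receive, using a Python queue as a
stand-in for the communication channel. The channel, the sender, and the delay are
illustrative assumptions, not the textbook's own code.

import queue
import threading
import time

channel = queue.Queue()              # stands in for the communication channel

def sender():
    time.sleep(0.5)                  # simulated network / processing delay
    channel.put("hello")             # Send: hand the message to the channel

def non_blocking_receive():
    try:
        msg = channel.get_nowait()   # returns immediately, message or not
        print("non-blocking Receive got:", msg)
    except queue.Empty:
        print("non-blocking Receive: nothing yet, process continues other work")

def blocking_receive():
    msg = channel.get()              # blocks until a message is available
    print("blocking Receive got:", msg)

threading.Thread(target=sender).start()
non_blocking_receive()               # very likely finds the channel empty
blocking_receive()                   # waits until the sender's message arrives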
3. Explain in detail about the design issues of a distributed System.
The following functions must be addressed when designing and building a distributed
system:
1. Communication
2. Processes
3. Naming
4. Synchronization
5. Data storage and access
6. Consistency and replication
7. Fault tolerance
8. Security
9. Applications Programming Interface (API) and transparency
10. Scalability and modularity
1. Communication
This task involves designing appropriate mechanisms for communication among the
processes in the network. Some example mechanisms are:
remote procedure call (RPC), remote object invocation (ROI), message-oriented
communication versus stream-oriented communication.
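As an illustration of remote procedure call, the minimal sketch below uses Python's
standard xmlrpc module; the port number and the add() function are illustrative
assumptions, and the server and client parts run as separate processes.

# --- server process ---
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000))
server.register_function(add, "add")     # expose add() to remote callers
server.serve_forever()

# --- client process ---
import xmlrpc.client

proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))                   # looks like a local call, executes remotely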
2. Processes
Some of the issues involved are: management of processes and threads at
clients/servers; code migration; and the design of software and mobile agents.
3. Naming
Devising easy to use and robust schemes for names, identifiers, and addresses is
essential for locating resources and processes in a transparent and scalable manner.
4. Synchronization
Mechanisms for synchronization or coordination among the processes are essential.
Mutual exclusion is the classical example of synchronization.
In addition, synchronizing physical clocks, and devising logical clocks that capture the
essence of the passage of time, are important coordination problems.
5. Data storage and access
Schemes for data storage, and implicitly for accessing the data in a fast and scalable
manner across the network are important for efficiency.
Traditional issues such as file system design have to be reconsidered in the setting of a
distributed system.
6. Consistency and replication
To avoid bottlenecks, to provide fast access to data, and to provide scalability,
replication of data objects is highly desirable.
7. Fault tolerance
Fault tolerance requires maintaining correct and efficient operation in spite of any
failures of links, nodes, and processes.
Process resilience, reliable communication, distributed commit, checkpointing and
recovery, agreement and consensus, failure detection, and self-stabilization are some
of the mechanisms to provide fault-tolerance.
8. Security
Distributed systems security involves various aspects of cryptography, secure
channels, access control, key management – generation and distribution,
authorization, and secure group management.
9. Applications Programming Interface (API) and transparency
Transparency deals with hiding the implementation policies from the user, and can be
classified as follows
Access transparency hides differences in data representation on different
systems and provides uniform operations to access system resources.
Location transparency makes the locations of resources transparent to the users.
Migration transparency allows relocating resources without changing names.
Relocation transparency allows resources to be relocated while they are being
accessed, without the user noticing.
Replication transparency does not let the user become aware of any replication.
Concurrency transparency deals with masking the concurrent use of shared
resources for the user.
Failure transparency refers to the system being reliable and fault-tolerant.
10. Scalability and modularity
The algorithms, data (objects), and services must be as distributed as possible.
Various techniques such as replication, caching and cache management, and
asynchronous processing help to achieve scalability.
4. Explain the algorithmic challenges of designing a distributed system
Designing useful execution models and frameworks
Dynamic distributed graph algorithms and distributed routing algorithms
Time and global state in a distributed system
Synchronization/coordination mechanisms
Group communication, multicast, and ordered message delivery
Monitoring distributed events and predicates
Distributed program design and verification tools
Debugging distributed programs
Data replication, consistency models, and caching
1. Designing useful execution models and frameworks
The interleaving model and partial order model are two widely adopted models of
distributed system executions. They have proved to be particularly useful for
operational reasoning and the design of distributed algorithms.
2. Dynamic distributed graph algorithms and distributed routing algorithms
The distributed system is modeled as a distributed graph, and the graph algorithms
form the building blocks for a large number of higher level communication, data
dissemination, object location, and object search functions.
The algorithms need to deal with dynamically changing graph characteristics, such as
to model varying link loads in a routing algorithm.
3. Time and global state in a distributed system
The challenges pertain to providing accurate physical time, and to providing a variant
of time, called logical time. Logical time is relative time, and it eliminates the overhead
of providing physical time for applications where physical time is not required.
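A minimal sketch of one such logical clock, following Lamport's scalar clock rules; the
two-process usage at the end is an illustrative assumption.

class LamportClock:
    """Scalar logical clock: ticks on every event, catches up on receive."""

    def __init__(self):
        self.time = 0

    def internal_event(self):
        self.time += 1                   # tick before executing the event
        return self.time

    def send_event(self):
        self.time += 1
        return self.time                 # timestamp piggybacked on the message

    def receive_event(self, msg_timestamp):
        # adopt max(local, received) and then tick
        self.time = max(self.time, msg_timestamp) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
ts = p1.send_event()                     # p1 sends a message timestamped 1
print(p2.receive_event(ts))              # p2's clock jumps to 2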
4. Synchronization/coordination mechanisms
Synchronization is essential for the distributed processes to overcome the limited
observation of the system state from the viewpoint of any one process.
Overcoming this limited observation is necessary for taking any actions that would
impact other processes.
Problems Requiring Synchronization
Physical clock synchronization
Leader election - All the processes need to agree on which process will play
the role of a distinguished process – called a leader process.
Mutual exclusion (see the sketch after this list)
Deadlock detection and resolution
Termination detection
Garbage collection - garbage refers to objects that are no longer in use and that are not
pointed to by any other process; detecting and collecting such garbage requires
coordination among the processes.
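A minimal sketch of one mutual exclusion scheme, a centralized coordinator that grants
the critical section to one requesting process at a time. The thread-based simulation is
an illustrative assumption; a real system would exchange REQUEST/GRANT/RELEASE messages
over the network.

import threading
import queue

requests = queue.Queue()                     # REQUEST messages to the coordinator
release = threading.Semaphore(0)             # RELEASE signal back to the coordinator

def coordinator():
    while True:
        grant = requests.get()               # take the next pending REQUEST
        grant.set()                          # send GRANT: one process at a time
        release.acquire()                    # wait for RELEASE before the next grant

def process(pid):
    grant = threading.Event()
    requests.put(grant)                      # REQUEST entry to the critical section
    grant.wait()                             # block until GRANT arrives
    print(f"process {pid} is in the critical section")
    release.release()                        # RELEASE the critical section

threading.Thread(target=coordinator, daemon=True).start()
workers = [threading.Thread(target=process, args=(i,)) for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()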
5. Group communication, multicast, and ordered message delivery
A group is a collection of processes that share a common context and collaborate on a
common task within an application domain.
Specific algorithms need to be designed to enable efficient group communication and
group management wherein processes can join and leave groups dynamically, or even
fail.
When multiple processes send messages concurrently, different recipients may
receive the messages in different orders, possibly violating the semantics of the
distributed program.
Hence, formal specifications of the semantics of ordered delivery need to be
formulated, and then implemented.
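One common way to implement such ordered delivery is a sequencer-based scheme, sketched
minimally below under illustrative assumptions (it is one possible technique, not the
prescribed algorithm): every multicast first obtains a global sequence number from a
sequencer, and each member delivers messages strictly in sequence-number order, holding
back any message that arrives early.

import itertools

class Sequencer:
    """Assigns a single global sequence number to every multicast message."""
    def __init__(self):
        self._counter = itertools.count()

    def order(self, msg):
        return (next(self._counter), msg)

class Member:
    """Delivers messages strictly in sequence-number order."""
    def __init__(self):
        self.expected = 0
        self.holdback = {}                       # out-of-order messages wait here

    def receive(self, seq, msg):
        self.holdback[seq] = msg
        delivered = []
        while self.expected in self.holdback:    # deliver any in-order prefix
            delivered.append(self.holdback.pop(self.expected))
            self.expected += 1
        return delivered

sequencer = Sequencer()
member = Member()
first = sequencer.order("m1")
second = sequencer.order("m2")
print(member.receive(*second))                   # [] - m2 arrived early, held back
print(member.receive(*first))                    # ['m1', 'm2'] - total order restored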
6. Monitoring distributed events and predicates
Predicates defined on program variables that are local to different processes are used
for specifying conditions on the global system state, and are useful for applications
such as debugging, sensing the environment, and in industrial process control.
On-line algorithms for monitoring such predicates are hence important.
7. Distributed program design and verification tools
Methodically designed and verifiably correct programs can greatly reduce the
overhead of software design, debugging, and engineering.
Designing mechanisms to achieve these design and verification goals is a challenge.
8. Debugging distributed programs
Debugging sequential programs is hard; debugging distributed programs is that much
harder because of the concurrency in actions and the ensuing uncertainty due to the
large number of possible executions defined by the interleaved concurrent actions.
Adequate debugging mechanisms and tools need to be designed to meet this
challenge.
9. Data replication, consistency models, and caching
Fast access to data and other resources requires them to be replicated in the distributed
system.
Managing such replicas in the face of updates introduces the problems of ensuring
consistency among the replicas and cached copies.
Additionally, placement of the replicas in the systems is also a challenge because
resources usually cannot be freely replicated.
5. Explain the applications of distributed computing.
1. Mobile systems
2. Sensor networks
3. Ubiquitous or pervasive computing
4. Peer-to-peer computing
5. Publish-subscribe, content distribution, and multimedia
6. Distributed agents
7. Distributed data mining
8. Grid computing
9. Security in distributed systems
1. Mobile systems
Mobile systems typically use wireless communication which is based on
electromagnetic waves and utilizes a shared broadcast medium.
The characteristics of communication are therefore different, giving rise to a
set of problems such as:
i. routing,
ii. location management,
iii. channel allocation,
iv. localization and position estimation,
v. the overall management of mobility
There are two popular architectures for a mobile network:
1. The base-station approach, also known as the cellular approach, wherein a cell,
the geographical region within range of a static but powerful base transmission
station, is associated with that base station.
2. The ad-hoc network approach, where there is no base station.
All responsibility for communication is distributed among the mobile nodes, and the
mobile nodes have to participate in routing by forwarding packets for other pairs of
communicating nodes.
2. Sensor networks
A sensor is a processor with an electro-mechanical interface that is capable of sensing
physical parameters, such as temperature, velocity, pressure, humidity, and chemicals
Sensors may be mobile or static;
sensors may communicate wirelessly, although they may also communicate across a
wire when they are statically installed.
3. Ubiquitous or pervasive computing
The intelligent home and the smart workplace are some examples of ubiquitous
environments. Ubiquitous systems are essentially distributed systems;
recent advances in technology allow them to leverage wireless communication and
sensor and actuator mechanisms
4. Peer-to-peer computing
• Peer-to-peer (P2P) computing represents computing over an application layer
network wherein all interactions among the processors are at a “peer” level,
without any hierarchy among the processors.
• P2P computing arose as a paradigm shift from client–server computing where
the roles among the processors are essentially asymmetrical.
• P2P networks are typically self-organizing, and may or may not have a regular
structure to the network.
5. Publish-subscribe, content distribution, and multimedia
In a dynamic environment where the information constantly fluctuates
there needs to be:
i. an efficient mechanism for distributing this information (publish),
ii. an efficient mechanism to allow end users to indicate interest in receiving
specific kinds of information (subscribe),
iii. an efficient mechanism for aggregating large volumes of published information
and filtering it as per the user’s subscription filter (the sketch below illustrates these three mechanisms).
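A minimal in-process sketch of the publish-subscribe pattern, assuming a simple
topic-based subscription filter; the broker, topics, and prices are illustrative
assumptions.

from collections import defaultdict

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)      # topic -> callbacks of interested users

    def subscribe(self, topic, callback):         # (ii) indicate interest
        self.subscribers[topic].append(callback)

    def publish(self, topic, item):               # (i) distribute the information
        for callback in self.subscribers[topic]:  # (iii) filter by the subscription
            callback(item)

broker = Broker()
broker.subscribe("stock/ACME", lambda price: print("ACME subscriber sees:", price))
broker.publish("stock/ACME", 101.5)               # delivered to the matching subscriber
broker.publish("stock/OTHER", 42.0)               # filtered out: nobody subscribed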
6. Distributed agents
Agents collect and process information, and can exchange such information with other
agents
Challenges in distributed agent systems include coordination mechanisms among the
agents, controlling the mobility of the agents, and their software design and interfaces.
7. Distributed data mining
The data is often necessarily distributed and cannot be collected in a single repository,
or it is too massive to collect and process at a single repository in real time.
8. Grid computing
Grid Computing is a subset of distributed computing, where a virtual supercomputer
comprises machines on a network connected by some bus, mostly Ethernet or
sometimes the Internet.
The idle CPU cycles of machines connected to the network are made available to others.
9. Security in distributed systems
The traditional challenges of security in a distributed setting include:
confidentiality (ensuring that only authorized processes can access certain
information),
authentication (ensuring the source of received information and the identity of the
sending process),
availability (maintaining allowed access to services despite malicious actions).
6. Explain in detail about the models of distributed execution.
The execution of a process consists of a sequential execution of its actions.
The actions are atomic, and the actions of a process are modeled as three types of
events:
1. internal events
2. message send events
3. message receive events
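A minimal sketch modelling one process execution as a sequence of these three event
types; the Event class and the sample execution are illustrative assumptions.

from dataclasses import dataclass
from enum import Enum
from typing import Optional

class EventType(Enum):
    INTERNAL = "internal"      # changes only the local process state
    SEND = "send"              # also changes the state of a channel
    RECEIVE = "receive"        # consumes a message from a channel

@dataclass
class Event:
    etype: EventType
    process: str
    message: Optional[str] = None    # only send/receive events carry a message

# one possible (sequential) execution of process p1
execution = [
    Event(EventType.INTERNAL, "p1"),
    Event(EventType.SEND, "p1", "m1"),
    Event(EventType.RECEIVE, "p1", "m2"),
]
for event in execution:
    print(event)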
7. Explain the models of communication networks.
8. Discuss about the global state of distributed system
9. Explain in detail about the past and future cones of an event.
10. Discuss about the models of process communication.