Introduction to Distributed Systems
A Distributed System (DS) is a collection of independent computers
that appears to its users as a single coherent system. These systems are
interconnected, work together to achieve a common goal, and share
resources such as data, hardware, and software.
Characteristics of Distributed Systems
1. Resource Sharing: Resources such as files, printers, and
processors are shared across the system.
2. Scalability: Distributed systems can handle a growing number
of nodes or users without significant performance degradation.
3. Concurrency: Multiple users or processes can access shared
resources simultaneously.
4. Fault Tolerance: The system continues to operate correctly even in
the presence of hardware or software failures.
5. Transparency: The complexities of the underlying system are
hidden from users, ensuring ease of interaction.
Examples of Distributed Systems
The Internet
Cloud computing platforms
Distributed databases (e.g., MongoDB, Cassandra)
Online gaming networks
Peer-to-peer networks (e.g., BitTorrent)
Distributed System Models
Distributed systems can be modeled in various ways based on the
organization, interactions, and processes involved. These models help in
designing and understanding distributed architectures.
1. Architectural Models
Architectural models describe the organization of the distributed system's
components and how they interact.
a. Client-Server Model
Definition: The system is divided into servers that provide services
and clients that consume them.
Examples: Web servers, email servers, and database servers.
Characteristics:
o Centralized control at the server.
o Clients request services; servers process and respond.
o Easier to manage but may become a bottleneck as demand
increases.
Advantages:
o Centralized data control.
o Simplified security.
Disadvantages:
o Scalability issues with increased client requests.
o Single point of failure if the server crashes.
b. Peer-to-Peer (P2P) Model
Definition: All nodes (peers) in the system are equal and can act as
both clients and servers.
Examples: BitTorrent, blockchain systems.
Characteristics:
o Decentralized architecture.
o Resources are shared directly between nodes.
o Highly scalable and robust to failures.
Advantages:
o No single point of failure.
o Excellent for resource sharing and fault tolerance.
Disadvantages:
o Security is more complex due to decentralization.
o Managing consistency across peers is challenging.
c. Hybrid Model
Definition: Combines client-server and P2P features.
Examples: Content delivery networks (CDNs) like Akamai.
Characteristics:
o Servers handle critical tasks, while peers share additional
resources.
o Provides flexibility in design.
Advantages:
o Balances control and scalability.
o Can take advantage of both models.
Disadvantages:
o More complex to design and implement.
2. Interaction Models
Interaction models describe the communication and coordination between
processes in a distributed system.
a. Synchronous Communication
Definition: Communication where senders and receivers are tightly
synchronized.
Examples: Remote Procedure Calls (RPCs).
Characteristics:
o Sender waits for the receiver’s acknowledgment before
proceeding.
o Predictable response time.
Advantages:
o Easier to debug and reason about.
o Simpler communication protocols.
Disadvantages:
o Higher latency as processes wait for each other.
o Not suitable for high-latency networks.
b. Asynchronous Communication
Definition: Communication where senders do not wait for receivers
to respond immediately.
Examples: Message queuing systems like RabbitMQ or Kafka.
Characteristics:
o Sender continues processing after sending a message.
o Communication can overlap with computation.
Advantages:
o Improves system performance and scalability.
o Tolerant of network delays.
Disadvantages:
o Harder to program and debug.
o Requires additional mechanisms for handling failures.
3. Failure Models
Failure models define the types of faults a distributed system can
experience.
a. Crash Failures
Definition: A component stops functioning and does not recover.
Examples: A server going offline unexpectedly.
Impact:
o Can disrupt service availability.
o Detectable through heartbeat signals or timeout mechanisms.
b. Omission Failures
Definition: Failures where messages are lost or not delivered.
Examples: Network packet loss.
Impact:
o Causes incomplete communication.
o Requires retransmission mechanisms.
c. Timing Failures
Definition: Operations take longer than expected or occur at
unpredictable times.
Examples: Slow response from a web server.
Impact:
o Leads to timeouts and degraded user experience.
o Addressed by setting time limits on operations.
d. Byzantine Failures
Definition: Components provide incorrect or malicious information.
Examples: Compromised nodes in a blockchain.
Impact:
o Hardest type of failure to handle.
o Mitigated using consensus algorithms like Practical Byzantine
Fault Tolerance (PBFT).
4. Consistency Models
Consistency models describe how the system manages updates to shared
data across nodes.
a. Strict Consistency
Definition: Every read returns the result of the most recent write;
updates appear to all nodes instantaneously.
Examples: A centralized database with a single master.
Characteristics:
o Provides the highest level of consistency.
o Requires perfectly synchronized clocks or an equivalent global ordering mechanism.
Advantages:
o Simplifies reasoning about data.
Disadvantages:
o Not practical in distributed systems due to high
communication overhead.
b. Eventual Consistency
Definition: Updates eventually propagate to all nodes, and the
system reaches a consistent state.
Examples: DNS systems, NoSQL databases like Cassandra.
Characteristics:
o Allows temporary inconsistency.
o Suitable for systems prioritizing availability over consistency.
Advantages:
o High performance and scalability.
o Suitable for large-scale systems.
Disadvantages:
o Temporary inconsistencies may lead to conflicts.
c. Causal Consistency
Definition: Operations that are causally related must appear in the
same order across all nodes.
Examples: Version control systems like Git.
Characteristics:
o Tracks dependencies between operations.
o Weaker than strict consistency, but offers stronger guarantees than eventual consistency.
Advantages:
o Ensures logical ordering of events.
Disadvantages:
o Increases metadata overhead for tracking dependencies.
Design Issues in Distributed Systems
Designing a distributed system involves addressing several key challenges
to ensure efficiency, reliability, and scalability.
1. Transparency
Access Transparency: Users should not need to know whether
they are accessing local or remote resources.
Location Transparency: The physical location of resources should
not affect user interaction.
Replication Transparency: Users should not be aware of data
replication.
Concurrency Transparency: Multiple users can access shared
resources without interference.
Failure Transparency: The system should mask failures and
recover automatically.
2. Scalability
A distributed system should handle growth efficiently:
Adding more nodes should not degrade performance.
Techniques like caching, replication, and load balancing are
employed.
3. Reliability
Ensuring the system continues to operate correctly even when
components fail:
Fault detection and recovery mechanisms.
Use of redundant systems to handle failures.
4. Resource Management
Efficiently allocating and managing resources such as memory, CPU, and
storage:
Dynamic load balancing to distribute tasks evenly.
Ensuring fair access to resources for all users.
5. Communication
Enabling seamless communication between distributed components:
Using protocols like TCP/IP, HTTP, or gRPC.
Handling issues like message loss, latency, and bandwidth
constraints.
6. Security
Protecting the system against unauthorized access, data breaches, and
attacks:
Implementing authentication, encryption, and firewalls.
Regularly updating software to patch vulnerabilities.
Communication in Distributed Systems
Communication is a critical aspect of distributed systems since it enables
the exchange of information between processes running on different
machines. The primary focus is on inter-process communication (IPC),
which involves various methods like message passing, remote procedure
calls (RPC), and socket programming.
Inter-Process Communication (IPC)
Inter-Process Communication is the mechanism by which processes
communicate and synchronize their actions. It is essential for coordinating
operations, sharing data, and ensuring consistency in distributed systems.
Models of IPC
1. Message Passing Model:
o Processes communicate by explicitly sending and receiving
messages.
o This model provides explicit control over the communication
process and is suitable for systems where processes are
loosely coupled.
o Key Features:
Synchronous and asynchronous communication.
Reliable or unreliable message delivery.
Examples: Email systems, messaging queues.
o Advantages:
No shared memory is required.
Clear and explicit communication semantics.
o Disadvantages:
Can be complex to implement in large systems. (A minimal code sketch of this model follows the list.)
2. Shared Memory Model:
o Processes communicate by sharing a common memory space.
o Not typically used in distributed systems due to difficulties in
maintaining consistency across nodes.
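The sketch below illustrates the message passing model (point 1 above) in miniature: two threads stand in for two processes, and a bounded queue stands in for the communication channel. It is an in-process illustration only; the class and message names are invented for the example.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class MessagePassingDemo {
    public static void main(String[] args) throws InterruptedException {
        // The queue plays the role of the communication channel.
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(10);

        Thread sender = new Thread(() -> {
            try {
                channel.put("request:42"); // send; blocks if the channel is full
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread receiver = new Thread(() -> {
            try {
                String msg = channel.take(); // receive; blocks until a message arrives
                System.out.println("Received: " + msg);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        sender.start();
        receiver.start();
        sender.join();
        receiver.join();
    }
}

Using put()/take() gives blocking, synchronous-style message exchange; offer()/poll() would give the asynchronous, non-blocking variant.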
Remote Procedure Call (RPC)
RPC is a protocol that enables a process to execute a procedure (or
function) on another machine as if it were a local procedure call.
Steps in an RPC:
1. Client Stub: Marshals (packs) the procedure arguments into a
message.
2. Communication Module: Sends the message to the server.
3. Server Stub: Unmarshals (unpacks) the message and invokes the
procedure.
4. Server Application: Executes the procedure and sends the result
back through the same mechanism.
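As a toy illustration of the client side of steps 1 and 2, the hypothetical stub below marshals a procedure name and its arguments into a line of text and ships them over a socket. A real RPC system would use generated stubs and a binary wire format; the host, port, and "add" protocol here are assumptions invented for the sketch.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class AddClientStub {
    // Looks like a local call to the caller, but executes on the server.
    public static int add(int a, int b) throws IOException {
        try (Socket socket = new Socket("localhost", 9000);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println("add " + a + " " + b);      // marshal and send the request
            return Integer.parseInt(in.readLine()); // unmarshal the result
        }
    }
}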
Implementation Issues:
1. Transparency:
o The client should not know that the procedure is being
executed remotely.
o Transparency includes handling location, access, and
communication complexities.
2. Stub Generation:
o Stubs act as proxies for actual procedures.
o They are automatically generated to simplify the programming
model.
3. Marshalling and Unmarshalling:
o Arguments and return values must be converted to a standard
format to be transmitted across the network.
4. Error Handling:
o The system must handle errors such as communication
failures, timeouts, or server crashes.
Advantages:
Simplifies distributed programming by abstracting network
communication.
Promotes modularity and code reuse.
Disadvantages:
Performance overhead due to marshalling and network latency.
Difficult to handle partial failures (e.g., when only part of the system
crashes).
Point-to-Point and Group Communication
1. Point-to-Point Communication:
o Involves direct communication between two processes.
o Characteristics:
Simple and efficient for two-party interactions.
Can use protocols like TCP or UDP.
o Use Cases: File transfer, remote procedure calls.
2. Group Communication:
o Involves communication between one sender and multiple
receivers (multicast).
o Characteristics:
Used in scenarios where updates or messages must
reach a group of processes.
Requires mechanisms to handle group membership and
consistency.
o Use Cases: Video conferencing, collaborative applications.
Client-Server Model & Its Implementation
Client-Server Model:
A centralized architecture where the server provides services, and
the client consumes them.
Implementation:
o Server: Listens for incoming client requests and processes
them.
o Client: Initiates communication, sends requests, and receives
responses.
o Communication is usually handled using sockets or higher-
level abstractions like HTTP or gRPC.
Socket Programming
Sockets are endpoints for communication between two nodes. Socket
programming provides a way to send and receive data across a network.
Steps in Socket Programming:
1. Server Side:
o Create a socket.
o Bind it to a specific port and IP address.
o Listen for incoming connections.
o Accept connections and communicate.
2. Client Side:
o Create a socket.
o Connect to the server’s IP and port.
o Communicate with the server.
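A minimal sketch of these steps in Java, assuming a single client and an arbitrary port (5000 here); a real server would accept connections in a loop and handle each one on its own thread.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class EchoDemo {
    public static void main(String[] args) throws IOException {
        // Server side: create a socket, bind to a port, listen.
        ServerSocket server = new ServerSocket(5000);
        new Thread(() -> {
            try (ServerSocket s = server;          // close the listening socket when done
                 Socket conn = s.accept();         // accept a connection
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(conn.getInputStream()));
                 PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
                out.println("echo: " + in.readLine()); // communicate
            } catch (IOException e) {
                e.printStackTrace();
            }
        }).start();

        // Client side: create a socket, connect to the server's IP and port.
        try (Socket socket = new Socket("localhost", 5000);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println("hello");
            System.out.println(in.readLine()); // prints "echo: hello"
        }
    }
}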
Socket Types:
1. Stream Sockets (TCP):
o Reliable, connection-oriented communication.
o Guarantees message delivery in the correct order.
o Use Cases: File transfer, web browsing.
2. Datagram Sockets (UDP):
o Connectionless, unreliable communication.
o Faster but does not guarantee delivery or order.
o Use Cases: Streaming media, gaming.
Case Studies
1. SUN RPC:
Developed by Sun Microsystems for remote procedure calls.
Features:
o Simple interface for calling remote procedures.
o Platform-independent, with a focus on transparency.
o Uses the External Data Representation (XDR) format for
data marshalling.
2. DEC RPC:
Developed by Digital Equipment Corporation for distributed
applications.
Features:
o Support for synchronous and asynchronous RPC.
o Offers mechanisms for handling complex data types.
o Incorporates robust error handling and recovery mechanisms.
Synchronization in Distributed Systems
Synchronization in distributed systems ensures that operations across
multiple, independent processes occur in a coordinated and consistent
manner. Due to the lack of shared memory and varying local clocks on
different nodes, synchronization is a fundamental challenge in distributed
systems.
Introduction
Synchronization involves aligning clocks, events, or processes to ensure
the system behaves predictably and consistently. Some critical aspects
include:
Time synchronization: Ensuring all nodes have a consistent view
of time.
Event ordering: Establishing a consistent order for events across
processes.
Process coordination: Enabling distributed processes to cooperate
effectively (e.g., in mutual exclusion or leader election).
Temporal Ordering of Events
In distributed systems, ensuring a consistent order of events is
challenging due to:
Lack of a global clock.
Variable network delays.
Happens-Before Relation (Lamport's Logical Clock):
Denoted a → b, meaning event a happens before event b.
Defined as:
1. If two events occur in the same process, the one earlier in
program order happens before the other.
2. If event a is the sending of a message and event b is its
receipt, then a → b.
3. Transitivity: If a → b and b → c, then a → c.
Logical Clocks:
Logical clocks assign timestamps to events to capture their order.
Examples include:
1. Lamport Clocks:
o Each process maintains a logical clock C.
o Rules:
Increment C before any event.
When sending a message, include the current value of C.
On receiving a message, set C = max(C, C_sender) + 1.
o (A minimal Java sketch follows this list.)
2. Vector Clocks:
o Each process maintains a vector of clocks.
o Provides more precise causality information than Lamport
clocks.
o Rule: Update the vector based on the event and received
messages.
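The sketch referenced above: a minimal Lamport clock implementing the three rules, with synchronized methods so concurrent threads within one process can share it safely.

public class LamportClock {
    private long time = 0;

    // Rule 1: increment before any local event.
    public synchronized long tick() {
        return ++time;
    }

    // Rule 2: the timestamp to attach to an outgoing message.
    public synchronized long send() {
        return ++time;
    }

    // Rule 3: on receive, C = max(C, C_sender) + 1.
    public synchronized long receive(long senderTime) {
        time = Math.max(time, senderTime) + 1;
        return time;
    }
}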
Clock Synchronization
Due to the absence of a global clock, nodes must synchronize their clocks
to maintain consistency.
Types of Synchronization:
1. External Synchronization:
o Synchronize clocks to an external time source (e.g., UTC).
2. Internal Synchronization:
o Synchronize clocks with respect to each other to ensure
consistent timestamps.
Clock Synchronization Algorithms:
1. Cristian’s Algorithm:
o Assumes a time server with an accurate clock (see the sketch after this list).
o Steps:
Client sends a request to the server.
Server replies with its current time.
Client adjusts its clock based on the round-trip time
(RTT) and server time.
2. Berkeley Algorithm:
o Averages the clocks of all nodes in the system.
o Steps:
A master node polls other nodes for their clocks.
Computes the average time, accounting for message
delays.
Sends adjustments to all nodes.
3. Network Time Protocol (NTP):
o A widely used protocol to synchronize clocks over the internet.
o Uses a hierarchical structure of time servers.
o Achieves high accuracy through multiple rounds of time
exchanges.
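The sketch below shows the client-side arithmetic in Cristian's algorithm: if the server's reply is assumed to arrive halfway through the round trip, the client sets its clock to the server's time plus half the RTT. The method and variable names are illustrative.

public class CristianClient {
    // serverTime: the timestamp in the server's reply (milliseconds).
    // t0, t1: the client's clock readings at request send and reply receipt.
    public static long estimateNow(long serverTime, long t0, long t1) {
        long rtt = t1 - t0;
        return serverTime + rtt / 2; // assumes symmetric network delay
    }
}

For example, if t0 = 1000, t1 = 1040, and the server replied 5000, the client sets its clock to 5000 + (40 / 2) = 5020.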
Mutual Exclusion
Mutual exclusion ensures that only one process accesses a critical section
at any time.
Challenges in Distributed Systems:
Lack of shared memory.
Communication delays and failures.
Distributed Mutual Exclusion Algorithms:
1. Centralized Algorithm:
o A coordinator manages access to the critical section (a minimal sketch follows this list).
o Steps:
Processes send requests to the coordinator.
Coordinator grants or denies access based on the
current state.
o Advantages: Simple, ensures fairness.
o Disadvantages: Single point of failure, potential bottleneck.
2. Ricart-Agrawala Algorithm:
o A decentralized algorithm using timestamps.
o Steps:
A process broadcasts a request to all other processes.
Receives replies from all processes before entering the
critical section.
Processes delay their reply while they are inside the critical
section, or while their own pending request carries an earlier
timestamp.
o Advantages: No single point of failure.
o Disadvantages: High message overhead.
3. Token-Based Algorithms:
o A unique token grants access to the critical section.
o Processes pass the token as needed.
o Advantages: Low message overhead.
o Disadvantages: Token loss requires recovery mechanisms.
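The sketch referenced in the centralized algorithm above: the coordinator grants the critical section to one process at a time and queues later requests in FIFO order, which is what gives the fairness property. The class and method names are invented for the example.

import java.util.ArrayDeque;
import java.util.Queue;

public class Coordinator {
    private String holder = null;                        // current holder of the CS
    private final Queue<String> waiting = new ArrayDeque<>();

    // Returns true if access is granted immediately; otherwise the
    // request is queued until the current holder releases.
    public synchronized boolean request(String process) {
        if (holder == null) {
            holder = process;
            return true;
        }
        waiting.add(process);
        return false;
    }

    // Called by the holder on exit; returns the next process granted (or null).
    public synchronized String release() {
        holder = waiting.poll();
        return holder;
    }
}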
Deadlock in Distributed Systems
Deadlock occurs when processes wait indefinitely for resources held by
each other.
Conditions for Deadlock:
1. Mutual Exclusion: Resources are non-shareable.
2. Hold and Wait: Processes hold resources while waiting for others.
3. No Preemption: Resources cannot be forcibly taken.
4. Circular Wait: A circular chain of processes exists, where each is
waiting for a resource held by the next.
Deadlock Detection and Recovery:
Maintain a wait-for graph to detect cycles (sketched below).
Break deadlocks by:
o Preempting resources.
o Terminating processes.
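A minimal sketch of cycle detection in a wait-for graph, where an edge P → Q means "P waits for a resource held by Q"; a cycle in this graph is a deadlock.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class WaitForGraph {
    private final Map<String, List<String>> edges = new HashMap<>();

    public void addWait(String waiter, String holder) {
        edges.computeIfAbsent(waiter, k -> new ArrayList<>()).add(holder);
    }

    // Depth-first search: a back edge to a node on the current path is a cycle.
    public boolean hasDeadlock() {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String p : edges.keySet()) {
            if (dfs(p, visited, onPath)) return true;
        }
        return false;
    }

    private boolean dfs(String p, Set<String> visited, Set<String> onPath) {
        if (onPath.contains(p)) return true;  // cycle: deadlock detected
        if (!visited.add(p)) return false;    // subtree already explored
        onPath.add(p);
        for (String q : edges.getOrDefault(p, List.of())) {
            if (dfs(q, visited, onPath)) return true;
        }
        onPath.remove(p);
        return false;
    }
}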
Deadlock Prevention:
Avoid one of the four necessary conditions for deadlock.
Examples:
o Allow preemption.
o Avoid circular wait by assigning priorities to resources.
Election Algorithms
Election algorithms are used to select a coordinator or leader in a
distributed system.
Requirements:
Fault tolerance.
Minimal message overhead.
Popular Algorithms:
1. Bully Algorithm:
o Assumes processes have unique IDs.
o Steps:
A process detects the failure of the coordinator.
It initiates an election by sending election messages to all
higher-ID processes.
If no higher-ID process responds, the initiator becomes the
new coordinator; otherwise, each responding process takes
over and restarts the election.
The process with the highest ID ultimately becomes the new
coordinator and announces itself to all processes.
2. Ring Algorithm:
o Assumes processes are arranged in a logical ring.
o Steps:
A process starts an election by sending its ID to its
neighbor.
Each process forwards the largest ID it has seen.
The process with the highest ID becomes the new
leader.
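A minimal simulation of the ring algorithm's message flow: starting at the initiator, the largest ID seen so far is forwarded around the ring, so after one full lap the highest ID has survived. A real implementation would pass messages between nodes; here an array stands in for the ring.

public class RingElection {
    // ids[i] is the unique ID of the node at ring position i.
    public static int elect(int[] ids, int initiator) {
        int n = ids.length;
        int candidate = ids[initiator];
        for (int i = (initiator + 1) % n; i != initiator; i = (i + 1) % n) {
            candidate = Math.max(candidate, ids[i]); // forward the larger ID
        }
        return candidate; // the new leader's ID
    }

    public static void main(String[] args) {
        System.out.println(elect(new int[]{3, 7, 1, 5}, 2)); // prints 7
    }
}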
Remote Method Invocation (RMI)
Introduction
Remote Method Invocation (RMI) is a mechanism that allows objects in
one Java Virtual Machine (JVM) to invoke methods on objects in another
JVM. RMI is a key component in Java’s distributed computing capabilities,
enabling distributed systems to interact using objects, as if they were on
the same machine. Through RMI, Java applications can communicate over
a network while maintaining the simplicity of object-oriented
programming.
RMI hides the complexities of low-level network communication and object
serialization, making remote communication appear as if it is local method
invocation. It is a higher-level abstraction that allows for simpler
communication between distributed applications.
RMI is inherently built into Java and uses standard Java objects and
interfaces. The communication between different JVMs is done using
serialization (converting objects to byte streams for transmission) and
TCP/IP (for reliable communication over the network).
Key Concepts in RMI
1. Remote Objects:
o A remote object is an object that resides on a different JVM
and is accessible remotely.
o The object must implement the Remote interface, signaling
that its methods can be invoked remotely.
o Remote objects provide the interface that allows clients to
invoke methods remotely.
2. Stub:
o A stub is a client-side proxy for a remote object.
o The stub is the object that the client interacts with when it
calls a method on a remote object.
o The stub takes care of marshalling (serializing) the method
arguments and sending the method invocation to the actual
remote object located in the server JVM.
3. Skeleton (deprecated in later Java versions):
o The skeleton is an object on the server side that receives
remote method calls from the stub, unmarshals the method
arguments, invokes the actual method on the remote object,
and returns the result to the client.
o In modern versions of Java (since Java 2), skeletons have been
removed, and the server-side dispatching of method calls is
handled dynamically via reflection.
4. RMI Registry:
o The RMI registry is a simple naming service that allows
clients to locate remote objects by name. It acts as a directory
for remote objects that have been registered.
o A remote object is registered with a name, and clients can
later look it up by that name to get a reference to the object.
o The registry runs on port 1099 by default.
5. Transport Layer:
o The transport layer handles the actual transmission of
messages between the client and the server. It uses TCP/IP to
ensure that messages are reliably sent and received between
JVMs.
o The transport layer manages socket connections and ensures
that remote invocations are transmitted securely over the
network.
How RMI Works
1. Client-Server Model:
o In the RMI model, the client is the program that requests
remote method invocation, and the server provides the
remote object whose methods can be invoked.
o The client interacts with the stub, which sends the request
over the network to the skeleton (or the dynamic dispatcher in
modern RMI) on the server.
2. Method Call Flow:
o Client Side:
The client invokes a method on the remote object by
calling the method on the stub.
The stub serializes the arguments, sends the request
over the network, and waits for a response.
o Server Side:
The skeleton (or dynamic dispatcher in modern RMI)
receives the method request, deserializes the
arguments, and invokes the method on the actual
remote object.
The method result or exception is serialized and sent
back to the client through the transport layer.
o Client Side (Response):
The stub receives the response, deserializes it, and
returns it to the client program.
3. Serialization:
o RMI relies heavily on serialization to transfer objects
between different JVMs. When arguments are passed to
remote methods or objects are returned, they are serialized
into a byte stream.
o The serialized byte stream is sent over the network to the
receiving JVM, where the object is reconstructed (deserialized)
into its original form.
4. RMI Registry:
o The RMI registry is essential for managing the location of
remote objects. The server binds its remote object to a name
in the registry using Naming.rebind(), and the client looks up
the object by the name using Naming.lookup().
o The registry can be started manually using the rmiregistry
command.
Components of RMI
1. Remote Interface
The remote interface defines the methods that can be invoked
remotely. It must extend the Remote interface, which is a marker
interface indicating that its methods are callable remotely.
Every method in a remote interface must declare that it throws
RemoteException, which is an exception indicating that there was a
problem with the remote communication.
Example of Remote Interface:
import java.rmi.Remote;
import java.rmi.RemoteException;

public interface MyRemoteService extends Remote {
    String sayHello() throws RemoteException;
}
2. Remote Object Implementation
The remote object implements the remote interface and provides
the actual business logic.
The remote object is made available to clients by exporting it (using
UnicastRemoteObject), which registers the object in the RMI
runtime.
Example of Remote Object Implementation:
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class MyRemoteServiceImpl extends UnicastRemoteObject
        implements MyRemoteService {

    protected MyRemoteServiceImpl() throws RemoteException {
        super(); // exports this object to the RMI runtime
    }

    @Override
    public String sayHello() throws RemoteException {
        return "Hello from remote service!";
    }
}
3. Stub and Skeleton (Deprecated)
Stub: Acts as a proxy for the client. It handles the task of making
the remote call on behalf of the client and returning the response.
Skeleton (deprecated): Was used on the server side to dispatch the
request to the actual remote object and send the result back. In
modern RMI, skeletons have been removed, and this functionality is
handled using dynamic class loading and reflection.
4. Naming Class
The Naming class provides utility methods for binding and looking
up remote objects in the RMI registry.
The Naming.rebind() method binds a remote object to a name in the
registry, while Naming.lookup() retrieves the object reference based
on the name.
Example of Binding and Lookup:
// Binding the remote object to a name (server side)
Naming.rebind("rmi://localhost/MyRemoteService", myRemoteObject);

// Looking up the remote object (client side)
MyRemoteService service =
        (MyRemoteService) Naming.lookup("rmi://localhost/MyRemoteService");
5. Transport Layer
The transport layer handles the communication between the client
and the server. RMI uses TCP/IP sockets to transmit messages.
The transport layer is responsible for ensuring that the method call
and its result are transmitted reliably across the network.
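Tying the components together, here is a minimal end-to-end sketch using the MyRemoteService and MyRemoteServiceImpl classes from the examples above; starting the registry in-process with LocateRegistry.createRegistry() is an alternative to running the rmiregistry command manually.

// Server.java
import java.rmi.Naming;
import java.rmi.registry.LocateRegistry;

public class Server {
    public static void main(String[] args) throws Exception {
        LocateRegistry.createRegistry(1099);                 // in-process registry, default port
        MyRemoteService service = new MyRemoteServiceImpl(); // exported by its constructor
        Naming.rebind("rmi://localhost/MyRemoteService", service);
        System.out.println("MyRemoteService bound and ready");
    }
}

// Client.java
import java.rmi.Naming;

public class Client {
    public static void main(String[] args) throws Exception {
        MyRemoteService service =
                (MyRemoteService) Naming.lookup("rmi://localhost/MyRemoteService");
        System.out.println(service.sayHello()); // executes on the server via the stub
    }
}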
Client Callback in RMI
RMI also supports client callbacks, which allow the server to invoke
methods on the client. This creates a two-way communication between
the client and server. A common use case is in event-driven systems,
where the server notifies the client of specific events (e.g., progress
updates, alerts).
How Client Callback Works:
1. The client implements a remote interface (callback interface) and
registers the callback with the server.
2. The client can export the callback object, allowing the server to call
methods on the client’s object.
3. The server invokes methods on the client object as needed.
Example:
Client Callback Interface:
import java.rmi.Remote;
import java.rmi.RemoteException;

public interface ClientCallback extends Remote {
    void notify(String message) throws RemoteException;
}
Server Method Using Callback:
public void registerClient(ClientCallback client) throws RemoteException {
    clients.add(client); // store the client's callback reference
}

public void notifyClients() throws RemoteException {
    for (ClientCallback client : clients) {
        client.notify("Event occurred!");
    }
}
Stub Downloading in RMI
RMI supports dynamic stub downloading to allow clients to download
the stub classes at runtime. This eliminates the need to precompile and
distribute stub classes with the client code.
How Stub Downloading Works:
1. The server specifies the location (URL) of the stub classes using the
java.rmi.server.codebase system property.
2. The client, upon looking up a remote object, downloads the stub
dynamically from the specified location.
3. The stub class is loaded and used by the client to invoke remote
methods.
Example Command:
java -Djava.rmi.server.codebase=http://example.com/stubs/ Server
Common Object Request Broker Architecture (CORBA)
Introduction: CORBA (Common Object Request Broker Architecture) is a
standard defined by the Object Management Group (OMG) for
enabling communication between distributed objects in a heterogeneous
network. It provides a platform-independent, language-neutral framework
for building distributed systems. CORBA allows objects implemented in
different programming languages, on different operating systems, and on
different hardware platforms to communicate with each other.
CORBA provides a set of specifications and services to facilitate
distributed object communication. Its core component is the Object
Request Broker (ORB), which handles the communication between
client and server objects in the distributed system.
Key Components of CORBA:
1. Object Request Broker (ORB):
o The ORB is the central component in CORBA, responsible for
enabling communication between clients and objects (both
remote and local).
o It acts as an intermediary that handles the low-level details of
passing requests from a client to a server, managing object
references, and returning results.
o The ORB abstracts the complexity of network communication,
enabling distributed objects to communicate as if they were
local objects.
2. Interface Definition Language (IDL):
o CORBA uses IDL to define the interfaces of objects that will be
accessible remotely. IDL is language-independent, allowing
clients and servers to communicate regardless of their
programming language.
o The IDL defines the types of operations that objects can
perform and the input/output arguments of these operations.
o IDL is compiled by an IDL compiler to generate client and
server-side code (stub and skeleton) in the chosen
programming language.
Detailed Explanation of CORBA Components:
1. Interface in CORBA
An interface in CORBA is a definition of a set of operations that an object
can perform. The interface defines the method signatures that are
accessible remotely. These interfaces are defined using IDL (Interface
Definition Language), which is an abstract, language-independent way
to describe the operations of objects.
In CORBA:
The client calls methods on a remote object using a stub.
The server provides implementations of those methods defined in
the interface.
Example of an Interface Definition (IDL):
interface MyObject {
    string sayHello(in string name);
};
In this example, the MyObject interface defines a method sayHello, which
takes a string argument and returns a string.
After defining the interface in IDL, an IDL compiler generates the
corresponding stub and skeleton code.
2. Inter-ORB Protocol (IIOP)
The Inter-ORB Protocol (IIOP) is the communication protocol used by
ORBs to exchange messages between clients and servers in a CORBA-
based distributed system. It is an Internet-based protocol that enables
ORBs running on different machines to communicate, making CORBA
platform-independent and network-independent.
Key features of IIOP:
IIOP is built on TCP/IP and defines how CORBA messages are
formatted and transmitted between ORBs.
It allows objects from different ORBs (and even different
programming languages) to communicate seamlessly.
IIOP can be used over both local and wide-area networks (LAN and
WAN).
IIOP is the core protocol of CORBA for enabling communication between
distributed objects across different ORBs.
3. Object Server and Object Client in CORBA
Object Server:
o An object server is a process that provides the actual
implementation of CORBA objects. It is the server-side entity
that implements the business logic and serves requests from
clients.
o The object server is responsible for receiving method
invocations from clients, processing them, and sending back
the results. It exposes its interfaces to clients through the
ORB.
o The server registers its objects with the ORB, which provides
clients with the necessary references to access remote
objects.
Key Responsibilities of an Object Server:
o Implements the operations defined in the IDL.
o Registers the object with the ORB, making it available for
remote invocations.
o Processes incoming requests from clients and returns
responses.
Object Client:
o An object client is a process or application that requests
services from a CORBA object.
o The client does not need to know the actual location of the
object; it only interacts with the stub, which acts as a local
representative of the remote object.
o The client sends requests to the ORB, which forwards the
request to the correct object server. The client only needs to
know the interface of the object, not its implementation or
location.
Key Responsibilities of an Object Client:
o Calls methods on remote objects via stubs.
o Sends requests to the ORB and processes the results returned
by the server.
o Interacts with the stub to perform remote method invocations.
4. Naming Service in CORBA
The Naming Service in CORBA is a directory service that helps clients
locate remote objects by name. It acts as a repository where objects can
register themselves, making them discoverable to clients.
Object Registration:
o The object server registers its objects with the Naming
Service. When an object is registered, it provides a unique
name (a logical identifier) that clients can use to access it.
Object Lookup:
o Clients use the Naming Service to search for objects by
name. Once the client looks up the object by name, the
object reference is returned, allowing the client to invoke
methods on the remote object.
In CORBA, the Naming Service is implemented as a part of the Object
Services, and it provides a mechanism for finding objects by their names.
Example:
// Client code to look up an object by name
NamingContextExt namingContext = NamingContextExtHelper.narrow(
        orb.resolve_initial_references("NameService"));
Object obj = namingContext.resolve_str("MyObjectName");
5. Object Service in CORBA
CORBA defines several object services that provide additional
functionality to objects in a distributed system. These services are
implemented in the ORB and help manage various aspects of object
interaction, such as:
Transaction Service:
o Manages distributed transactions across multiple objects,
ensuring that operations are committed or rolled back in a
consistent and coordinated manner.
Security Service:
o Manages security aspects, such as authentication,
authorization, and encryption, ensuring that communications
and object invocations are secure.
Event Service:
o Supports the publish-subscribe model for distributed event
handling. It allows clients to register for events and receive
notifications when certain conditions are met.
Persistence Service:
o Manages the storage and retrieval of object states, allowing
objects to persist across system restarts.
Lifecycle Service:
o Provides mechanisms for managing the lifecycle of CORBA
objects, including creation, activation, and destruction.
Summary of CORBA Components
Object Request Broker (ORB): Central component for communication
between clients and objects. Manages message passing, object reference
handling, and method invocation.
Interface: Defined using IDL; represents the set of operations a CORBA
object can perform. It is language-independent.
Inter-ORB Protocol (IIOP): Protocol used for communication between
different ORBs over TCP/IP. Ensures cross-platform and cross-language
interoperability.
Object Server: Provides the implementation of remote objects. Registers
remote objects with the ORB for client access.
Object Client: Interacts with remote objects via stubs, sending requests
to the ORB.
Naming Service: Directory service that maps object names to object
references, allowing clients to locate remote objects.
Object Service: Provides services such as transaction management,
security, event handling, persistence, and lifecycle management.
Processes and Processors in Distributed Systems
In distributed systems, processes and processors are key elements in
ensuring the efficient execution of tasks across multiple machines.
Distributed systems leverage the parallelism and independence of
processes across different nodes to improve performance, reliability, and
fault tolerance. The way processes and processors interact in distributed
systems impacts the system's design and performance, especially in areas
like scheduling, fault tolerance, and real-time processing.
1. Threads in Distributed Systems
A thread is the smallest unit of execution within a process. In a
distributed system, threads allow processes to perform multiple
operations concurrently, which is particularly useful in handling large-
scale, parallel, or real-time applications. Distributed systems often use
multithreading to take advantage of multiple processors and improve
system responsiveness.
Multithreading in Distributed Systems:
o Each node or processor in a distributed system might execute
multiple threads, each performing a different task. Threads
within a process share the same memory space, making
communication between them faster but also requiring
synchronization.
o Multithreading allows for more efficient utilization of system
resources, such as CPU and memory, especially in systems
with multiple processors.
Thread Management:
o Thread pools are often used to limit the number of threads
created and to manage their lifecycle efficiently, reducing the
overhead of repeatedly creating and destroying threads (see the
sketch below).
o Synchronization mechanisms like mutexes and semaphores
are employed to prevent conflicts when multiple threads
access shared resources.
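A minimal sketch of the thread-pool idea using Java's standard ExecutorService: a fixed pool of four workers serves ten tasks, so threads are reused rather than created per task.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPoolDemo {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // four worker threads
        for (int i = 0; i < 10; i++) {
            final int taskId = i;
            pool.submit(() ->
                    System.out.println("task " + taskId + " on "
                            + Thread.currentThread().getName()));
        }
        pool.shutdown(); // accept no new tasks; workers drain the queue
    }
}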
2. System Model in Distributed Systems
A system model in a distributed environment defines how different
components (processes, resources, and data) interact across the system.
It provides the architecture and mechanisms for communication,
coordination, and management of the distributed resources.
Centralized Model:
o In a centralized system, a central node or server controls all
the operations and coordinates communication between
different processes.
Decentralized Model:
o In this model, there is no central coordinator. Instead, each
node has its own decision-making power and responsibility for
managing resources and processes.
Hybrid Model:
o A combination of centralized and decentralized models, where
some components of the system are centralized while others
are decentralized.
The choice of the system model depends on the design requirements,
including performance, fault tolerance, and scalability.
3. Processor Allocation and Scheduling in Distributed Systems
In a distributed system, processors are allocated and scheduled
efficiently to ensure that tasks are executed in a timely manner. Processor
allocation and scheduling aim to maximize resource utilization and
minimize response time while balancing the load across different
processors.
Processor Allocation:
The allocation of processes to processors in a distributed system
should ensure that resources are used effectively and that
processing power is distributed evenly across the system.
Dynamic Allocation: Resources are assigned dynamically based on
availability and workload, ensuring that processors are not idle when
tasks are available.
Static Allocation: The allocation is predetermined based on a set
configuration and is not subject to change during runtime.
Scheduling in Distributed Systems:
Global vs Local Scheduling:
o Global scheduling involves deciding the allocation of tasks
across all processors from a central point of control.
o Local scheduling allows each node to make independent
decisions about how to schedule tasks on its own processor.
This can be more flexible but might lead to load imbalances.
Scheduling Algorithms:
o Round-robin scheduling and First-Come-First-Serve
(FCFS) are common in simple systems. More complex
algorithms such as priority-based scheduling or fair-share
scheduling are often used in real-time or performance-
sensitive environments.
4. Load Balancing and Sharing Approach
Load balancing is the process of distributing workloads evenly across
processors in a distributed system. Effective load balancing ensures that
no processor is overburdened while others remain idle, thus improving
system efficiency and performance.
Load Balancing Techniques:
Static Load Balancing:
o In this approach, the load distribution is determined
beforehand and does not change during runtime. This is
suitable for predictable workloads but is inflexible when faced
with unpredictable changes.
Dynamic Load Balancing:
o In dynamic load balancing, tasks are allocated based on the
current load of each processor, adapting to changes in system
conditions.
o Work-stealing is a common approach where a processor with
low load can "steal" tasks from a processor that is heavily
loaded.
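Work-stealing is available off the shelf in Java's ForkJoinPool, where idle worker threads steal queued subtasks from busy ones. The sketch below sums an array by recursively splitting it into subtasks; the threshold of 1,000 elements is an arbitrary choice for the example.

import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class WorkStealingSum extends RecursiveTask<Long> {
    private final long[] data;
    private final int from, to;

    WorkStealingSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= 1_000) { // small enough: sum directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) / 2;
        WorkStealingSum left = new WorkStealingSum(data, from, mid);
        left.fork(); // queue the left half; an idle worker may steal it
        return new WorkStealingSum(data, mid, to).compute() + left.join();
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        Arrays.fill(data, 1);
        long total = ForkJoinPool.commonPool()
                .invoke(new WorkStealingSum(data, 0, data.length));
        System.out.println(total); // 1000000
    }
}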
Load Sharing Approach:
Load sharing refers to distributing tasks across different nodes
without strictly balancing the load. Here, the system can share the
load of one node with another node to ensure tasks are completed.
This is generally easier to implement but less efficient compared to
load balancing, as it does not take into account the exact capacity of
the processors.
5. Fault Tolerance in Distributed Systems
Fault tolerance is the ability of a distributed system to continue
functioning even in the presence of hardware or software failures. Since
distributed systems often involve numerous components across different
machines, faults are inevitable. Fault tolerance mechanisms ensure that
the system can detect failures and recover from them to maintain service
continuity.
Types of Faults:
Crash faults: Where a process or machine stops working.
Omission faults: When a process fails to send or receive a
message.
Timing faults: Occur when processes are out of sync or take too
long to complete a task.
Byzantine faults: When components of the system fail in arbitrary
ways, possibly providing inconsistent or erroneous outputs.
Fault Tolerance Mechanisms:
Replication: Multiple copies of data or services are maintained
across different nodes. If one replica fails, others can take over.
Checkpointing: Periodically saving the state of a process or
system. In case of a failure, the system can restart from the last
checkpoint rather than starting from scratch.
Recovery: Techniques such as rollback recovery (reversing to a
known safe state) and forward recovery (moving forward to a
valid state) are used to recover from faults.
6. Real-Time Distributed Systems
Real-time distributed systems are designed to meet specific timing
requirements, such as deadlines, in the execution of tasks. These systems
often operate in environments where timing is critical, such as in industrial
automation, healthcare systems, and aerospace applications.
Characteristics:
Timing Constraints: Tasks in real-time systems have strict
deadlines for completion. Missing a deadline can result in
catastrophic failure or a significant reduction in system quality.
Hard Real-Time vs Soft Real-Time:
o Hard real-time systems require that tasks be completed
within strict deadlines; failure to meet these deadlines is
considered a failure of the system.
o Soft real-time systems allow some flexibility, where
meeting deadlines is important but not critical for system
functioning.
Scheduling in Real-Time Systems:
Earliest Deadline First (EDF) and Rate Monotonic Scheduling
(RMS) are popular scheduling algorithms for real-time systems:
EDF prioritizes tasks by nearest deadline, while RMS prioritizes
tasks by execution frequency (EDF is sketched after this list).
Preemption and synchronization techniques are crucial to ensure
that real-time tasks meet their deadlines even in a distributed
environment.
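The EDF sketch referenced above: ready tasks sit in a priority queue ordered by absolute deadline, and the scheduler always dispatches the task whose deadline is nearest. The task names and millisecond deadlines are illustrative.

import java.util.Comparator;
import java.util.PriorityQueue;

public class EdfScheduler {
    record Task(String name, long deadlineMillis) {}

    private final PriorityQueue<Task> ready =
            new PriorityQueue<>(Comparator.comparingLong(Task::deadlineMillis));

    public void release(Task t) { ready.add(t); }   // a task becomes ready

    public Task next() { return ready.poll(); }     // earliest deadline first

    public static void main(String[] args) {
        EdfScheduler s = new EdfScheduler();
        s.release(new Task("log-writer", 500));
        s.release(new Task("control-loop", 100));
        System.out.println(s.next().name()); // control-loop (deadline 100)
    }
}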
7. Process Migration and Related Issues
Process migration refers to the transfer of a process or task from one
node to another in a distributed system. It is used to improve resource
utilization, balance the load, or avoid processor overload.
Types of Process Migration:
Coarse-grained migration: Moving a whole process or thread
between nodes.
Fine-grained migration: Moving parts of a process (e.g., data or
threads) while leaving the rest behind.
Issues in Process Migration:
State Preservation: Ensuring that the state of the process is
maintained during migration. This may involve saving the process's
execution state, data, and stack.
Data Consistency: Migrating a process can cause issues with data
consistency, especially in systems where multiple processes are
accessing shared data.
Security and Authentication: Migration may introduce security
challenges, such as unauthorized access to data or processes on
different nodes.
Performance Overhead: The process of migrating can itself
introduce overhead, which needs to be minimized for efficient
system performance.