DISTRIBUTED
SYSTEMS
CHAPTER TWO
ARCHITECTURES
Leweyehu Y. 1
2.1 ARCHITECTURES
• Distributed systems are complex.
• In order to manage their intrinsic complexity, distributed
systems should be organized properly.
• Organization is mostly expressed in terms of its software
components.
• Different ways to look at organization of distributed
systems –two obvious ones:
• Software architecture – logical organization (of software
components and interconnections)
• System architecture – physical realization (the instantiation of
software components on real machines)
Leweyehu Y. 2
Distributed systems can be organized in many
different ways. We can make a distinction between
software architecture and system architecture.
The system architecture considers where the
components that constitute a distributed system are placed
across the various machines.
The software architecture is more concerned about the
logical organization of the software: how do components
interact, it what ways can they be structured, how they can
be made independent
Leweyehu Y. 3
Architectural style
A key idea when talking about architectures is architectural
style
A style reflects the basic principle that is followed in
organizing the interaction between the software
components comprising a distributed system.
• A architectural style is formulated in terms of
• Components,
• The way that components are connected to each other,
• The data exchanged between components, and finally
• How these elements are jointly configured into a system.
Leweyehu Y. 4
• A component is a modular unit with well-defined interfaces
that is replaceable within its environment.
• A connector is a mechanism that mediates communication,
coordination, or cooperation among components.
• It allows for the flow of control between components
• E.g., facilities for remote procedure call, message passing, or streaming
data.
Leweyehu Y. 5
2.1. Types of Architectural Styles
Common architectural styles of distributed systems
Layered architectures
Object-based architectures
Resource-centered architectures
Event-based architectures
Layered architectural style
components are organized in a layered fashion where a
component at layer L; is allowed to call components at the
underlying layer Li; but not the other way around.
Leweyehu Y. 6
A key observation is that control generally flows from layer to layer: requests go
down the hierarchy whereas the results flow upward
• It is hierarchical organization
• Components are organized in a layered fashion
• Component at layer Lj can make a down-call to a component at a
lower-level layer Li (with i < j) and generally expects a response.
• Only in exception, an up-call is made to higher level component
• Each layer exposes an interface to be used by above layers
• “Multi-level client-server”
• Each layer acts as a
• Server: service provider to layers “above”
• Client :service consumer of layer(s) “below”
• Communication protocol-stacks are a typical examples
• OSI Reference model
• TCP/IP
Leweyehu Y. 7
Figure 2-1. The (a) layered
Leweyehu Y.
architectural style 8
• Essentially, layered architectural style contains three logical levels
commonly know as application layers
• The application(user)-interface level
• The processing level
• The data level
Leweyehu Y. 9
Leweyehu Y. 10
Leweyehu Y. 11
Object-Based Architectures
• Components are objects
• Objects are easy to be replaced so long as the interface is not touched
• It is less structured and hence a relatively loose
organization
• The calling object might not run on the same machine
as the called object
Each object corresponds to what we have defined as a
component, and these components are connected
through a (remote) procedure call mechanism. The
layered and object based architectures still form the
most important styles for large software systems
• Connectors are RPC and RMI
Leweyehu Y. 12
Leweyehu Y. 13
Object-based architectures are attractive because they provide
a natural way of encapsulating data (called an object’s state)
and the operations that can be performed on that data (which
are referred to as an object’s methods) into a single entity.
Leweyehu Y. 14
Data-Centered Architecture
In data-centered architecture, the data is centralized and
accessed frequently by other components, which modify
data.
The main purpose of this style is to achieve integrality of
data.
Data-centered architecture consists of different components
that communicate through shared data repositories.
The components access a shared data structure and are
relatively independent, in that, they interact only through the
data store.
Leweyehu Y. 15
The most well-known examples of the data-centered
architecture is a database architecture, in which the
common database schema is created with data definition
protocol – for example, a set of related tables with fields
and data types in an RDBMS
Another example of data-centered architectures is the web
architecture which has a common data schema (i.e. meta-
structure of the Web) and follows hypermedia data model
and processes communicate through the use of shared
web-based data service
Leweyehu Y. 16
Leweyehu Y. 17
• Components
• Resources
• Components, that interact with the resource
• Connectors
• Queries
Event–based Architecture
Event-based architectures, processes essentially communicate
through the propagation of events, which optionally also carry
data.
Leweyehu Y. 18
• Event based architecture supports publish-
subscribe communication
• Publisher: components that announce data to be shared
• Subscriber: components register their interest for published
data.
• Decouples sender and receiver (asynchronous
communication)
• Both parties don’t need to be up at the time of communication
• Event can be considered as “a significant
change in state”
Leweyehu Y. 19
Leweyehu Y. 20
Event-based architectures can be combined with data-
centered architectures, yielding what is also known as
shared data spaces.
• Components:
• Can be an instance of a class or simply a module.
• Connectors:
• Event buses
Leweyehu Y. 21
2.2. System Architectures
System architecture encompasses decisions as to where to place
specific software components.
System-level architecture focuses on the entire system and the
placement of components of a distributed system across
multiple machines.
The software components, their interactions, and their placement
leads to an instance of a software architecture, also called a
system architecture.
• System architecture are of three types:
• Centralized - most components located on a single machine
• Decentralized - most machines have approximately the same
functionality
• Hybrid - some combination
Leweyehu Y. 22
Centralized Architecture
• In the basic client-server model, processes in a
distributed system are divided into two (possibly
overlapping) groups.
• Server:- is a process implementing a specific service E.g File server
• Client:-is a process that requests a service
• Clients and servers can be on different machines
• Clients follow request/reply model with respect to
using services
Leweyehu Y. 23
Figure General interaction between a client and a server.
Leweyehu Y. 24
Cont’d..
Communication between a client and a server can be implemented by :
Connectionless protocol when the underlying network is fairly reliable
like local-area networks (UDP)
Connection-oriented protocol in WANs, (TCP)
Connectionless communication is efficient
Simply packages a message for the server, identifying the service it
wants, along with the necessary input data
But, it is hard for a sender to detect if the message is successfully received
Failure of any sort means no reply
Possibilities:
Request message was lost
Reply message was lost
Server
Leweyehu Y. failed either before, during or after performing the service 25
Cont’d..
Common approach to lost request in connectionless communication:
Re-transmission (resending request )
Good for idempotent operations, i.e., operations that could be
repeated more than once without harm. E.g., “Return current value
of X”
Not good for non idempotent operations like “ increase value of x
by 100”
Because, may result in performing the operation twice
In this case reporting an error is appropriate, than resending
For these reason many distributed systems use connection-oriented protocols
Not good enough in LAN as it is slow
However, it fits the unreliable WAN environment
Example, Virtually all internet applications are based on TCP/IP
connections
Leweyehu Y. 26
Logical Architecture vs. Physical Architecture
Layer and tier are roughly equivalent terms, but
Layer typically implies software and
Tier is more likely to refer to hardware.
Logical organization is not physical organization.
Physical architecture may or may not match the logical architecture.
Meaning, logically separate components might reside on single machine
or on different machines
Clients and servers could be placed on the same node, or be distributed
according to several different topologies.
Single-Tier Architecture: dumb terminal/mainframe configuration
Two-Tier Architecture: client/single server configuration
Three-Tier Architecture: each layer on separate machine
Two-tier
Leweyehu Y.
and three-tier are the most common 27
Two-Tiered Architecture
• Where are the three application-layers placed?
• On the client machines, or on the server machines?
• A range of possible solutions:
• Thin-Client- A client machine only implements (part of) the user-
interface level
• A server machine implementing the rest, i.e, the processing and data levels
• Pros: easier to manage, more reliable, client machines don’t need to be so
large and powerful
• Con: perceived performance loss at client
• Fat-Client - All user interface, application processing and some data
resides at the client
• Pros: reduces work load at server;
• More scalable
• Cons: harder to manage by system admin,
• Less secure
• Leweyehu
Other Y.
solutions in between thin-client and fat-client (hybrid) 28
Two-tiered Architectures
Figure 2-5. Alternative client-server organizations (a)-(e).
Leweyehu Y. 29
Cont’d..
An example of the organization of Fig. 2-5(c), is that of a
word processor in which the basic editing functions
execute on the client side where they operate on locally
cached, or in-memory data. but where the advanced
support tools such as checking the spelling and grammar
execute on the server side.
In many client-server environments, the organizations
shown in Fig. 2-5(d) and Fig. 2-5(e) are particularly popular.
Leweyehu Y. 30
These organizations are used where the client machine is a
PC or workstation, connected through a network to a
distributed file system or database.
Three-tiered
• The server tier in two-tiered architecture becomes more and
more distributed
• A single server is no longer adequate for modern
information systems
• This leads to three-tiered architecture
• Server may acting as a client
• Three-tiered: each of the three layers corresponds to three
separate machines.
Leweyehu Y. 31
Figure 2-6. An example of a server acting as client
Leweyehu Y. 32
Decentralized Architectures
• Placing logically different components on different
machines is called vertical distribution(VD)
• User-interface, Processing components and a data level are on
different machine
• It is similar with the concept of vertical fragmentation in
distributed database where
• Tables are split into column wise and
distributed on different machines
• The advantage of VD is that each machine can be tailored for
specific type of function
Leweyehu Y. 33
Cont’d…..
• An alternative to VD is horizontal distribution(HD)
• A client or server may be physically split up into logically
equivalent parts.
• Each part operates on its own share of the complete data set,
• This results in balanced work load
• Again this one is similar with that of horizontal
fragmentation in distributed database where
• Tables are split row wise, and subset of rows distributed onto different
machines
• Peer-to-peer systems are a class of modern architectures that
support horizontal distribution.
• The functions that need to be carried out are represented by
every process that constitute the distributed system
Leweyehu Y. 34
Peer-to-peer systems
• P2P systems partitions tasks or work loads between peers
• Often, the processes that constitute the system are all equal
• Nodes act as both client and server;
• Much of the interaction is symmetric.
• Advantages of peer-to-peer system
• Have better load balancing
• More resistant to denial-of-service attacks,
• But, harder to manage than client-server systems.
Leweyehu Y. 35
Overlay network
• Nodes of the P2P distributed system are connected using overlay network
• It is network that is built on top of another network
• Nodes are formed by the processes of the network.
• Overlay networks in the P2P system:
• Define the structure between nodes in the system.
• Allow nodes to route requests to locations that may not be known at time of request.
• The main question for peer-to-peer system is
• How to organize the processes in an overlay network
• Their organization can be:
• Unstructured P2P, Structured P2P, Hybrid P2P
Leweyehu Y. 36
Unstructured P2P architecture
Largely relying on randomized algorithm to construct the overlay
network.
In unstructured P2P networks, nodes are connected in a random or
ad-hoc manner, without a predefined structure.
Each node has a list of neighbours, which is more or less constructed
in a random way.
One challenge is how to efficiently locate a needed data item
• The two common approaches are:
Flooding
Random walk
Leweyehu Y. 37
Cont’d…
Flooding:
Issuing node u passes request for data d to all neighbors.
Request is ignored when receiving node had seen it before.
Otherwise, v searches locally for d (recursively).
Return d if found, Otherwise forward the request to the
neighbors
However, this approach causes high signalling traffic over
the network
• May be limited by a Time-To-Live: a maximum number of hops.
Leweyehu Y. 38
Cont’d…..
Flooding is a straightforward approach to data retrieval in unstructured
P2P networks.
When a node needs to locate a data item, it sends a request message to
all of its neighbors.
Process
1. Initiation: The node that requires the data item broadcasts a request to
its immediate neighbors.
2. Propagation: Each neighbor, upon receiving the request, checks if it has
the desired data. If not, it forwards the request to its own neighbors.
3. Termination: This process continues until either the data item is found
or aLeweyehu
predetermined
Y.
number of hops (or a time limit) is reached. 39
Cont’d..
Random walk:
Random walk is a more efficient approach for data
retrieval in unstructured P2P networks.
Instead of broadcasting the request to all neighbors, the
search process involves a series of random steps.
Issuing node u passes request for d to randomly chosen
neighbor, v.
If v does not have d, it forwards request to one of its
randomly chosen neighbors, and so on.
Leweyehu Y. 40
Random walk Cont’d….
Process
1. Initiation: The node that needs the data sends a request to one
of its neighbors at random.
2. Step-by-Step: The receiving neighbor checks if it has the
data item. If it does, the search is successful. If not, it
randomly selects one of its neighbors to forward the request.
3. Iteration: This process continues for a predefined number of
steps or until the data is found.
Leweyehu Y. 41
Structured P2P
Nodes are organized following a specific distributed data structure.
In structured P2P networks, nodes are organized according to a
specific topology or structure, often using hash tables or other
indexing methods.
Structured peer-to-peer (P2P) networks are designed with a specific
architecture that organizes nodes and data in a way that facilitates
efficient resource discovery and management.
Unlike unstructured networks, where connections between nodes are
random, structured networks use algorithms and data structures to
establish predictable relationships among nodes.
Leweyehu Y. 42
Structured P2P Cont’d…
The most common one is distributed hash table (DHT)
DHT a common method for organizing and locating data.
Each node in a DHT is responsible for a portion of the data, mapped using a
hash function.
In such systems, each data item is uniquely associated with a key, in turn used as
an index.
Each node is responsible to store data that are associated with subset of
these keys.
P2P system now responsible for storing (key, value) pairs
Looking up data d with key k means routing request to node with identifier k.
Example:
• Chord: A DHT that organizes nodes in a circular manner, allowing for
efficient lookups with a logarithmic number of hops.
Leweyehu Y. 43
Cont’d...
In a structured peer-to-peer architecture, the overlay network
is constructed using a deterministic procedure.
By far the most-used procedure is to organize the
processes through a distributed hash table (DHT).
In a DHT -based system, data items are assigned a random
key from a large identifier space, such
as a 128-bit or 160-bit identifier.
Likewise, nodes in the system are also assigned a random
number from the same identifier space.
Leweyehu Y. 44
Cont’d...
The crux of every DHT-based system is then to implement
an efficient and deterministic scheme that uniquely maps
the key of a data item to the identifier of a node based on
some distance metric.
when looking up a data item, the network address of
the node responsible for that data item is returned.
Effectively, this is accomplished by routing a request for a
data item to the responsible node.
Leweyehu Y. 45
Cont’d..
Leweyehu Y. 46
Leweyehu Y. 47
Leweyehu Y. 48
Hybrid Architectures
• Many distributed systems require properties from both
client-server and peer-to-peer architectures.
• So, they put together features from both centralized and
decentralized architectures, resulting in hybrid architectures.
• Some nodes are appointed special functions in a well
organized fashion
• Examples
• Edge-server systems: placed at the edge of enterprise
network
• E.g., ISPs, which act as servers to their clients, but cooperate with
other edge servers to host shared content
Leweyehu Y. 49
Leweyehu Y. 50
END
Leweyehu Y. 51