CoSc 4038:
Introduction To Distributed
Systems
Mulugeta M.
Today’s agenda
1. Course information
2. Chapter 1: Introduction to distributed systems
2
Distributed
Course Information Systems
Course Tile: Introduction to Distributed Systems
Course Code: CoSc 4038
Credits 4 Crhrs / 6 ECTS
Contact Hours 3hrs Lecture + 1 hr Tutorial + 3hrs Lab
Pre-requisites – Data communication and computer networks
– Operating Systems
Distributed
Course Staff Systems
Instructor's Name: (Lab) Abel F.
Email: -
Office -
Office Hours:
Distributed
Overall goal of the course Systems
Acquainting students on the principles, techniques, and
practices relevant to the design and implementation of
Distributed systems.
Distributed
Learning Outcomes: Systems
After completing this course students will be able to:
Explain what a distributed system is, why they would design a system as
a distributed system, and what the desired properties of such systems
have;
Describe the challenges associated with distributed systems, and evalu
ate the effectiveness and shortcomings of their solutions;
List the principles underlying the functioning of contemporary distribut
ed systems
Build small to medium scale DS
Course Topics
• Ch1 - Introduction
• Ch2 - Architectures
• Ch3 - Processes
• Ch4 - Communication
• Ch5 - Naming
• Ch6 – Synchronization ??
Distributed
Resources Systems
Required Text book:
Distributed Systems, Principles and Paradigms, Maarten van Steen and
S. Tanenbaum, 3rd edition, Prentice Hall, 2017
Reference books:
Distributed Systems: Concepts and Design, 5th Edition, George Coulouri
s, Jean Dollimore, Tim Kindberg, and Gordon Blair, Addison-Wesley Pub
lishers, 2012
Distributed Systems, 2nd edition, S. Mullender, Addison-Wesley, 1993
Building Secure and Reliable Network Applications, K. Birman, , Mannin
g Publications Co., 1996
Distributed
Teaching- Learning Methods: Systems
The teaching- will be student-centered.
There will be:
Lecture,
Lab work,
Reading assignments and Group Discussions
Projects – You are required to work on a design/implemen
tation of a small to medium level projects.
Distributed
Evaluation (Tentative) Systems
Test #1 20%
Lab + Project 40%
Final examination 40%
Grading:
As per the university’s scale
Distributed
Rules Systems
Attending classes and labs is mandatory.
Coping other person’s work is not allowed
However, discussion is encouraged
Assignments missed their deadlines will not be accepted and
graded.
Active class participation is required.
You can ask question in class or in office with office hours
Engaging yourself in the reading assignments is mandatory
Reading assignments form part of your exams
Distributed
Disclaimer Systems
Everything is subject to change
13
Introduction to distributed systems
Chapter one
Distributed
Chapter objectives Systems
Learners should be able to understand:
What distributed system is
The role of middleware in distributed systems
Design goals of distributed systems
False assumptions of beginner DS developer
Types of distributed system
16
Distributed
What is a distributed system ? Systems
A distributed system is:
A collection of autonomous computing elements that
appears to its users as a single coherent system.
Two important aspects of this definition
independent computing elements
Users or applications perceive a single system
17
Distributed
Collection of autonomous nodes Systems
Autonomous computing elements also called nodes
Can be of either HW or SW
Behave independently of each other
Each node has its own notion of time
There is no global clock.
That leads to fundamental synchronization and coordination pro
blems.
Yet, these nodes collaborate to achieve a common goal
Realized by exchanging messages with each other
18
Distributed
Coherent system Systems
A single coherent system – users believe they are dealing with a single
system
The difference between components as well as the communication between
them are hidden from users
Users interact with the system in a uniform and consistent way
Regardless of where and when interaction takes place
Examples
An end user cannot tell where a computation is taking place
Where data is exactly stored should be irrelevant to an application
If or not data has been replicated is completely hidden
Note:- Autonomous elements have to collaborate to achieve this goal
19
Distributed
Middleware Systems
The middleware layer extends over multiple machines, and offers each application the same interface.
Goal is to hide the heterogeneity of the underlying OS and HWs
What does it contain?
Commonly used components and functions that need not be implemented by applications
separately.
Figure 1-1. A distributed system organized as middleware.
20
Distributed
Middleware Systems
Middleware is software that usually referred as the OS of distributed systems
Rational:
It manages resources for its application
It offers services that can also be found in most operating systems, including:
Facilities for inter-application communication.
Security services.
Accounting services.
Masking of and recovery from failures.
Example middleware service.
Remote Procedure Call (RPC) - allows an application to invoke a function that is im
plemented and executed on a remote computer as if it was locally available.
21
Distributed
Design Goals of distributed systems Systems
Should you build distributed systems just because you can?
Should you build distributed systems for problems that can be solved by a single machine?
NO
There are 4 goals that should be met to make building a distributed system worth the effort.
Supporting resource sharing - Easy to access and share remote resource
Making distribution transparent - Hide the fact that resources are physically distributed
across the network
Openness - The system should offer services according to standard rules that describe their
syntax and semantics
Extensible: easy to add / replace components
Scalability - Size scalable, geographically scalable, administratively scalable
22
Distributed
Resource sharing/accessibility Systems
Resources can be virtually anything,
Examples: storage facilities, data, files, services, and networks
Benefits:
Economic
It is cheaper to have a single high-end reliable storage facility be shared than having to
buy and maintain storage for each user separately.
Encourage collaboration and exchange of information
allowed geographically dispersed people work together by means of groupware software such as
Collaborative editing, teleconferencing, and so on
BitTorrent – allows users to share files across the Internet
Challenges of resource sharing is Security,
E.g. email spam, DDOS attacks
23
Distributed
Distribution Transparency Systems
DS hide the fact that its processes and resources are physically
distributed across multiple computers
DS that is able to present itself to users and applications as if it
were only a single computer system is said to be transparent
that is, invisible, to end users and applications.
The concept of transparency can be applied to several aspects
of a distributed system,
the most important ones are shown on the next slide
24
Distributed
Types of Transparency in a Distributed System Systems
Transparency Description
Access Hide differences in data representation and how an object is accessed
Location Hide where an object is located
Relocation Hide that an object may be moved to another location while in use
Migration Hide that an object may move to another location
Replication Hide that an object is replicated
Concurrency Hide that an object may be shared by several independent users simultaneously
Failure Hide the failure and recovery of an object
Figure 1-2. Different forms of transparency in a distributed system (ISO, 1995) .
The term object refers to either process or resource
25
Distributed
Openness Systems
An open distributed system is essentially a system that offers
components that can easily be used by, or integrated into other systems.
It often consists of component that originate from somewhere as well
How?
Components adhere to standard rules that describe the syntax and semantics
of what services those components have to offer
Define services through Interfaces using interface definition language(IDL)
It captures only the syntax of services like function name, parameter, return
type etc
Semantics (what those services do) specified in natural language
27
Distributed
Openness Systems
By openness we want to achieve
Interoperability
Implementations from different manufacturers can work together by merely relying
on the standard rules
Portability
Applications from one distributed system can be executed on another distributed
system that implements the same interface
Extensibility
Easy to add or replaces components in the system
Flexibility
Easy to modify/customize the system/component to a specific need
Flexibility is achieved by separating policy from mechanism
28
Distributed
Scalability Systems
Scalability is the ability of a system, network, or process to handle a growing amount of
work in a capable manner(with no significant loss of performance)
Measured in three dimensions
Size scalable
Can easily add more users or resources to the system
Geographically scalable
Can easily handle users and resources that may lie far apart
Administratively scalable
Can easily be managed even if it spans many independent administrative organizations
Observation
Most systems account only for size scalability
Often solved by : multiple powerful servers operating independently in parallel
30
Distributed
Size Scalability problems Systems
When services are implemented/running by means of a single
/few/ tightly coupled servers in the distributed system.
the server, or group of servers, can simply become a bottleneck
when it needs to process an increasing number of requests.
Root causes for size scalability problems with centralized solutions
The computational capacity, limited by the CPUs
The storage capacity, including the transfer rate between CPUs and disks
The network between the user and the centralized service
31
Distributed
Problems with geographical scalability Systems
It is difficult to scale existing distributed systems that were designed for
local-area networks into WAN
Many distributed systems assume synchronous client-server interactions:
client sends request and waits for an answer.
Latency may easily prohibit this scheme.
WAN links are often inherently unreliable:
The effect is that solutions developed for local-area networks cannot always work
on wide-area system
Example, simply moving streaming video from LAN to WAN is bound to fail.
Lack of multipoint communication(like broadcast),
Unlike LAN, a simple search broadcast cannot be deployed.
Solution is to develop separate naming and directory services to find hosts and resources
33
Distributed
Problems with administrative scalability Systems
How to scale distributed system across multiple, independent administrative dom
ains?
Issues: Administrative entities have conflicting policies concerning
usage (and thus payment), management, and security
Examples
Grid computing: share expensive resources between different domains.
Exception: several peer-to-peer networks
File-sharing systems (based, e.g., on BitTorrent)
Peer-to-peer telephony (Skype, Zoom)
Note: in such systems end users collaborate and not administrative entities.
34
Distributed
Scaling techniques Systems
So, how these scalability problems can generally be solved ?
There are basically three techniques for scaling:
1) Hiding communication latencies,
2) Partitioning and Distribution
3) Replication
35
Distributed
Scaling techniques Systems
1) Hiding communication latencies,
It is applicable in the case of geographical scalability.
The basic idea: try to avoid waiting for responses to remote-service requests as much as
possible.
The possible solutions:
Make use of asynchronous communication
Make request and do other useful work up until the response turned in
Problem: not every application fits this model.
Reduce the overall communication between client and server
Move computations to client
See next slide
36
Scaling Techniques (1)- Hide latency
Figure 1-4. The difference between letting (a) a server or (b) a client
check forms as they are being filled.
37
Distributed
Scaling Systems
2) Partitioning and Distribution
Involves taking a component, splitting it into smaller parts, and
subsequently spreading those parts across the system.
Examples:
WWW: Web documents are physically distributed across
millions of machines
Domain Name System (DNS): name resolution is carried out
by multiple machines
See next slide
38
Scaling Techniques (2)- Distribution
Figure 1-5. An example of dividing the DNS
name space into zones.
39
Distributed
Scaling techniques Systems
3) Redundancy/Replication
Is to make copies of data available at different machines
Replication has the following advantages
Hide much of the communication latency problems mentioned
Improves availability
Helps to balance the load between components leading to better performance
Caching is a special form of replication
Caching is a decision made by the client of a resource and not by the owner of a resource
Drawback
Caching and replication leads to consistency problems
40
Pitfalls when Developing Distributed Systems
• Distributed systems differ from traditional software because components are dispersed
across a network.
• Not taking this dispersion into account during design causes problem
• False assumptions made by first time DS developer:
• The network is reliable. • Latency is zero.
• The network is secure. • Bandwidth is infinite.
• The network is homogeneous. • Transport cost is zero.
• The topology does not change. • There is one administrator.
Note:-
• When one of these fail, it is difficult to mask unwanted behavior
• Most of these issues will not most likely show up in non-distributed applications development
41
Distributed
Types of distributed systems Systems
Three types of distributed systems
High performance distributed computing systems
Distributed information systems
Distributed systems for pervasive computing
42
Distributed
Distributed Computing Systems Systems
Used for high performance computing tasks
Examples of distributed computing systems
Cluster computing systems
Grid computing systems
Cloud computing systems
43
Distributed
Cluster Computing Systems Systems
In virtually all cases, cluster computing is used for parallel
programming in which a single (compute intensive) program is
run in parallel on multiple machines.
A typical cluster system consists of a collection of compute nodes that
are controlled and accessed by means of a single master node.
Nodes are connected through a LAN.
Nodes are essentially homogeneous (Same OS, near-identical hardware )
The master node typically handles
The allocation of nodes to a particular parallel program,
Maintains a batch queue of submitted jobs, and
Provides an interface for the users of the system.
44
Distributed
Example cluster configuration Systems
General configuration of Linux-based cluster called Beowulf
Figure. An example of a cluster computing system.
Note:- there is also another configuration (symmetric approach) of Cluster system
where there is no master node. A good example is MOSIX
45
Distributed
Grid Computing Systems Systems
In contrast to cluster computing, grid computing
Have a high degree of heterogeneity
Dispersed across several organizations
Can easily span a wide-area network
The goal is to bring users and resources from different org
anizations together to allow collaboration among them
Establish Virtual Organization (VO)
Members belong to the same VO has access to the resources
that are provided to that VO
46
Distributed
Grid Computing Systems (2) Systems
Focus of the software design for grid computing
Providing access to resources from different administrative
domain to only those users that belong to a specific VO
Typically, resources are
Compute servers (supercomputers, cluster computers),
Storage facilities and Databases.
Networked telescopes, sensors, etc
47
Distributed
Cloud computing Systems
Provides computing services and resources (hardware and s
oftware) over a network/internet
Cloud computing is based upon the concept of utility computing
Customers shall pay only based on a pay-per-use model
It is characterized by an easily usable and accessible pool of virtualized
resources
It is scalable as users can get more resources if more work needs to
be done
48
Distributed
Cloud computing Systems
In practice, clouds are organized into four layers
Hardware – contains resources customers never get to see directly.
Infrastructure - Employs virtualization techniques to provide customers virtu
al storage and computing resources
Platform – provides API for developing Apps, Storage
Application – Actual applications like suite of apps shipped with OSes
49
Distributed
Cloud computing Systems
Cloud-computing providers offer these layers to their customers through various
interfaces
Command-line Tools, Programming interface(API) and web interface
Cloud computing providers offer their services according to three fundamental
models
Infrastructure as a service (IaaS): Covering hardware and infrastructure layer
Basic infrastructure like storage
Platform as a service (PaaS): Covering the platform layer
Database, web servers
Software as a service (SaaS): Covering application layer
Software that the clients need like text processor
50
Distributed
Issues of cloud computing Systems
Cloud computing is becoming so popular and common
It allows organizations to outsource their IT infrastructure:
hardware and software
Certainly a serious alternative to maintaining huge local infrastructures
However, it has certain issues to resolve
Provider lock-in,
Security and privacy issues, and
Dependency on the availability of services
51
Distributed
2. Distributed Information Systems Systems
Enterprises might already have multiple networked information systems
Example: a University might have Registrar system, HRM system etc….
But, integrating applications into enterprise wide information system was painful
Middleware is the solution
Applications can be integrated at different levels
Database level
That results in Distributed Transaction Processing
Application level
Enterprise Application Integration (EAI)
52
Distributed
Distributed transaction processing Systems
Operations on a database are usually carried out in the form of
transactions.
A nested transaction is constructed from a number of sub-transactions
The sub transactions could run in parallel on different machines to gain perfor
mance
- Application integration at database
level is co-ordinated by transaction
processing monitoring (TP Monitor)
53
Distributed
Transaction Processing Monitor (TP monitor) Systems
- TP monitor is a middleware
- Its main task is to allow application to access multiple servers/databases
- Clients combine requests for (different) servers; send that off; collect responses, and
present a coherent result to the user.
54
Distributed
Enterprise Application Integration Systems
For applications decoupled from the databases they were built upon, facilities were
needed to integrate applications independent from their databases.
The Idea here is application’s component can directly communicate with one another.
Inter application communication leads to different communication models
Remote procedure call or Remote Method Invocation
Requests are sent through local procedure call, packaged as message, processed
and result returned as message from call.
The disadvantage is that both caller and callee must be up and running at the tim
e of communication
Message Oriented Middleware (MOM) (Publish/Subscribe model)
Messages are sent to logical contact point (published), and forwarded to subscrib
ed applications
55
Distributed
Enterprise Application Integration Systems
Middleware as a communication facilitator in enterprise application integration.
56
Distributed
3. Distributed Pervasive System Systems
It is emerging next-generation of distributed systems
Goal:
Spread a real-life environment with a large variety of smart devices
Devices in a distributed pervasive system are often:
Small, battery-powered, mobile, and only wireless communication
They general lack of human administrative control
Can be configured by their owners or
They need to automatically discover their environment and "nestle in" as best
as possible.
57
Distributed
Example - Electronic Health Care Systems Systems
Are meant to monitor the well-being of individuals and to automatic
ally contact physicians when needed.
Goal is to prevent people from being hospitalized
Such systems should be equipped with various sensors organized in
a (wireless) body-area network(BAN).
A network should at worst only minimally hinder a person
The network should be able to operate while a person is moving
Two obvious organizations of BAN are:
A local hub
A continuous wireless connection
58
Distributed
Electronic Health Care Systems (2) Systems
Figure 1-12. Monitoring a person in a pervasive electronic health care system, using:
(a) a local hub or
(b) a continuous wireless connection.
59
Distributed
Types of pervasive systems Systems
Ubiquitous computing systems: there is a continuous interaction between s
ystem and user.
Distraction free interaction
Devices are highly context aware
Mobile computing systems: pervasive, but emphasis is on the fact that devices
are inherently mobile.
Discovery of local services
Sensor (and actuator) networks: pervasive, with emphasis on the actual (collab
orative) sensing and actuation of the environment.
Example: automatic activation of sprinklers when a fire
has been detected.
60
End of Chapter1
61