SIC 2416 Cloud Computing and Distributed Systems
Distributed Systems:
An Overview
Prof. Cheruiyot w.k, PhD
TTU
Distributed system, distributed computing
• Early computing was performed on a single
processor. Uni-processor computing can be
called centralized computing.
• A distributed system is a collection of
independent computers, interconnected via a
network, capable of collaborating on a task.
• Distributed computing is computing
performed in a distributed system.
01/31/2025 2
Distributed Systems
[Figure: workstations on a local network; a network host; the Internet]
Examples of Distributed systems
• Network of workstations (NOW): a group of
networked personal workstations connected
to one or more server machines.
• The Internet
• An intranet: a network of computers and
workstations within an organization,
segregated from the Internet via a protective
device (a firewall).
Example of a large-scale distributed
system – eBay (Source: Los Angeles Times.)
An example small-scale distributed system
(Source: Los Angeles Times.)
Computers in a Distributed System
• Workstations: computers used by end-users to
perform computing
• Server machines: computers which provide
resources and services
• Personal Assistance Devices: handheld
computers connected to the system via a
wireless communication link.
Centralized vs. Distributed Computing
[Figure: centralized computing (terminals attached to a mainframe computer) vs. distributed computing (workstations and network hosts joined by network links)]
Monolithic mainframe applications vs. distributed
applications
based on http://www.inprise.com/visibroker/papers/distributed/wp.html
• The monolithic mainframe application architecture:
– Separate, single-function applications, such as order-entry or billing
– Applications cannot share data or other resources
– Developers must create multiple instances of the same functionality
(service).
– Proprietary (user) interfaces
• The distributed application architecture:
– Integrated applications
– Applications can share resources
– A single instance of functionality (service) can be reused.
– Common user interfaces
Distributed Systems: Intro
• Distributed System:
– Autonomous computers + network
– Communication via message-passing
– No shared memory
– No global clock
– Range:
• Two PCs connected by $25 worth of networking
hardware
• Beowulf clusters: racks (or stacks) of PCs connected by
high-speed networking
• Millions of computers, connected by diverse networking
technologies ranging from modems to gigabit
connections (the Internet)
Introduction to Distributed Systems
• Why do we develop distributed systems?
– Availability of powerful yet cheap microprocessors (PCs, workstations)
– Continuing advances in communication technology
2. Examples of Distributed Systems
• Local Area Network and Intranet
• Database Management System
• Automatic Teller Machine Network
• Internet/World-Wide Web
• Mobile and Ubiquitous Computing
Advantages of Distributed Systems
over Centralized Systems
• Economics: a collection of microprocessors offers better price/performance than
mainframes, making it a cost-effective way to increase computing power.
• Speed: a distributed system may have more total computing power than a mainframe. Ex.
10,000 CPU chips, each running at 50 MIPS, give 500,000 MIPS in total. A single 500,000 MIPS
processor cannot be built, since it would require a 0.002 nsec instruction cycle. Load
distribution further enhances performance.
• Inherent distribution: some applications are inherently distributed. Ex. a supermarket chain.
• Reliability: if one machine crashes, the system as a whole can still survive, giving higher
availability and improved reliability.
• Incremental growth: computing power can be added in small increments (modular
expandability).
• Another driving force: the existence of a large number of personal computers and the need for
people to collaborate and share information.
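The numbers in the Speed bullet can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the variable names are illustrative, and it assumes one instruction completes per cycle.

```python
# Worked arithmetic for the "Speed" bullet: 10,000 CPUs at 50 MIPS each.
cpus = 10_000
mips_per_cpu = 50

total_mips = cpus * mips_per_cpu                   # aggregate rate, in MIPS
instructions_per_second = total_mips * 1_000_000   # 1 MIPS = 10**6 instructions/s

# Assumption: a single processor at that rate completes one instruction per cycle.
cycle_time_ns = 1e9 / instructions_per_second

print(total_mips)      # 500000
print(cycle_time_ns)   # 0.002
```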
Advantages of Distributed Systems
over Independent PCs
– Data sharing: allows many users to access a common
database
– Resource Sharing: expensive peripherals like color printers
– Communication: enhance human-to-human
communication, e.g., email, chat
– Flexibility: spread the workload over the available
machines
Disadvantages of Distributed Systems
– Software: difficult to develop software for distributed
systems
– Network: saturation, lossy transmissions
– Security: easy access also applies to secret data
Basic Design Issues
• General software engineering principles include
rigor and formality, separation of concerns,
modularity, abstraction, anticipation of change, …
• Specific issues for distributed systems:
– Naming
– Communication
– Software structure
– System architecture
– Workload allocation
– Consistency maintenance
Naming
• A name is resolved when translated into an
interpretable form for resource/object reference.
– Communication identifier (IP address + port number)
– Name resolution involves several translation steps
• Design considerations
– Choice of name space for each resource type
– Name service to resolve resource names to comm. id.
• Name services include naming context
resolution, hierarchical structure, and resource
protection
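As a rough illustration of resolving a name through several translation steps, the sketch below walks a hierarchical name space one naming context at a time until it reaches a communication identifier (IP address + port number). The name space, the names in it, and the `resolve` function are all hypothetical; a real name service would also handle protection and distribution.

```python
# Hypothetical name space: each naming context maps a name component either
# to a nested context (dict) or to a communication identifier (IP, port).
ROOT = {
    "org": {
        "example": {
            "printer": ("192.0.2.10", 9100),
            "files": ("192.0.2.20", 2049),
        }
    }
}

def resolve(name, root=ROOT):
    """Resolve a path-like name such as 'org/example/printer'."""
    node = root
    for component in name.split("/"):
        if not isinstance(node, dict) or component not in node:
            raise KeyError(f"cannot resolve {name!r} at component {component!r}")
        node = node[component]        # one translation step per naming context
    if isinstance(node, dict):
        raise KeyError(f"{name!r} names a context, not a resource")
    return node

print(resolve("org/example/printer"))   # ('192.0.2.10', 9100)
```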
Communication
• Separated components communicate through
sending processes and receiving processes, both
to transfer data and to synchronize.
• Message passing: send and receive primitives
– synchronous or blocking
– asynchronous or non-blocking
– Abstractions defined: channels, sockets, ports.
• Communication patterns: client-server
communication (e.g., RPC, function shipping)
and group multicast
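The difference between blocking and non-blocking primitives can be sketched with a thread-safe queue standing in for a channel. This is an illustration only: the slides do not prescribe an API, and a real system would use sockets or ports between separate machines.

```python
import queue
import threading

channel = queue.Queue()      # stands in for a channel between two processes

def sender():
    channel.put("hello")     # asynchronous send: returns without waiting for the receiver

# Non-blocking receive: returns immediately, here because no message has been sent yet.
try:
    channel.get_nowait()
    early = "got a message"
except queue.Empty:
    early = "nothing yet"

t = threading.Thread(target=sender)
t.start()

# Blocking receive: waits until a message arrives (the timeout is only a safety net).
message = channel.get(timeout=5)
t.join()

print(early)     # nothing yet
print(message)   # hello
```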
Synchronous Transmission
• The term synchronous is used to describe a continuous and consistent
timed transfer of data blocks.
• Synchronous data transmission is a data transfer method in which a
continuous stream of data signals is accompanied by timing signals
(generated by an electronic clock) to ensure that the transmitter and the
receiver are in step (synchronized) with one another. The data is sent in
blocks (called frames or packets) spaced by fixed time intervals.
• Synchronous transmission modes are used when large amounts of data
must be transferred very quickly from one location to the other. The speed
of the synchronous connection is attained by transferring data in large
blocks instead of individual characters.
Asynchronous Transmission
• In contrast, asynchronous transmission works
in spurts and must insert a start bit before
each data character and a stop bit at its
termination to inform the receiver where it
begins and ends.
• The term asynchronous is used to describe the
process where transmitted data is encoded
with start and stop bits, specifying the
beginning and end of each character.
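The start/stop framing can be sketched as follows. The frame layout is an assumption (one start bit of 0, eight data bits sent least-significant bit first, one stop bit of 1, no parity); the slides do not fix these details, and the function names are illustrative.

```python
def frame_character(byte):
    """Bits sent for one character: start bit, 8 data bits (LSB first), stop bit."""
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]        # assumed: start bit = 0, stop bit = 1

def unframe(bits):
    """Recover the character from a 10-bit frame, checking start and stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[-1] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))

frame = frame_character(ord("A"))
print(frame)                 # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
print(chr(unframe(frame)))   # A
print(8 / len(frame))        # 0.8
```

Under this layout only 8 of every 10 transmitted bits carry data, which is one reason block-oriented synchronous transmission is preferred for bulk transfers.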
Software Structure
• Layers in centralized computer systems:
Applications
Middleware
Operating system
Computer and Network Hardware
Software Structure
• Layers and dependencies in distributed systems:
Applications
Open services
Distributed programming support
Open system kernel services
Computer and network hardware
SYSTEM ARCHITECTURES
• Client-Server
• Peer-to-Peer
• Services provided by multiple servers
• Proxy servers and caches
• Mobile code and mobile agents
• Network computers
• Thin clients and mobile devices
Clients Invoke Individual Servers
[Figure: clients send invocations to servers, which return results. Key: process, computer.]
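The invocation/result exchange can be sketched with a TCP socket on the local machine. Everything here is illustrative: the "service" (upper-casing the request), the loopback address, and the single-request server are assumptions chosen to keep the example short.

```python
import socket
import threading

def serve_once(listener):
    """Accept one client, read its request, and send back a result."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024).decode()
        conn.sendall(request.upper().encode())   # hypothetical service: upper-case the text

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

server = threading.Thread(target=serve_once, args=(listener,))
server.start()

with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"invoke")              # invocation
    result = client.recv(1024).decode()    # result

server.join()
listener.close()
print(result)
```

The client blocks in `recv` until the server replies, so this is a synchronous invocation in the sense used earlier for message passing.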
Peer-to-peer Systems
[Figure: peers 1 to N, each running the application, jointly holding sharable objects]
4.4.3 A Service by Multiple Servers
[Figure: clients accessing a service that is provided by several servers]
Web Proxy Server
[Figure: clients reach web servers through a proxy server]
A proxy server is an intermediary server that retrieves data from an
Internet source, such as a webpage, on behalf of a user. Proxy servers act as
additional data security boundaries, protecting users from malicious
activity on the Internet.
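Since the architecture list pairs proxy servers with caches, here is a minimal sketch of a caching proxy. The class, the `fake_web_server` function, and the URL are hypothetical stand-ins; a real proxy would speak HTTP and expire stale entries.

```python
class CachingProxy:
    """Fetches resources on behalf of clients and keeps copies of the responses."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> response body
        self._cache = {}
        self.origin_requests = 0

    def get(self, url):
        if url not in self._cache:        # cache miss: go to the web server
            self._cache[url] = self._fetch(url)
            self.origin_requests += 1
        return self._cache[url]           # cache hit: served by the proxy itself

def fake_web_server(url):
    # Hypothetical stand-in for a real web server.
    return f"<html>page at {url}</html>"

proxy = CachingProxy(fake_web_server)
first = proxy.get("http://example.org/a")    # client 1: miss, forwarded to the server
second = proxy.get("http://example.org/a")   # client 2: hit, answered from the cache
print(proxy.origin_requests)                 # 1
```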
Web Applets
a) client request results in the downloading of applet code
[Figure: the web server sends the applet code to the client]
b) client interacts with the applet
[Figure: the client, now running the applet, and the web server]
An applet is a very small application, especially a utility program performing
one or a few simple functions.
Thin Clients and Computer Servers
[Figure: a thin client on a network computer or PC, connected over the network to an application process on the computer server]
A thin client is a computer that runs from resources stored on a central
server instead of a local hard drive. Thin clients work by connecting
remotely to a server-based computing environment where most
applications, sensitive data, and memory are stored.
Network Operating Systems
· loosely-coupled software on loosely-coupled
hardware
· A network of workstations connected by LAN
· each machine has a high degree of autonomy
· File servers: client and server model
· Clients mount directories on file servers
· Best known network OS:
o Sun’s NFS (Network File System) for shared file
systems
· a few system-wide requirements: format and
meaning of all the messages exchanged
(True) Distributed Systems
tightly-coupled software on loosely-coupled hardware
provide a single-system image or a virtual
uniprocessor
a single, global interprocess communication
mechanism, process management, file system; the
same system call interface everywhere
Ideal definition:
“A distributed system runs on a collection of computers that
do not have shared memory, yet looks like a single
computer to its users.”
Multiprocessor Operating Systems
· Tightly-coupled software on tightly-coupled
hardware
· Examples: high-performance servers
· shared memory
· single run queue
· traditional file system as on a single-processor
system: central block cache
Common Characteristics
• What are we trying to achieve when we construct a distributed
system?
• Certain common characteristics can be used to assess
distributed systems
– Heterogeneity
– Openness
– Security
– Scalability
– Failure Handling
– Concurrency
– Transparency
Heterogeneity
Variety and differences in
Networks
Computer hardware
Operating systems
Programming languages
Implementations by different developers
Middleware: software layers that provide a programming abstraction
and mask the heterogeneity of the underlying networks,
hardware, OS, and programming languages (e.g., CORBA).
Mobile code: code that can be sent from one computer to
another and run at the destination (e.g., Java applets running on the Java virtual
machine).
3.2 Openness
• Openness is concerned with extensions
and improvements of distributed systems.
• Detailed interfaces of components need
to be published.
• New components have to be integrated
with existing components.
• Differences in data representation of
interface types on different processors (of
different vendors) have to be resolved.
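One concrete difference in data representation is byte order. The sketch below uses Python's standard `struct` module to show the same 32-bit integer encoded big-endian and little-endian, and what happens when the receiver assumes the wrong order. The value 1025 is arbitrary.

```python
import struct

value = 1025                          # 0x00000401

big = struct.pack(">I", value)        # big-endian (network byte order)
little = struct.pack("<I", value)     # little-endian

print(big.hex())       # 00000401
print(little.hex())    # 01040000

# Reading big-endian bytes as if they were little-endian gives the wrong number.
misread = struct.unpack("<I", big)[0]
correct = struct.unpack(">I", big)[0]
print(misread)         # 17039360
print(correct)         # 1025
```

Agreeing on one external representation in the published interface is one way to resolve such differences.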
3.3 Security
In a distributed system, clients send
requests to access data and other network
resources managed by servers:
Doctors requesting records from hospitals
Users purchasing products through electronic commerce
Security is required for:
Concealing the contents of messages: security and privacy
Identifying a remote user or other agent correctly
(authentication)
New challenges:
Denial of service attack
Security of mobile code
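One common building block for checking that a request really comes from the expected sender is a message authentication code. The sketch below uses Python's standard `hmac` and `hashlib` modules; the shared key and the request strings are hypothetical, and the slides do not prescribe this particular mechanism. Note that it authenticates messages but does not conceal their contents, which would require encryption.

```python
import hashlib
import hmac

# Assumption: this key is known only to the client and the server.
shared_key = b"hypothetical-shared-secret"

def sign(message):
    """Compute an authentication tag for the message."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()

def verify(message, tag):
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(message), tag)

request = b"GET /records/42"
tag = sign(request)

print(verify(request, tag))              # True
print(verify(b"GET /records/43", tag))   # False
```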
Scalability
• Adaptation of distributed systems to
– accommodate more users
– respond faster (this is the hard one)
• Usually done by adding more and/or
faster processors.
• Components should not need to be
changed when scale of a system
increases.
• Design components to be scalable!
3.5 Failure Handling (Fault Tolerance)
• Hardware, software and networks fail!
• Distributed systems must maintain
availability even at low levels of
hardware/software/network reliability.
• Fault tolerance is achieved by
– recovery
– redundancy
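Redundancy and recovery can be combined in a simple failover loop: if one replica fails, the client recovers by trying the next. The replicas here are plain functions and `ConnectionError` stands in for any detectable failure; both are illustrative assumptions.

```python
def call_with_failover(replicas, request):
    """Try each replica in turn; return the first successful reply."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:    # failed replica: recover by trying the next
            errors.append(exc)
    raise ConnectionError(f"all {len(replicas)} replicas failed: {errors}")

def crashed(request):
    # Hypothetical replica that is down.
    raise ConnectionError("replica down")

def healthy(request):
    # Hypothetical replica that answers normally.
    return f"ok: {request}"

print(call_with_failover([crashed, healthy], "read x"))   # ok: read x
```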
Concurrency
• Components in distributed systems are
executed in concurrent processes.
• Components access and update shared
resources (e.g. variables, databases,
device drivers).
• Integrity of the system may be violated if
concurrent updates are not coordinated.
– Lost updates
– Inconsistent analysis
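A lost update can be reproduced by writing out the interleaving by hand, and avoided by coordinating the updates with a lock. The `Account` class and the amounts are hypothetical; in a distributed system the coordination would be done by the server or a transaction mechanism rather than a local lock.

```python
import threading

# Lost update: both clients read the balance before either writes it back.
shared = {"balance": 100}
read_a = shared["balance"]        # client A reads 100
read_b = shared["balance"]        # client B reads 100
shared["balance"] = read_a + 10   # A writes 110
shared["balance"] = read_b + 20   # B writes 120, overwriting A's update
lost = shared["balance"]
print(lost)                       # 120, although 130 was intended

class Account:
    """Coordinated updates: a lock makes each read-modify-write step atomic."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        with self._lock:
            self.balance += amount

account = Account(100)
threads = [threading.Thread(target=account.deposit, args=(n,)) for n in (10, 20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(account.balance)            # 130
```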
Transparency
• Distributed systems should be perceived
by users and application programmers as
a whole rather than as a collection of
cooperating components.
• Transparency has different aspects.
• These represent various properties that
distributed systems should have.
Design Issues of Distributed Systems
• Flexibility
• Reliability
• Performance
Flexibility
• Make it easier to change
• Monolithic kernel: system calls are trapped and executed
by the kernel. All system calls are served by the kernel,
e.g., UNIX.
• Microkernel: provides only minimal services:
1) some memory management
2) some low-level process management and scheduling
3) low-level I/O
E.g., Mach can support multiple file systems and multiple
system interfaces.
Reliability
• Distributed system should be more reliable
than single system.
– Availability: fraction of time the system is usable.
Redundancy improves it.
– Need to maintain consistency
– Need to be secure
– Fault tolerance: need to mask failures, recover
from errors.
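The claim that redundancy improves availability can be made concrete. The sketch assumes replicas fail independently and that the system is usable whenever at least one replica is up; the 0.95 figure is purely illustrative.

```python
def availability_with_replicas(single, replicas):
    """Probability that at least one of `replicas` independent copies is up."""
    return 1 - (1 - single) ** replicas

# Illustrative numbers only: each machine is assumed to be usable 95% of the time.
for n in (1, 2, 3):
    print(n, round(availability_with_replicas(0.95, n), 6))
```

Under these assumptions two replicas already raise availability from 95% to 99.75%. Correlated failures, such as a shared network or power supply, would reduce the gain.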
Performance
• Without a performance gain, why bother with
distributed systems?
• Performance loss due to communication
delays:
– fine-grain parallelism: high degree of interaction
– coarse-grain parallelism
• Performance loss due to making the system
fault tolerant.
Summary
• Definitions of distributed systems and
comparisons to centralized systems.
• The characteristics of distributed systems.
• The eight forms of transparency.
• The basic design issues.