DISTRIBUTED
COMPUTING
Sunita Mahajan,
Mahajan Principal, Institute of
Computer Science, MET League of
Colleges, Mumbai
Seema Shah,
Shah Principal, Vidyalankar
Institute of Technology, Mumbai
University
© Oxford University Press 2011
Chapter - 6
Distributed System Management
© Oxford University Press 2011
Topics
• Introduction
• Resource management
• Task assignment approach
• Load balancing approach
• Load sharing approach
• Process management in a distributed
environment
• Process migration
• Threads
• Fault tolerance
© Oxford University Press 2011
Introduction
© Oxford University Press 2011
Categories of Distributed System
management
• Resource management
• Process management
• Fault tolerance
© Oxford University Press 2011
Resource Management
© Oxford University Press 2011
Process scheduling techniques
• Task assignment approach
• Load balancing approach
• Load sharing approach
© Oxford University Press 2011
Example: Google system
• Load balancing by using least loaded server
• Proximity routing
• Fault masking
© Oxford University Press 2011
Desirable features of a good global
scheduling algorithm
• No apriori knowledge about processes to be
executed
• Ability to make dynamic scheduling decisions
• Flexible
• Stable
• Scalable
• Unaffected by system failures
© Oxford University Press 2011
Task Assignment Approach
© Oxford University Press 2011
Task assignment
• Minimize IPC costs
• Less turnaround time for process completion
• High degree of parallelism
• Efficient usage of all system resources
© Oxford University Press 2011
Graph theoretic deterministic
algorithm
A system with m CPUs and n processes has any of the following
three cases:
• m=n: Each process is allocated to one CPU
• m<n: Some CPUs may remain idle or work on earlier allocated
processes
• m>n: There is a need to schedule processes on CPUs, and
several processes may be assigned to each CPU.
© Oxford University Press 2011
Example of graph theoretic
deterministic algorithm-1
• Weighted graph
– Each node is a process
– Each arc is message flowing between two
processes
© Oxford University Press 2011
Example of graph theoretic
deterministic algorithm-2
© Oxford University Press 2011
Centralized heuristic algorithm
• Also called Top down algorithm
• Allocated processing capacity fairly
2
© Oxford University Press 2011
Hierarchical algorithm
• Works between two levels in a group
• Top of the tree is truncated into a committee
which manages fault tolerance
© Oxford University Press 2011
Load Balancing Approach
© Oxford University Press 2011
Load balancing Taxonomy
• Improve resource utilization
© Oxford University Press 2011
Issues in designing in load balancing
algorithms
• Deciding policies for:
– Load estimation
– Process transfer
– Static information exchange
– Location
– Priority assignment
– Migration limitation
© Oxford University Press 2011
Policies for Load estimation
• Parameters:
– Time dependent
– Node dependent
© Oxford University Press 2011
Policies for Process transfer
• Threshold policy
– Static
– Dynamic
© Oxford University Press 2011
Location policies
• Used to select destination node
© Oxford University Press 2011
State information exchange
• Dynamic policy
• Decision based on state information
© Oxford University Press 2011
Priority assignment
• To schedule local and remote processes at a
node
© Oxford University Press 2011
Migration limiting policies
• Uncontrolled policy
• Controlled policy
© Oxford University Press 2011
Load Sharing Approach
© Oxford University Press 2011
Issues in designing load sharing
algorithms
• Load estimation policies
• Process transfer policies
• Location policies
• State information exchange policies
© Oxford University Press 2011
Location policies-1
• Decides whether sender or receiver node
process is to be migrated
© Oxford University Press 2011
Location policies-2
• Sender initiated algorithms make scheduling decisions at
process arrival epoch
• Receiver initiated algorithms make scheduling decisions at
process departure epochs
© Oxford University Press 2011
State information exchange policies
• Broadcast
• Poll
© Oxford University Press 2011
Process Management In A
Distributed Environment
© Oxford University Press 2011
Functions of distributed process
management
• Process migration
– change of location and execution of a process
from current processor to the destination
processor
© Oxford University Press 2011
Desirable features of a good process
migration mechanism
• Transparency
• Minimal interference
• Minimal residual dependencies
• Efficiency
• Robustness
• Ability to communicate between co processes
of the job
© Oxford University Press 2011
Process Migration
© Oxford University Press 2011
Steps involved in process migration
• Freezing process on the source node
• Starting process on the destination node
• Transporting process address space on
destination node
• Forward the messages addressed to migrated
processes
© Oxford University Press 2011
Mechanism
© Oxford University Press 2011
Freezing process on source node
• Blocking sequence:
– Blocking the process immediately
– Wait for I/O operations to complete and then block
the process.
• Track information about open files
• Create an empty process on the destination node
• Transfer the migrant process and address space
• Restart process on destination node
© Oxford University Press 2011
Address space transport mechanisms-1
• Process address space:
– Process state: PCB information
– Process address space: Program code, data and
stack
© Oxford University Press 2011
Address space transport mechanisms-2
• Total freezing:
– Process execution stopped during address space
transfer
© Oxford University Press 2011
Address space transport mechanisms-3
• Pre transfer:
– Address space is transferred while process continues to
run on source node
– Highest priority in scheduling
© Oxford University Press 2011
Address space transport mechanisms-4
• Transfer-on –reference:
– Process state is transferred while address space is
transferred on demand
© Oxford University Press 2011
Message forwarding
• Track and forward messages which have arrived on
source node after process migration
© Oxford University Press 2011
Handle communication between
cooperating processes
• Avoid separation of coprocesses
• Home node concept
– Deployed in Sprite system
© Oxford University Press 2011
Process migration in heterogeneous
systems
• Handling floating point
numbers
• Different sized exponents
in XDR format
• Handling overflow and
underflow
• Handling Mantissa
• Handling signed infinity
and zero representations
© Oxford University Press 2011
Advantages of process migration
• Reduce average response time of heavily
loaded nodes
• Speed up of individual jobs
• Better utilization of resources
• Improve reliability of critical processes
• Improving system security
© Oxford University Press 2011
Threads
© Oxford University Press 2011
Process v/s threads
• Analogy:
– Thread is to a process as process is to a machine
© Oxford University Press 2011
Comparison
© Oxford University Press 2011
Thread models
• Dispatcher worker model
• Team model
• Pipeline model
© Oxford University Press 2011
Thread: Dispatcher worker model
© Oxford University Press 2011
Thread: Team model
© Oxford University Press 2011
Thread: Pipeline model
© Oxford University Press 2011
Design issues in threads
• Thread semantics
– Thread creation, termination
• Thread synchronization
• Thread scheduling
© Oxford University Press 2011
Thread synchronization
• Execution in Critical region
– Use binary semaphore
© Oxford University Press 2011
Threads scheduling
• Priority assignment facility
• Choice of dynamic variation of quantum size
• Handoff scheduling scheme
• Affinity scheduling scheme
• Signals used for providing interrupts and
exceptions
© Oxford University Press 2011
Implementing thread package
• User level approach Kernel level approach
© Oxford University Press 2011
Comparison of thread implementation-1
© Oxford University Press 2011
Comparison of thread implementation-2
© Oxford University Press 2011
Threads and Remote execution
• RPC
• RMI and Java threads
© Oxford University Press 2011
RPC execution
© Oxford University Press 2011
Threads are created on the fly
© Oxford University Press 2011
Fault Tolerance
© Oxford University Press 2011
Component faults
• Transient faults
• Intermittent faults
• Permanent faults
∞
• Mean time to failure = ∑ kp (1-p) k-1
k=1
• Mean time to failure = 1/p
© Oxford University Press 2011
System failures
• Fail silent faults / fail stop faults
• Byzantine faults
© Oxford University Press 2011
Use of redundancy
• Information redundancy
• Time redundancy
• Physical redundancy
– Active replication
– Primary backup methods
© Oxford University Press 2011
Active replication-1
• State machine approach
(TMR -Triple Modular Redundancy)
© Oxford University Press 2011
Active replication-2
© Oxford University Press 2011
Primary backup
• Uses two machines :
– Primary and backup
• Uses limited number of messages such that
these messages go only to the primary server
and no ordering is required
© Oxford University Press 2011
Summary
• Introduction
• Resource management
• Task assignment approach
• Load balancing approach
• Load sharing approach
• Process management in a distributed
environment
• Process migration
• Threads
• Fault tolerance
© Oxford University Press 2011