Parallel Algorithm Analysis

A parallel algorithm executes multiple instructions simultaneously on different processors, improving computational speed for tasks involving large data sets. Key factors in analyzing parallel algorithms include time complexity, speedup, number of processors, and total cost, with Amdahl's Law highlighting limitations based on the non-parallelizable portion of a program. Scalability measures a parallel algorithm's ability to effectively utilize additional processors, while efficiency assesses how well processors are utilized during computation.

Uploaded by

hibanahm12

PARALLEL ALGORITHM ANALYSIS
PARALLEL ALGORITHM
An algorithm is a sequence of steps that takes inputs from the user and, after some computation, produces an
output. A parallel algorithm is an algorithm that can execute several instructions simultaneously on different
processing devices and then combine the individual outputs to produce the final result.

Concurrent processing is essential when a task involves processing large volumes of complex data. Examples
include accessing large databases, aircraft testing, astronomical calculations, atomic and nuclear physics,
biomedical analysis, economic planning, image processing, robotics, weather forecasting, and web-based
services.
Parallelism is the simultaneous processing of several sets of instructions. It reduces the total
computational time. Parallelism can be implemented by using parallel computers, i.e. computers with many
processors. Parallel computers require parallel algorithms, programming languages, compilers, and operating
systems that support multitasking.
🡪 Sequential Algorithm − An algorithm in which consecutive steps of instructions are executed in
chronological order to solve a problem.
🡪 Parallel Algorithm − An algorithm in which the problem is divided into sub-problems that are executed in
parallel to get individual outputs. These individual outputs are later combined to get the final desired
output.
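The divide, execute, and combine pattern described above can be sketched in Python. This is a minimal illustration, not a tuned implementation: the helper names (split_into_chunks, parallel_sum) and the choice of four workers are assumptions made for the example. It uses threads for simplicity; in standard CPython, threads do not run CPU-bound work truly in parallel, so a process pool would be the usual choice for real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, num_chunks):
    """Divide the input into at most num_chunks contiguous sub-problems."""
    chunk_size = max(1, -(-len(data) // num_chunks))  # ceiling division
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def parallel_sum(data, num_workers=4):
    """Sum the data by solving sub-problems concurrently, then combining."""
    chunks = split_into_chunks(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Each worker produces an individual output (a partial sum).
        partial_sums = list(pool.map(sum, chunks))
    # Combine the individual outputs into the final result.
    return sum(partial_sums)


numbers = list(range(1, 101))
total = parallel_sum(numbers)
print(total)
```

The combine step here is itself sequential, which is one reason a parallel algorithm rarely reaches ideal speedup.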
ANALYSIS OF PARALLEL ALGORITHM
Parallel algorithms are designed to improve the computation speed of a computer. To analyze a
parallel algorithm, we normally consider the following parameters:
🡪 Time complexity (execution time),
🡪 Speedup of the algorithm,
🡪 Total number of processors used, and
🡪 Total cost.
Speedup of an Algorithm
The speedup of a parallel algorithm over a corresponding sequential algorithm is the ratio of the
compute time for the sequential algorithm to the compute time for the parallel algorithm.
More precisely, speedup is defined as the ratio of the worst-case execution time of the fastest known
sequential algorithm for a particular problem to the worst-case execution time of the parallel algorithm:
Speedup = (worst-case execution time of the fastest known sequential algorithm for the problem)
          / (worst-case execution time of the parallel algorithm)
In symbols:
Sp = Ts / Tp

where Ts is the time needed by the most efficient sequential algorithm on a single processor and Tp is the
time needed to perform the same computation on a machine with p processors.
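As a small worked illustration of Sp = Ts / Tp, the following Python sketch computes a speedup from two run times. The timing values are hypothetical and chosen only for the example.

```python
def speedup(sequential_time, parallel_time):
    """Return Sp = Ts / Tp for positive execution times."""
    if sequential_time <= 0 or parallel_time <= 0:
        raise ValueError("execution times must be positive")
    return sequential_time / parallel_time


ts = 120.0  # hypothetical sequential run time in seconds
tp = 40.0   # hypothetical parallel run time in seconds
sp = speedup(ts, tp)
print(f"Sp = {sp}")
```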


Speedup depends strongly on the number of processors involved in the computation.
[Figure: number of processors vs. time to compute the parallel algorithm]
The goal of any scalable parallel system is to achieve linear speedup, although this is not easy:
🡪 It may not be possible to parallelize all parts of an application program. Parts that are sequential take
the same time to execute irrespective of the number of processors used, so the achievable speedup is limited
by these sequential parts.
🡪 Overhead is caused by initiation, synchronization, and communication among processors. Overhead tends to
increase as the number of processors in the system increases, and this puts an upper limit on the speedup
that can be achieved.
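The effect of these two limits can be illustrated with a toy timing model. The model below is an assumption made for this sketch, not a formula from these notes: it treats the parallel run time as a fixed sequential part, plus the parallelizable part divided by p, plus an overhead that grows linearly with the number of processors. All constants are hypothetical.

```python
def parallel_time(ts, serial_fraction, p, overhead_per_processor):
    """Hypothetical model of Tp: serial part + divided parallel part + overhead."""
    serial = serial_fraction * ts
    parallel = (1.0 - serial_fraction) * ts / p
    overhead = overhead_per_processor * (p - 1)  # assumed to grow linearly with p
    return serial + parallel + overhead


def modelled_speedup(ts, serial_fraction, p, overhead_per_processor):
    """Speedup Sp = Ts / Tp under the hypothetical model above."""
    return ts / parallel_time(ts, serial_fraction, p, overhead_per_processor)


TS = 100.0  # hypothetical sequential run time
F = 0.1     # hypothetical sequential fraction
C = 0.5     # hypothetical overhead per additional processor

speedups = {p: modelled_speedup(TS, F, p, C) for p in range(1, 65)}
best_p = max(speedups, key=speedups.get)

for p in sorted({1, 2, 4, 8, best_p, 32, 64}):
    print(f"p={p:2d}  speedup={speedups[p]:.2f}")
```

Under these assumptions the speedup rises at first, peaks at a moderate processor count, and then falls as overhead dominates.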

Number of Processors Used
🡪 The number of processors used is an important factor in analyzing the efficiency of a
parallel algorithm.
🡪 The cost to buy, maintain, and run the computers is taken into account.
🡪 The larger the number of processors an algorithm uses to solve a problem, the more costly
the obtained result becomes.
🡪 However, a larger number of processors may speed up the computation.
Total Cost
🡪 The cost of solving a problem on a parallel system is defined as the product of the run time and
the number of processors.
🡪 A cost-optimal parallel system solves a problem with a cost proportional to the execution time
of the fastest known sequential algorithm on a single processor.
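The cost definition (run time multiplied by the number of processors) can be illustrated with a short Python sketch. The timing values are hypothetical, and cost_ratio is a helper name introduced only for this example. Cost-optimality is a statement about how cost grows with problem size, so a single ratio only indicates how much extra work the parallel run performs relative to the sequential one.

```python
def parallel_cost(parallel_time, num_processors):
    """Cost of a parallel run: run time multiplied by the number of processors."""
    return parallel_time * num_processors


def cost_ratio(sequential_time, parallel_time, num_processors):
    """Parallel cost relative to the sequential run time (1.0 means no extra work)."""
    return parallel_cost(parallel_time, num_processors) / sequential_time


ts = 120.0  # hypothetical sequential run time in seconds
tp = 40.0   # hypothetical parallel run time in seconds
p = 4       # hypothetical number of processors

cost = parallel_cost(tp, p)
ratio = cost_ratio(ts, tp, p)
print(f"cost = {cost} processor-seconds, ratio to Ts = {ratio:.2f}")
```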
Efficiency
🡪 Efficiency describes how effectively the processors in a parallel model are utilized in solving a problem.
🡪 Efficiency is defined as the ratio of speedup to the number of processors. It
measures the fraction of time for which a processor is usefully employed.
🡪 If processors are idle for long periods or the overhead due to inter-processor
communication is high, efficiency is low.
Ep = Sp / p = Ts / (p × Tp)
🡪 Sp is the speedup of the system with p processors.
🡪 The value of Ep ranges from 0 to 1.
🡪 A parallel system with ideal speedup has an efficiency of 1, meaning the system is utilized in the best
possible way.
🡪 Another way of expressing the maximum speedup is Amdahl's law, which formulates
speedup as a function of the amount of parallelism and the computation that is inherently sequential.
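The efficiency formula Ep = Ts / (p × Tp) can be checked with a few lines of Python. The run times below are hypothetical values chosen for illustration.

```python
def efficiency(sequential_time, parallel_time, num_processors):
    """Return Ep = Ts / (p * Tp), i.e. speedup divided by the processor count."""
    if num_processors < 1:
        raise ValueError("at least one processor is required")
    return sequential_time / (num_processors * parallel_time)


# Hypothetical run: 120 s sequentially, 40 s on 4 processors.
ep = efficiency(120.0, 40.0, 4)
# Hypothetical ideal run: 120 s sequentially, 30 s on 4 processors.
ep_ideal = efficiency(120.0, 30.0, 4)
print(f"Ep = {ep}, ideal Ep = {ep_ideal}")
```

The first run achieves a speedup of 3 on 4 processors, so each processor is usefully employed three quarters of the time.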
Scalability
The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to
effectively utilize an increasing number of processors.
Equivalently, the scalability of a parallel system is a measure of its capability to increase speedup in
proportion to the number of processors.
Scalability, the ability to proportionally increase overall system capacity by adding hardware, is
important for clusters. Software scalability is also called parallelization efficiency: the
ratio of the actual speedup to the ideal speedup obtained with a given number of
processors.
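One way to inspect scalability is to tabulate the actual speedup against the ideal speedup for a growing number of processors. The sketch below does this in Python. The measured run times are hypothetical, and the one-processor run of the same program is assumed as the baseline in place of the fastest known sequential algorithm.

```python
# Hypothetical measurements: number of processors -> run time in seconds.
measured_times = {1: 100.0, 2: 52.0, 4: 28.0, 8: 17.0, 16: 12.0}


def scaling_table(times):
    """Return (p, actual speedup, actual/ideal ratio) for each processor count."""
    baseline = times[1]  # assumed baseline: the one-processor run
    rows = []
    for p in sorted(times):
        actual = baseline / times[p]
        ideal = float(p)  # ideal (linear) speedup on p processors
        rows.append((p, actual, actual / ideal))
    return rows


table = scaling_table(measured_times)
for p, actual, ratio in table:
    print(f"p={p:2d}  speedup={actual:5.2f}  actual/ideal={ratio:.2f}")
```

In this made-up data set the ratio falls steadily as processors are added, which is the typical sign of limited scalability.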
Amdahl’s Law
Amdahl's Law (1967): The speedup of a program using multiple processors in parallel computing is limited by
the time needed for the serial fraction of the problem.
More generally, Amdahl's Law predicts the overall system speedup when one component is enhanced.
The basic idea behind Amdahl's Law is to identify the portion of a program that can be parallelized and the portion
that must be executed sequentially. The speedup achievable by parallelizing a program is limited by the
non-parallelizable portion. The law is expressed by the following formula:

Speedup = 1 / (f + (1 − f) / p)

where:
● Speedup is the potential speedup of the parallelized system,
● f is the fraction of the program that must be executed sequentially (the non-parallelizable part),
● 1 − f is the fraction of the program that can be parallelized,
● p is the number of processors.
The formula suggests that as the number of processors (p) increases, the speedup approaches a limit determined
by the non-parallelizable fraction (f). In other words, even if you add more processors to a system, the overall
speedup will be limited by the sequential part of the program.
Amdahl's Law highlights the importance of identifying and optimizing the critical path in a program to achieve
meaningful speedup when parallelizing applications. It also serves as a cautionary reminder that not all programs
can be parallelized effectively, and improvements in performance may be limited by inherent sequential
dependencies.
Let's consider an example to illustrate Amdahl's Law:
Suppose we have a program with two parts: Part A, which represents 30% of the total execution time
and cannot be parallelized (sequential part), and Part B, which represents 70% of the total execution
time and can be parallelized.
f = 0.3 (portion that cannot be parallelized)
1 − f = 0.7 (portion that can be parallelized)
Using Amdahl's Law formula: Speedup = 1 / (0.3 + 0.7 / p)
Let's calculate the speedup for different values of p (the number of processors):
If we have only one processor (p = 1): Speedup = 1 / (0.3 + 0.7 / 1) = 1
This means that with one processor there is no speedup, because both the sequential part and the
parallelizable part run on that single processor.

If we have four processors (p = 4): Speedup = 1 / (0.3 + 0.7 / 4) = 1 / 0.475 ≈ 2.11

In this case, the speedup is approximately 2.11, meaning that the program could run more than twice as
fast with four processors compared to a single processor.
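The worked example can be reproduced and extended with a short Python sketch. The function names are chosen for this illustration; the formula is Speedup = 1 / (f + (1 − f) / p), with f the sequential fraction and p the number of processors, as used in the calculation above.

```python
def amdahl_speedup(serial_fraction, num_processors):
    """Speedup predicted by Amdahl's Law: 1 / (f + (1 - f) / p)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / num_processors)


def amdahl_limit(serial_fraction):
    """Upper bound on speedup as the number of processors grows without limit."""
    if serial_fraction == 0:
        return float("inf")
    return 1.0 / serial_fraction


f = 0.3  # sequential fraction from the example above
for p in (1, 4, 16, 64, 1024):
    print(f"p={p:4d}  speedup={amdahl_speedup(f, p):.2f}")
print(f"limit as p grows: {amdahl_limit(f):.2f}")
```

With f = 0.3 the speedup for four processors comes out at about 2.11, and no number of processors can push it beyond 1 / 0.3, roughly 3.33.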
What is Amdahl's law used for?
Amdahl's law is a formula that is used to calculate the theoretical speedup in latency of a
system when part of the system is improved. It is used to determine the maximum performance
improvement that can be achieved by optimising a specific portion of the system.
Drawbacks
Amdahl's law has a few drawbacks. These are as follows:
● Scaling falls off as the number of processors increases. This is due to synchronization
barriers (locks) and memory collisions.
● It isn't easy to compute the value of f. This is because the serial part occurs not only in the code
but also in the kernel and the hardware. Profiling is also an essential part of estimating it.
● The consistency of the private data caches in multiprocessor systems must also be
considered.
