Parallel Computation Models
Lecture 2
Slide 1
Parallel Computation Models
• PRAM (parallel RAM)
• Fixed Interconnection Network
– bus, ring, mesh, hypercube, shuffle-exchange
• Boolean Circuits
• Combinatorial Circuits
• BSP (Bulk Synchronous Parallel)
• LogP (L: upper bound on the latency, O: overhead, G: gap, P: number of processors)
Slide 2
PARALLEL AND
DISTRIBUTED COMPUTATION
• MANY INTERCONNECTED PROCESSORS WORKING CONCURRENTLY
[Figure: processors P1, P2, ..., Pn connected through an INTERCONNECTION NETWORK]
• CONNECTION MACHINE
• INTERNET: connects all the computers of the world
Slide 3
TYPES OF MULTIPROCESSING FRAMEWORKS
• PARALLEL
• DISTRIBUTED
TECHNICAL ASPECTS
• PARALLEL COMPUTERS (USUALLY) WORK IN TIGHT SYNCHRONY, SHARE MEMORY TO A LARGE EXTENT, AND HAVE A VERY FAST AND RELIABLE COMMUNICATION MECHANISM BETWEEN THEM.
• DISTRIBUTED COMPUTERS ARE MORE INDEPENDENT: COMMUNICATION IS LESS FREQUENT AND LESS SYNCHRONOUS, AND COOPERATION IS LIMITED.
PURPOSES
• PARALLEL COMPUTERS COOPERATE TO SOLVE DIFFICULT PROBLEMS MORE EFFICIENTLY (POSSIBLY).
• DISTRIBUTED COMPUTERS HAVE INDIVIDUAL GOALS AND PRIVATE ACTIVITIES; SOMETIMES COMMUNICATION WITH OTHERS IS NEEDED (E.G. DISTRIBUTED DATABASE OPERATIONS).
PARALLEL COMPUTERS: COOPERATION IN A POSITIVE SENSE
DISTRIBUTED COMPUTERS: COOPERATION IN A NEGATIVE SENSE, ONLY WHEN IT IS NECESSARY
Slide 4
FOR PARALLEL SYSTEMS
WE ARE INTERESTED IN SOLVING ANY PROBLEM IN PARALLEL
FOR DISTRIBUTED SYSTEMS
WE ARE INTERESTED IN SOLVING ONLY PARTICULAR PROBLEMS IN PARALLEL; TYPICAL EXAMPLES ARE:
• COMMUNICATION SERVICES
  ROUTING
  BROADCASTING
• MAINTENANCE OF CONTROL STRUCTURES
  SPANNING TREE CONSTRUCTION
  TOPOLOGY UPDATE
  LEADER ELECTION
• RESOURCE CONTROL ACTIVITIES
  LOAD BALANCING
  MANAGING GLOBAL DIRECTORIES
Slide 5
PARALLEL ALGORITHMS
• WHICH MODEL OF COMPUTATION IS THE BEST TO USE?
• HOW MUCH TIME DO WE EXPECT TO SAVE USING A PARALLEL ALGORITHM?
• HOW DO WE CONSTRUCT EFFICIENT ALGORITHMS?
MANY CONCEPTS OF COMPLEXITY THEORY MUST BE REVISITED
• IS PARALLELISM A SOLUTION FOR HARD PROBLEMS?
• ARE THERE PROBLEMS NOT ADMITTING AN EFFICIENT PARALLEL SOLUTION, THAT IS, INHERENTLY SEQUENTIAL PROBLEMS?
Slide 6
We need a model of computation
• NETWORK (VLSI) MODEL
• The processors are connected by a network of bounded degree.
• No shared memory is available.
• Several interconnection topologies.
• Synchronous way of operating.
MESH-CONNECTED ARRAY (N processors): degree = 4, diameter = 2√N
Slide 7
HYPERCUBE
[Figure: 4-dimensional hypercube with N = 2^4 = 16 processors, nodes labeled 0000 through 1111]
degree = 4 (log2 N)
diameter = 4
N = 2^4 PROCESSORS
Slide 8
Other important topologies
• binary trees
• mesh of trees
• cube-connected cycles
In the network model a PARALLEL MACHINE is a very complex ensemble of small interconnected units, performing elementary operations.
- Each processor has its own memory.
- Processors work synchronously.
LIMITS OF THE MODEL
• different topologies require different algorithms to solve the same problem
• it is difficult to describe and analyse algorithms (the migration of data has to be described)
A shared-memory model is more suitable from an algorithmic point of view
Slide 9
Model Equivalence
• given two models M1 and M2, and a problem Π of size n
• if M1 and M2 are equivalent, then solving Π requires:
  – T(n) time and P(n) processors on M1
  – T(n)^O(1) time and P(n)^O(1) processors on M2
Slide 10
PRAM
• Parallel Random Access Machine
• Shared-memory multiprocessor
• unlimited number of processors, each
  – has unlimited local memory
  – knows its ID
  – able to access the shared memory
• unlimited shared memory
Slide 11
MODEL: PRAM
[Figure: processors P1, P2, ..., Pi, ..., Pn connected to a Common Memory of cells 1, 2, 3, ..., m]
PRAM: n RAM processors connected to a common memory of m cells.
ASSUMPTION: at each time unit each Pi can read a memory cell, make an internal computation, and write another memory cell.
CONSEQUENCE: any pair of processors Pi, Pj can communicate in constant time!
Pi writes the message in cell x at time t
Pj reads the message from cell x at time t+1
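A minimal sketch of this constant-time communication, using ordinary Python as a stand-in for the PRAM instruction set (names are illustrative, not from the slides):
```python
# Two "processors" communicating in O(1) through a shared cell x.
shared = [None] * 8            # common memory with m = 8 cells

def Pi_write(x, message):      # time t:   Pi writes cell x
    shared[x] = message

def Pj_read(x):                # time t+1: Pj reads cell x
    return shared[x]

Pi_write(3, "token")
print(Pj_read(3))              # -> "token": communication in constant time
```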
Slide 12
PRAM
• Inputs/Outputs are placed in the shared memory (designated address)
• Memory cell stores an arbitrarily large integer
• Each instruction takes unit time
• Instructions are synchronized across the processors
Slide 13
PRAM Instruction Set
• accumulator architecture
– memory cell R0accumulates results
• multiply/divide instructions take only
constant operands
– prevents generating exponentially large
numbers in polynomial time
Slide 14
PRAM Complexity Measures
• for each individual processor
– time: number of instructions executed
– space: number of memory cells accessed
• PRAM machine
– time: time taken by the longest running processor
– hardware: maximum number of active processors
Slide 15
Two Technical Issues for PRAM
• How processors are activated
• How shared memory is accessed
Slide 16
Processor Activation
• P0 places the number of processors (p) in the
designated shared-memory cell
– each active Pi, where i < p, starts executing
– O(1) time to activate
– all processors halt when P0 halts
• Active processors explicitly activate additional processors via FORK instructions (see the sketch below)
  – tree-like activation
  – O(log p) time to activate
Slide 17
THE PRAM IS A THEORETICAL (UNFEASIBLE) MODEL
• The interconnection network between processors and memory would require a very large amount of area.
• The message-routing on the interconnection network would require time proportional to network size (i.e. the assumption of constant access time to the memory is not realistic).
WHY IS THE PRAM A REFERENCE MODEL?
• Algorithm designers can forget the communication problems and focus their attention on the parallel computation only.
• There exist algorithms simulating any PRAM algorithm on bounded-degree networks.
E.G. A PRAM algorithm requiring time T(n) can be simulated on a mesh of trees in time T(n)·log²n / log log n, that is, each step can be simulated with a slow-down of log²n / log log n.
• Instead of designing ad hoc algorithms for bounded-degree networks, design more general algorithms for the PRAM model and simulate them on a feasible network.
Slide 18
• For the PRAM model there exists a well developed body of techniques and methods to handle different classes of computational problems.
• The discussion on parallel models of computation is still HOT.
The actual trend: COARSE-GRAINED MODELS
• The degree of parallelism allowed is independent of the number of processors.
• The computation is divided into supersteps; each one includes (see the sketch below):
  • local computation
  • communication phase
  • synchronization phase
The study is still at the beginning!
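A sketch of one superstep under these assumptions (illustrative Python; the thread-based simulation and all names are assumptions, not the slides' notation):
```python
# One BSP-style superstep: local computation, then communication,
# then a barrier synchronization before the next superstep may begin.
import threading

P = 4
barrier = threading.Barrier(P)
inbox = [[] for _ in range(P)]           # messages visible next superstep

def superstep(pid):
    result = pid * pid                   # 1. local computation
    inbox[(pid + 1) % P].append(result)  # 2. communication phase
    barrier.wait()                       # 3. synchronization phase

threads = [threading.Thread(target=superstep, args=(i,)) for i in range(P)]
for t in threads: t.start()
for t in threads: t.join()
print(inbox)   # -> [[9], [0], [1], [4]]: every message delivered
```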
Slide 19
Metrics
A measure of relative performance between a multiprocessor system and a single-processor system is the speed-up S(p), defined as follows:
S(p) = (execution time using a single processor) / (execution time using a multiprocessor with p processors)
S(p) = T1 / Tp
Efficiency: Ep = Sp / p
Cost: Cp = p × Tp
Slide 20
Metrics
• A parallel algorithm is cost-optimal when parallel cost = sequential time:
  Cp = T1 (i.e. Ep = 100%)
• Critical when down-scaling: a parallel implementation may become slower than the sequential one.
Example: T1 = n^3; Tp = n^2.5 when p = n^2; hence Cp = p × Tp = n^4.5 > T1, so the algorithm is not cost-optimal.
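A quick numeric check of the example above (n chosen arbitrarily for illustration):
```python
# Metrics for the example T1 = n^3, Tp = n^2.5 with p = n^2 processors.
n = 100
T1 = n**3            # sequential time
p  = n**2            # number of processors
Tp = n**2.5          # parallel time
Sp = T1 / Tp         # speed-up   = sqrt(n)  -> 10.0
Ep = Sp / p          # efficiency = n^-1.5   -> 0.001, far from 100%
Cp = p * Tp          # cost       = n^4.5
print(Sp, Ep, Cp > T1)   # -> 10.0 0.001 True: not cost-optimal
```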
Slide 21
Amdahl’s Law
• f = fraction of the problem that’s inherently sequential
  (1 − f) = fraction that’s parallel
• Parallel time (normalizing T1 = 1):
  Tp = f + (1 − f)/p
• Speedup with p processors:
  Sp = 1 / (f + (1 − f)/p)
Slide 22
Amdahl’s Law
• Upper bound on speedup (p = ∞):
  S∞ = 1/f   (as p → ∞, the term (1 − f)/p converges to 0)
• Example: f = 2%
  S∞ = 1 / 0.02 = 50
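A numeric illustration of the bound: with f = 2% the speed-up approaches, but never exceeds, 1/f = 50 as p grows.
```python
# Amdahl's law: speed-up with sequential fraction f and p processors.
def amdahl_speedup(f, p):
    return 1.0 / (f + (1.0 - f) / p)

for p in (10, 100, 1000, 10**6):
    print(p, round(amdahl_speedup(0.02, p), 1))
# -> 10 8.5 / 100 33.6 / 1000 47.7 / 1000000 50.0
```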
Slide 23
PRAM
• Too many interconnections cause problems with synchronization
• However, it is the best conceptual model for designing efficient parallel algorithms
  – due to its simplicity and the possibility of efficiently simulating PRAM algorithms on more realistic parallel architectures
Slide 24
Shared-Memory Access
Concurrent (C): many processors can perform the operation simultaneously on the same memory location
Exclusive (E): not concurrent
• EREW (Exclusive Read Exclusive Write)
• CREW (Concurrent Read Exclusive Write)
  – Many processors can read the same location simultaneously, but only one may attempt to write to a given location
• ERCW (Exclusive Read Concurrent Write)
• CRCW (Concurrent Read Concurrent Write)
  – Many processors can read from and write to the same memory location
Slide 25
Example CRCW-PRAM
• Initially
– table A contains values 0 and 1
– output contains value 0
• The program computes the “Boolean OR” of
A[1], A[2], A[3], A[4], A[5]
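The program itself is not reproduced on this slide; the sketch below assumes the common CRCW idiom in which every processor holding a 1 concurrently writes 1 into the output cell (all writers store the same value, so the concurrent write is well defined and the OR takes O(1) time):
```python
# Hedged reconstruction of the CRCW "Boolean OR" program.
A = [0, 1, 0, 1, 0]        # A[1..5] from the slides (0-indexed here)
output = 0                 # designated shared cell, initially 0

# One "processor" per input element; the loop is a sequential stand-in
# for a single parallel time step in which all processors act at once.
for i in range(len(A)):
    if A[i] == 1:
        output = 1         # concurrent writes all store the same value 1

print(output)              # -> 1, the OR of A, computed in O(1) steps
```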
Slide 26
Example CREW-PRAM
• Assume initially table A contains [0,0,0,0,0,1] and we have the parallel program (not reproduced here; a sketch follows below)
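A hedged reconstruction, assuming the program computes the same Boolean OR as the CRCW example: without concurrent writes, the standard CREW approach combines pairs in a balanced tree, taking O(log n) steps instead of O(1).
```python
# CREW-style OR reduction: each active "processor" reads two distinct
# cells and writes one distinct cell, so no write conflicts occur.
import math

A = [0, 0, 0, 0, 0, 1, 0, 0]   # the slides' table, padded to a power of two

n = len(A)
for step in range(int(math.log2(n))):
    stride = 2 ** (step + 1)
    # on a real CREW PRAM these iterations run in parallel within one step
    for i in range(0, n, stride):
        A[i] = A[i] | A[i + stride // 2]

print(A[0])   # -> 1, the OR of the original table, after log2(n) steps
```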
Slide 27
Pascal triangle
PRAM CREW (sketch below)
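The slides give no code here; a hedged sketch of the idea: to build each new row, the processor for entry j reads two adjacent entries of the previous row, so neighbouring processors read the same cell concurrently (Concurrent Read) while each writes only its own cell (Exclusive Write).
```python
# Pascal's triangle on a CREW PRAM: each round computes one new row.
def pascal_row(n):
    row = [1]
    for _ in range(n):
        # the comprehension stands in for one parallel step: all entries
        # of the next row are computed simultaneously from shared reads
        row = [1] + [row[j - 1] + row[j] for j in range(1, len(row))] + [1]
    return row

print(pascal_row(4))   # -> [1, 4, 6, 4, 1]
```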
Slide 28