Amdahl’s Law
• If a fraction 1/s of the program is sequential, then the speedup can never exceed s.
– (Normalized) sequential execution time = 1/s + (1 - 1/s) = 1
– Best parallel execution time on p processors = 1/s + (1 - 1/s)/p
– As p goes to infinity, parallel execution time approaches 1/s
– Maximum speedup = 1 / (1/s) = s
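The limiting behavior can be checked numerically; a minimal sketch in C (the choice s = 10 is illustrative, not from the slides):

```c
#include <stdio.h>

/* Amdahl's Law: with a sequential fraction 1/s, the speedup on p
 * processors is 1 / (1/s + (1 - 1/s)/p), which approaches s as p
 * grows without bound. */
double speedup(double s, double p) {
    double seq = 1.0 / s;                  /* sequential fraction */
    return 1.0 / (seq + (1.0 - seq) / p);
}

/* Print the speedup for growing processor counts. */
void table(void) {
    double s = 10.0;   /* illustrative: 1/10 of the program is sequential */
    for (double p = 1.0; p <= 1e6; p *= 10.0)
        printf("p = %8.0f  speedup = %6.3f\n", p, speedup(s, p));
    /* the printed speedups climb toward, but never reach, s = 10 */
}
```

Calling table() shows the speedup saturating near 10 even with a million processors.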
Why keep something sequential?
● Some parts of the program are not parallelizable (because of dependencies).
● Some parts may be parallelizable, but the parallelization overhead dwarfs the speedup gained.
How could two statements execute in parallel?
● On one processor:
statement1;
statement2;
● On two processors:
processor1:      processor2:
statement1;      statement2;
Fundamental Assumption
● Processors execute independently: we have no control over the order of execution among processors.
How could two statements execute in parallel?
• Possibility 1
Processor1:      Processor2:
statement1;
                 statement2;
• Possibility 2
Processor1:      Processor2:
                 statement2;
statement1;
How could two statements execute in parallel?
● Their order of execution must not matter!
● In other words,
statement1; statement2;
must be equivalent to
statement2; statement1;
Example 1
a = 1;
b = a;
● Statements cannot be executed in parallel: the second reads the value written by the first.
● Program modifications may make parallel execution possible.
Example 2
a = f(x);
b = a;
● Same dependence, but it may not be wise to change the program: recomputing f(x) for b could be expensive.
Example 3
a = 1;
a = 2;
● Statements cannot be executed in parallel: both write a, and the final value must come from the second write.
Dependencies
● What prevents us from parallelizing the previous examples is the dependency between the statements.
Types of Dependencies
● True dependency
● Anti-dependency
● Output dependency
True dependence
Statements S1, S2.
S2 has a true dependence on S1 iff
S2 reads a value written by S1.
Anti-dependence
Statements S1, S2.
S2 has an anti-dependence on S1 iff
S2 writes a value read by S1.
Output Dependence
Statements S1, S2.
S2 has an output dependence on S1 iff
S2 writes a variable written by S1.
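All three definitions can be seen in one straight-line snippet; a sketch in C (the statement labels S1–S4 are illustrative):

```c
#include <assert.h>

/* Straight-line code exhibiting all three dependence types. */
int dependence_demo(void) {
    int x = 1, a, b;

    a = x + 1;  /* S1: reads x, writes a                           */
    b = a;      /* S2: reads a written by S1  -> true dependence   */
    x = 2;      /* S3: writes x read by S1    -> anti-dependence   */
    b = 5;      /* S4: writes b written by S2 -> output dependence */

    /* Reordering any dependent pair would change these values. */
    assert(a == 2 && b == 5 && x == 2);
    return a + b + x;   /* 2 + 5 + 2 = 9 */
}
```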
How could two statements execute in parallel?
S1 and S2 can execute in parallel iff
there are no dependencies between S1 and S2:
– true dependency
– anti-dependency
– output dependency
Some dependencies can be removed.
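Anti- and output dependences are conflicts on storage names, not on values, so they can often be removed by renaming; a sketch (x2 is a hypothetical fresh variable introduced for illustration):

```c
/* Before renaming:
 *     a = x + 1;   // S1 reads x
 *     x = 2;       // S2 writes x: anti-dependence, order matters
 * After renaming, S2 writes a fresh variable, the two statements are
 * independent, and later uses of x must be renamed to x2. */
int renaming_demo(void) {
    int x = 1, a, x2;
    a  = x + 1;   /* S1 */
    x2 = 2;       /* S2, renamed: no dependence on S1 remains */
    return a + x2;   /* 2 + 2 = 4 regardless of execution order */
}
```

A true dependence cannot be removed this way: it expresses a real flow of values.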
Example 4
● Most parallelism occurs in loops.
for(i=0; i<100; i++)
a[i] = i;
● No dependencies.
● Iterations can be executed in parallel.
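Since no iteration touches another's element, the loop can be handed to multiple processors. One common way to express this in C is an OpenMP pragma (an assumption here; the slides do not prescribe a mechanism, and the pragma is ignored when OpenMP is not enabled):

```c
int a[100];

/* Each iteration writes a distinct a[i], so the iterations may run
 * in any order, or simultaneously on different processors.
 * Compile with -fopenmp (gcc/clang) to actually run in parallel. */
void fill(void) {
    #pragma omp parallel for
    for (int i = 0; i < 100; i++)
        a[i] = i;
}
```

After fill(), a[i] == i for every i, regardless of how iterations were scheduled.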
Example 5
for(i=0; i<100; i++) {
a[i] = i;
b[i] = 2*i;
}
Iterations and statements can be executed in parallel.
Example 6
for(i=0;i<100;i++) a[i] = i;
for(i=0;i<100;i++) b[i] = 2*i;
Iterations and loops can be executed in parallel.
Example 7
for(i=0; i<100; i++)
a[i] = a[i] + 100;
● There is a dependence of a[i] on itself, but it stays within a single iteration.
● Loop is still parallelizable: no iteration touches another iteration's element.
Example 8
for( i=1; i<100; i++ )
a[i] = f(a[i-1]);
● Each iteration reads a[i-1], written by the previous iteration.
● Loop iterations are not parallelizable.
Loop-carried dependence
● A loop-carried dependence is a dependence between different iterations of a loop: it is present only when the statements execute as part of the loop.
● Otherwise, we call it a loop-independent dependence.
● Loop-carried dependencies prevent loop iteration parallelization.
Example 9
for(i=0; i<100; i++ )
for(j=1; j<100; j++ )
a[i][j] = f(a[i][j-1]);
● The dependence on a[i][j-1] is not carried by the i loop.
● It is a loop-carried dependence on the j loop.
● The outer (i) loop can be parallelized; the inner (j) loop cannot.
Example 10
for( j=1; j<100; j++ )
for( i=0; i<100; i++ )
a[i][j] = f(a[i][j-1]);
● The inner (i) loop can be parallelized; the outer (j) loop cannot.
● Less desirable situation: parallel work must be started and synchronized on every outer iteration.
● Loop interchange is sometimes possible.
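For Example 10, interchanging the two loops yields the form of Example 9; a sketch with a stand-in pure function f (assumption: f has no side effects, otherwise the interchange may be illegal):

```c
static double f(double x) { return x + 1.0; }   /* stand-in for f */
static double a[100][100];                      /* zero-initialized */

/* Example 10 after loop interchange: i is now outermost and carries
 * no dependence, so the OUTER loop is parallelizable; the j loop
 * still carries the dependence and must stay sequential. */
double interchange_demo(void) {
    for (int i = 0; i < 100; i++)        /* parallelizable */
        for (int j = 1; j < 100; j++)    /* sequential chain */
            a[i][j] = f(a[i][j-1]);
    return a[0][99];   /* 99 applications of f starting from a[0][0] = 0 */
}
```

The interchange is legal here because the only dependence (along j) is preserved in each row.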
Level of loop-carried dependence
● The level of a loop-carried dependence is the nesting depth of the loop that carries it.
● It indicates which loops can be parallelized.
Be careful … Example 11
printf("a");
printf("b");
Statements have a hidden output dependence
due to the output stream.
Be careful … Example 12
a = f(x);
b = g(x);
Statements could have a hidden dependence
if f and g update the same variable.
It also depends on what f and g can do to x.
Be careful … Example 13
for(i=0; i<100; i++)
a[i+10] = f(a[i]);
● Dependence chain a[0] → a[10] → a[20] → …
● Dependence chain a[1] → a[11] → a[21] → …
● … and so on, one chain per index mod 10.
● Some parallel execution is possible.
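The dependence distance is 10, so any 10 consecutive iterations touch disjoint chains; a sketch that blocks the loop accordingly (f is a stand-in pure function):

```c
static double f(double x) { return x + 1.0; }   /* stand-in for f */
static double a[110];                           /* zero-initialized */

/* Example 13 blocked: a[i+10] depends on a[i] (distance 10), so the
 * 10 iterations inside one block are mutually independent and could
 * run on 10 processors; the blocks themselves must run in order. */
void blocked(void) {
    for (int b = 0; b < 100; b += 10)        /* sequential */
        for (int i = b; i < b + 10; i++)     /* parallelizable */
            a[i + 10] = f(a[i]);
}
```

After blocked(), a[10k] holds k for k = 1..10: each chain advanced one step per block.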
Be careful … Example 14
for( i=1; i<100;i++ ) {
a[i] = …;
... = a[i-1];
}
● Dependence between a[i] and a[i-1].
● Complete parallel execution impossible.
● Pipelined parallel execution possible.
Be careful … Example 15
for( i=0; i<100; i++ )
a[i] = f(a[indexa[i]]);
We cannot tell for sure whether the iterations are independent.
● Parallelization depends on knowledge of the values in indexa[].
● The user can tell; the compiler cannot.
Optimizations: Example 16
for (i = 0; i < 100000; i++)
a[i + 1000] = a[i] + 1;
Cannot be parallelized as it is: iteration i + 1000 reads the value written by iteration i.
May be parallelized by applying certain code transformations.
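One such transformation: the dependence a[i+1000] ← a[i] links only indices that are equal mod 1000, so the loop splits into 1000 independent chains; a sketch (each chain keeps its original update order):

```c
static int a[101000];   /* zero-initialized; indices 0 .. 100999 */

/* Example 16 restructured into 1000 independent chains.  Chain c
 * updates a[c+1000], a[c+2000], ... in order; chains never touch
 * each other's elements, so the outer loop is parallelizable. */
void chains(void) {
    for (int c = 0; c < 1000; c++)        /* parallelizable */
        for (int k = 0; k < 100; k++)     /* sequential within a chain */
            a[c + (k + 1) * 1000] = a[c + k * 1000] + 1;
}
```

The two loops together perform the original 100,000 updates, just grouped by chain instead of by i.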
An aside
● Parallelizing compilers analyze program dependencies to decide parallelization.
● In parallelization by hand, the user does the same analysis.
● The compiler is more convenient and less error-prone.
● The user is more powerful and can analyze more patterns.
To remember
● Statement order must not matter.
● Statements must not have dependencies.
● Some dependencies can be removed.
● Some dependencies may not be obvious.