Introduction To Parallel Programming
Introduction To Parallel Programming
salah.elfalou@gmail.com
Introduction Parallel Programming
References
2
Introduction Parallel Programming
Parallel Computing
Definition [Wikipedia]
Objectives:
I Complete computation faster
I Allocate more computing resources to a given task
I Solve larger problems
I Overcome single machine limits (speed, memory)
I Address several problems simultaneously
I Enable concurrency
I Exploit hardware resources
I Use parallel features of modern architectures
C3
Introduction Parallel Programming
f o r ( i = 0 ; i < N; i ++)
f o r ( j = i ; j < N ; j ++)
f o r ( k = i ; k < N; k ++) ∑N 2 3
i=1 i flop ≈ N /3 flop
one_flop ( ) ;
C5
Introduction Parallel Programming
C6
Introduction Parallel Programming
C6
Introduction Parallel Programming
Performance Example
Matrix Multiplication
I C = A × B with matrix size 10000 × 10000
I 10000 times computation ci,j = ∑ ai,k × bk,j
I Computational cost = 2 × 104 × 104 × 104 = 1012 flop
Execution time:
I 1947 – ENIAC: 63 years
I 1997 – PC: 24 minutes
I 1997 – ASCI Red: 2 s
I 2019 – PC: 0.2 s → 5 per second
I 2019 – Summit: < 0.02 ms → 60000 per second
C7
Introduction Parallel Programming
C8
Introduction Parallel Programming
16 GB DRAM
256 KB L2 cache
C9
Introduction Parallel Programming
Types of Parallelism
C 11
Types of Parallelism Parallel Programming
Bit-Level Parallelism
Definition [Wikipedia]
C 12
Types of Parallelism Parallel Programming
Bit-Level Parallelism
Definition [Wikipedia]
Instruction-Level Parallelism
Definition [Wikipedia]
X1=A1+B1
X2=A2+B2
X3++
X4=X1+X2
C 13
Types of Parallelism Parallel Programming
Data-Level Parallelism
Definition [Wikipedia]
1 p i x e l = BLACK ;
else i f ( row == 3 && is_odd ( column ) )
2 p i x e l = BLACK ;
Rows 3 else i f ( row > 3 && i s _ o n _ b o r d e r ( row , column ) )
p i x e l = BLACK ;
4 else
5 p i x e l = WHITE ;
C 14
Types of Parallelism Parallel Programming
Data-Level Parallelism
Definition [Wikipedia]
1 p i x e l = BLACK ;
else i f ( row == 3 && is_odd ( column ) )
2 p i x e l = BLACK ;
Rows 3 else i f ( row > 3 && i s _ o n _ b o r d e r ( row , column ) )
p i x e l = BLACK ;
4 else
5 p i x e l = WHITE ;
C 14
Types of Parallelism Parallel Programming
Task/Thread-Level Parallelism
Definition [Wikipedia]
C 15
Types of Parallelism Parallel Programming
Task/Thread-Level Parallelism
Definition [Wikipedia]
2 3 4 6 5 8 9 5
C 15
Types of Parallelism Parallel Programming
Accelerator-Level Parallelism
Definition [Wikipedia]
C 16
Types of Parallelism Parallel Programming
Accelerator-Level Parallelism
Definition [Wikipedia]
C 16
Flynn’s Taxonomy Parallel Programming
Flynn’s Taxonomy
C 17
Flynn’s Taxonomy Parallel Programming
Flynn’s Taxonomy
Definition [Wikipedia]
Classification
Single Instruction stream Multiple
C 18
Flynn’s Taxonomy Parallel Programming
Instruction memory
I Standard uniprocessor Instruction
stream
I Sequential processing
PU
I Von Neuman architecture
Data
stream
I E.g., ENIAC, Intel 8088
Data memory
C 19
Flynn’s Taxonomy Parallel Programming
Data memory
I E.g., modern processor vector
instructions, GPUs
C 20
Flynn’s Taxonomy Parallel Programming
C 21
Flynn’s Taxonomy Parallel Programming
C 22
Flynn’s Taxonomy Parallel Programming
shared variables
I Hardware and system support Interconnect
for synchronization
I Easier to program (but beware Memory
of race conditions)
I E.g., multi-core processors OpenMP
C 23
Flynn’s Taxonomy Parallel Programming