Automatic Parallelization - 2
Y.N. Srikant
Department of Computer Science
Indian Institute of Science
Bangalore 560 012
NPTEL Course on Principles of Compiler Design
Y.N. Srikant Automatic Parallelization
Data Dependence Relations
Data Dependence Direction Vector
Data dependence relations are augmented with a direction of data dependence, the direction vector
There is one direction vector component for each loop in a nest of loops
The data dependence direction vector (or simply direction vector) is Ψ = (Ψ1, Ψ2, ..., Ψd), where Ψk ∈ {<, =, >, ≤, ≥, ≠, ∗}
Forward or “<” direction means dependence from iteration i to i + k (i.e., computed in iteration i and used in iteration i + k)
Backward or “>” direction means dependence from iteration i to i − k (i.e., computed in iteration i and used in iteration i − k). This is not possible in a single loop, but is possible with two or more levels of nesting
Equal or “=” direction means that the dependence is within the same iteration (i.e., computed in iteration i and used in iteration i)
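The three direction components can be illustrated with minimal C loops (a sketch, not from the slides; function and array names are invented for illustration):

```c
#include <assert.h>

#define N 8

/* "<" (forward) direction: a[i] is computed in iteration i and used
   (as a[i-1]) in iteration i+1, so the loop carries a dependence. */
void forward_dir(int a[N]) {
    for (int i = 1; i < N; i++)
        a[i] = a[i - 1] + 1;
}

/* "=" direction: b[i] is computed and used in the same iteration;
   iterations are independent of one another. */
void equal_dir(int b[N], const int c[N]) {
    for (int i = 0; i < N; i++)
        b[i] = c[i] * 2;
}

/* Directions (<, >) in a nest: x[i][j] uses x[i-1][j+1], i.e. forward
   in the outer loop and backward in the inner loop. A ">" component
   alone is impossible in a single loop. */
void mixed_dirs(int x[N][N]) {
    for (int i = 1; i < N; i++)
        for (int j = 0; j < N - 1; j++)
            x[i][j] = x[i - 1][j + 1] + 1;
}
```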
Direction Vector Example 1
Direction Vector Example 2
Direction Vector Example 3
Direction Vector Example 4
Data Dependence Graph and Vectorization
Individual nodes are statements of the program, and edges depict data dependences among the statements
If the DDG is acyclic, then vectorization of the program is possible and straightforward
Vector code generation can be done using a topological sort order on the DDG
Otherwise, find all the strongly connected components (SCCs) of the DDG, and reduce the DDG to an acyclic graph by treating each SCC as a single node
SCCs cannot be fully vectorized; the final code will contain some sequential loops and possibly some vector code
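As a sketch of this idea (not from the slides; names are invented): a loop whose body mixes a cycle-free statement with a recurrence can be distributed, in topological order of the condensed DDG, into a vectorizable loop and a sequential one:

```c
#include <assert.h>

#define N 8

/* Original loop: S1 has no dependence cycle, but S2 is a recurrence
   (an SCC by itself), so the loop as a whole cannot be vectorized. */
void original(int a[N], int s[N], const int c[N]) {
    for (int i = 1; i < N; i++) {
        a[i] = c[i] * 2;          /* S1: no cycle   */
        s[i] = s[i - 1] + a[i];   /* S2: recurrence */
    }
}

/* After distribution in topological order of the condensed DDG:
   the S1 loop is a single vector operation; the S2 loop stays serial. */
void distributed(int a[N], int s[N], const int c[N]) {
    for (int i = 1; i < N; i++)   /* vectorizable: a[1..N-1] = 2*c[1..N-1] */
        a[i] = c[i] * 2;
    for (int i = 1; i < N; i++)   /* sequential recurrence */
        s[i] = s[i - 1] + a[i];
}
```

The "=" dependence from S1 to S2 (S2 reads a[i] written by S1) is preserved because the S1 loop is emitted first.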
Data Dependence Graph and Vectorization
If all the dependence relations in a loop nest have a direction vector value of “=” for a loop, then the iterations of that loop can be executed in parallel, with no synchronization between iterations
Any dependence with a forward (<) direction in an outer loop will be satisfied by the serial execution of the outer loop
If an outer loop L is run in sequential mode, then all the dependences with a forward (<) direction at the outer level (of L) will be automatically satisfied (even those of the loops inner to L)
However, this is not true for dependences with an “=” direction at the outer level; the dependences of the inner loops will have to be satisfied by appropriate statement ordering and loop execution order
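A small C sketch of the first two points (not from the slides; names are invented). The dependence below has direction (<, =): serial execution of the outer i-loop satisfies it, so the inner j iterations may run in any order — here simulated by simply reversing the inner loop:

```c
#include <assert.h>

#define N 6

/* Dependence x[i][j] -> x[i-1][j] has direction (<, =): forward at the
   outer level, equal at the inner level. Running the i-loop serially
   satisfies it, so the j-loop can run in parallel. */
void serial_nest(int x[N][N]) {
    for (int i = 1; i < N; i++)
        for (int j = 0; j < N; j++)
            x[i][j] = x[i - 1][j] + 1;
}

/* Same nest with the inner loop reversed: a stand-in for executing the
   j iterations in an arbitrary (parallel) order. The result is the
   same because no dependence is carried by the j-loop. */
void inner_reversed(int x[N][N]) {
    for (int i = 1; i < N; i++)
        for (int j = N - 1; j >= 0; j--)
            x[i][j] = x[i - 1][j] + 1;
}
```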
Vectorization Example 1
Vectorization Example 2.1
Vectorization Example 2.2
Vectorization Example 2.3
Vectorization Example 2.4
Vectorization Example 2.5
Vectorization Example 2.6
Concurrentization Examples
Loop Transformations for Increasing Parallelism
Recurrence breaking
Ignorable cycles
Scalar expansion
Scalar renaming
Node splitting
Threshold detection and index set splitting
If-conversion
Loop interchanging
Loop fission
Loop fusion
Scalar Expansion
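The slide's worked example is a figure; a minimal C sketch of the transformation (names invented) looks as follows. The scalar t is expanded into an array so that each iteration owns its own copy, removing the loop-carried anti and output dependences on t:

```c
#include <assert.h>

#define N 8

/* Scalar t creates loop-carried anti and output dependences that
   serialize the loop. */
void with_scalar(int a[N], const int b[N], const int c[N]) {
    int t;
    for (int i = 0; i < N; i++) {
        t = b[i] + c[i];
        a[i] = t * t;
    }
}

/* Scalar expansion: t becomes t_ex[i]; every iteration writes its own
   element, and the dependences on t disappear, so the loop can be
   vectorized or run in parallel. */
void expanded(int a[N], const int b[N], const int c[N]) {
    int t_ex[N];
    for (int i = 0; i < N; i++) {
        t_ex[i] = b[i] + c[i];
        a[i] = t_ex[i] * t_ex[i];
    }
}
```

The cost is the extra array t_ex — which is exactly why, as the next slide notes, scalar expansion is not always profitable.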
Scalar Expansion is not always profitable
Scalar Renaming
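The renaming example on this slide is a figure; a minimal C sketch of the idea (names invented): one scalar holds two unrelated values, and giving each value its own name breaks the false dependences between the two statement groups:

```c
#include <assert.h>

#define N 8

/* The same scalar t holds two unrelated values, creating spurious
   output and anti dependences between the two statement groups. */
void before_renaming(int a[N], int d[N], const int b[N], const int c[N]) {
    int t;
    for (int i = 0; i < N; i++) {
        t = b[i] + 1;
        a[i] = t * 2;
        t = c[i] - 1;   /* unrelated reuse of t */
        d[i] = t * 3;
    }
}

/* Renaming the second value to t2 removes the false dependences, so
   the two groups can be distributed or vectorized independently. */
void after_renaming(int a[N], int d[N], const int b[N], const int c[N]) {
    int t1, t2;
    for (int i = 0; i < N; i++) {
        t1 = b[i] + 1;
        a[i] = t1 * 2;
        t2 = c[i] - 1;
        d[i] = t2 * 3;
    }
}
```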
If-Conversion
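The if-conversion example on this slide is a figure; the transformation can be sketched in C as follows (names invented). The branch in the loop body becomes a predicate plus a guarded (select-style) assignment, turning control dependence into data dependence and leaving a straight-line, vectorizable body:

```c
#include <assert.h>

#define N 8

/* Branch inside the loop: the control dependence blocks vectorization. */
void with_branch(int a[N], const int b[N]) {
    for (int i = 0; i < N; i++) {
        if (b[i] > 0)
            a[i] = b[i];
        else
            a[i] = -b[i];
    }
}

/* If-converted form: the branch becomes a predicate and a select on
   that predicate; the body is now branch-free. */
void if_converted(int a[N], const int b[N]) {
    for (int i = 0; i < N; i++) {
        int p = (b[i] > 0);        /* predicate       */
        a[i] = p ? b[i] : -b[i];   /* guarded select  */
    }
}
```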
Loop Interchange
For machines with vector instructions, inner loops are preferable for vectorization, and loops can be interchanged to enable this
For multi-core and multi-processor machines, parallel outer loops are preferred, and loop interchange may help to make this happen
Requirements for simple loop interchange:
1 The loops L1 and L2 must be tightly nested (no statements between the loops)
2 The loop limits of L2 must be invariant in L1
3 There are no statements Sv and Sw (not necessarily distinct) in L1 with a dependence Sv δ(<,>) Sw
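A minimal C sketch of a legal, profitable interchange (not from the slides; names invented). The dependence x[i][j] → x[i][j−1] has direction (=, <), carried by the inner j-loop. After interchange the components swap to (<, =), still lexicographically positive (and not of the forbidden (<, >) form), so the interchange is legal; the new inner i-loop carries no dependence and is vectorizable:

```c
#include <assert.h>

#define N 6

/* Original order: the inner j-loop carries the dependence
   x[i][j] -> x[i][j-1] with direction (=, <), blocking inner-loop
   vectorization. */
void ij_order(int x[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 1; j < N; j++)
            x[i][j] = x[i][j - 1] + 1;
}

/* Interchanged order: the direction vector becomes (<, =); the inner
   i-loop is now dependence-free and can be vectorized. */
void ji_order(int x[N][N]) {
    for (int j = 1; j < N; j++)
        for (int i = 0; i < N; i++)
            x[i][j] = x[i][j - 1] + 1;
}
```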
Loop Interchange for Vectorizability
Loop Interchange for Parallelizability
Legal Loop Interchange
Illegal Loop Interchange
Legal but not beneficial Loop Interchange
Loop Fission - Motivation
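The motivating example on this slide is a figure; a minimal C sketch of a legal fission (names invented). S2 reads a[i−1], written by S1 one iteration earlier (direction “<”); since the dependence runs from S1 to S2, splitting the loop with the S1 loop first preserves it, and each resulting loop is simpler to parallelize or vectorize:

```c
#include <assert.h>

#define N 8

/* Fused loop: S2 reads a[i-1], written by S1 in the previous
   iteration (a forward, "<" dependence from S1 to S2). */
void fused(int a[N], int c[N], const int b[N]) {
    for (int i = 1; i < N; i++) {
        a[i] = b[i] + 1;       /* S1 */
        c[i] = a[i - 1] * 2;   /* S2 */
    }
}

/* Fission is legal because every dependence between the statements
   runs from the first loop to the second; all of a[] is written
   before any c[i] reads it. */
void fissioned(int a[N], int c[N], const int b[N]) {
    for (int i = 1; i < N; i++)
        a[i] = b[i] + 1;
    for (int i = 1; i < N; i++)
        c[i] = a[i - 1] * 2;
}
```

Fission in the opposite order (the S2 loop first) would violate the dependence — the flavor of illegal fission the next slide contrasts.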
Loop Fission: Legal and Illegal