Scalar reduction pattern

A scalar reduction combines multiple values into a single element (the scalar reduction variable) by repeatedly applying an associative, commutative operator. The pattern is typically expressed as a loop:

for (int i = 0; i < N; ++i) {
  v = v ⨁ ...
}

Here v is the scalar reduction variable and ⨁ is the reduction operator. Because ⨁ is associative and commutative, the order in which the computations are performed does not alter the final result of the reduction.
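As a minimal sketch of this order independence, the two functions below apply the same + reduction to an array front to back and back to front. The function names are hypothetical and chosen only for illustration. Integer values are used because integer addition is exactly associative; floating-point addition is only approximately so, and reordering it may change the last bits of the result.

```c
#include <assert.h>

/* Hypothetical helper: reduces A[0..N-1] with + from first to last element. */
long reduce_forward(const long *A, int N) {
  assert(N >= 0);
  long v = 0; /* identity element of + */
  for (int i = 0; i < N; ++i) {
    v = v + A[i];
  }
  return v;
}

/* Hypothetical helper: same reduction, visiting the elements in reverse. */
long reduce_backward(const long *A, int N) {
  assert(N >= 0);
  long v = 0;
  for (int i = N - 1; i >= 0; --i) {
    v = v + A[i];
  }
  return v;
}
```

Both functions return the same value for any input that does not overflow, which is the property a parallel implementation relies on when it splits the iterations among threads.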

Code examples

C

for (int i = 0; i < N; ++i) {
  sum += A[i];
}

Fortran

do i = 1, n
  sum = sum + A(i)
end do

Parallelizing scalar reductions with OpenMP and OpenACC

Computing a scalar reduction in parallel leads to concurrent read-write accesses to the scalar reduction variable. A scalar reduction can therefore be computed in parallel safely only if additional synchronization is inserted to avoid race conditions on the reduction variable.

Scalar reductions can be parallelized in multiple ways, including:

Use a built-in OpenMP/OpenACC reduction.
Parallelize across loop iterations, but calculate the reduction within an atomic or critical region.
Parallelize the loop by creating a private copy of the reduction variable for each thread. The loop calculation is then followed by a separate reduction using an atomic operation.
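The three strategies above could be sketched in C roughly as follows, using the sum example from earlier. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, the array is assumed to hold N valid elements, and the code is assumed to be built with OpenMP (or OpenACC) support enabled. Without that support the directives are ignored and each function runs sequentially with the same result.

```c
#include <assert.h>

/* Strategy 1a: built-in OpenMP reduction clause.
   Each thread accumulates into its own copy of sum; the runtime combines them. */
double sum_builtin_reduction(const double *A, int N) {
  assert(N >= 0);
  double sum = 0.0;
  #pragma omp parallel for reduction(+ : sum)
  for (int i = 0; i < N; ++i) {
    sum += A[i];
  }
  return sum;
}

/* Strategy 1b: the equivalent built-in OpenACC reduction clause. */
double sum_openacc_reduction(const double *A, int N) {
  assert(N >= 0);
  double sum = 0.0;
  #pragma acc parallel loop reduction(+ : sum) copyin(A[0:N])
  for (int i = 0; i < N; ++i) {
    sum += A[i];
  }
  return sum;
}

/* Strategy 2: parallelize across iterations and protect every update
   of the shared variable with an atomic operation. */
double sum_atomic(const double *A, int N) {
  assert(N >= 0);
  double sum = 0.0;
  #pragma omp parallel for shared(sum)
  for (int i = 0; i < N; ++i) {
    #pragma omp atomic update
    sum += A[i];
  }
  return sum;
}

/* Strategy 3: explicit private copy per thread, followed by a separate
   reduction step with one atomic update per thread. */
double sum_private_copies(const double *A, int N) {
  assert(N >= 0);
  double sum = 0.0;
  #pragma omp parallel
  {
    double local_sum = 0.0; /* private: declared inside the parallel region */
    #pragma omp for nowait
    for (int i = 0; i < N; ++i) {
      local_sum += A[i];
    }
    #pragma omp atomic update
    sum += local_sum;
  }
  return sum;
}
```

The second strategy synchronizes on every iteration, so it typically serializes most of the work. The built-in reduction and the explicit private copies synchronize only once per thread, which is why they are usually the preferred choices.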
