Skip to main content

PWD007: Unprotected multithreading recurrence

Issue

An unprotected multithreading recurrence in parallel code is causing a data race.

Actions

Protect the recurrence or execute the code sequentially if that is not possible.

Relevance

The recurrence computation pattern occurs when the same memory position is read and written to, at least once, in different iterations of a loop. It englobes both true dependencies (read-after-write) and anti-dependencies (write-after- read) across loop iterations. Sometimes the term "loop-carried dependencies" is also used. If a loop with a recurrence computation pattern is parallelized without protecting the concurrent memory access, a data race condition is introduced. In some cases, the concurrent memory access can not be protected and thus the loop can not be parallelized.

Code example

C

The following code performs an exclusive scan, naively parallelized using multithreading:

void foo(int *x, int *y, int size) {
y[0] = 0;

#pragma omp parallel for
for (int i = 1; i < size; i++) {
y[i] = y[i - 1] + x[i - 1];
}
}

This approach is incorrect due to the dependence between two consecutive iterations of the loop. Note how the value y(i) is calculated based on the previous y(i - 1).

Since OpenMP 5.0, this type of computation can be conveniently parallelized using the scan directive:

void foo(int *x, int *y, int size) {
int scan_x = 0;

#pragma omp parallel for reduction(inscan, +:scan_x)
for (int i = 0; i < size; i++) {
y[i] = scan_x;
#pragma omp scan exclusive(scan_x)
scan_x += x[i];
}
}

Fortran

The following code performs an exclusive scan, naively parallelized using multithreading:

subroutine foo(x, y)
implicit none
integer, intent(in) :: x(:)
integer, intent(inout) :: y(:)
integer :: i

y(1) = 0

!$omp parallel do
do i = 2, size(y, 1)
y(i) = y(i - 1) + x(i - 1)
end do
!$omp end parallel do
end subroutine foo

This approach is incorrect due to the dependence between two consecutive iterations of the loop. Note how the value y(i) is calculated based on the previous y(i - 1).

Since OpenMP 5.0, this type of computation can be conveniently parallelized using the scan directive:

subroutine foo(x, y)
implicit none
integer, intent(in) :: x(:)
integer, intent(out) :: y(:)
integer :: scan_x, i

scan_x = 0

!$omp parallel do reduction(inscan, +:scan_x)
do i = 1, size(y, 1)
y(i) = scan_x
!$omp scan exclusive(scan_x)
scan_x = scan_x + x(i)
end do
!$omp end parallel do
end subroutine foo

References