PWR074: Pass only required fields from derived type as arguments to increase code clarity

Issue

Pass only used fields from derived data types as arguments to increase code clarity.

Actions

Pass the used fields as separate arguments instead of the whole derived type.

Relevance

Derived data types are convenient constructs to group and move around related variables. While in many cases this is an effective method to organize data, it can also obscure a function's purpose and introduce unneeded dependencies.

This is specifically the case for Plain Old Data types, such as C structs or Fortran derived types that do not use the Object-Oriented features available since Fortran 2003. In C++ or Fortran code with an Object-Oriented design, a better alternative is to use encapsulation to avoid depending on the implementation of the type.

Functions that take these Plain Old Data types as arguments should use most, if not all, of their fields. Passing only the fields that are actually needed promotes data hiding, makes inputs and outputs more explicit, and helps prevent unintended modifications of variables.
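The difference in explicitness can be sketched in C as follows. This is a minimal illustration, not part of the original check: the `particle` type and both function names are hypothetical. The first function takes a non-const pointer to the whole struct, so its signature reveals neither which fields are read nor whether any are modified. The second takes only the two values it needs, by value, so it cannot alter the caller's data.

```c
// sketch.c (hypothetical names, for illustration only)
#include <assert.h>

typedef struct {
  double mass;
  double velocity;
  double charge; // not needed to compute kinetic energy
} particle;

// Whole-struct version: the signature hides which fields are used, and the
// non-const pointer would allow any field to be modified
double kinetic_energy_struct(particle *p) {
  return 0.5 * p->mass * p->velocity * p->velocity;
}

// Field-only version: the inputs are explicit and passed by value, so the
// caller's struct cannot be changed
double kinetic_energy_fields(double mass, double velocity) {
  return 0.5 * mass * velocity * velocity;
}

void sketch(void) {
  particle p = {2.0, 3.0, -1.0};
  // Both versions evaluate the same expression: 0.5 * 2 * 3 * 3
  assert(kinetic_energy_struct(&p) ==
         kinetic_energy_fields(p.mass, p.velocity));
}
```

A reader of `kinetic_energy_fields` can tell from the signature alone that `charge` plays no role in the computation.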

Note: This issue can also impact optimization. See check PWR012 for more details.

Code example

C

In the following example, a struct containing two arrays is passed to the foo function, which only uses one of the arrays:

// example.c
#include <stdlib.h>

typedef struct {
  int A[1000];
  int B[1000];
} data;

// foo receives the whole struct but only reads field A
__attribute__((pure)) int foo(const data *d) {
  int result = 0;
  for (int i = 0; i < 1000; i++) {
    result += d->A[i];
  }
  return result;
}

void example(void) {
  data *d = (data *)malloc(sizeof(data));
  if (d == NULL) {
    return; // allocation failed
  }
  for (int i = 0; i < 1000; i++) {
    d->A[i] = d->B[i] = 1;
  }
  int result = foo(d);
  (void)result; // the result is not used further in this example
  free(d);
}

This can be easily addressed by only passing the required array and rewriting the function body accordingly:

// solution.c
#include <stdlib.h>

typedef struct {
  int A[1000];
  int B[1000];
} data;

// foo now receives only the array it reads
__attribute__((pure)) int foo(const int *A) {
  int result = 0;
  for (int i = 0; i < 1000; i++) {
    result += A[i];
  }
  return result;
}

void solution(void) {
  data *d = (data *)malloc(sizeof(data));
  if (d == NULL) {
    return; // allocation failed
  }
  for (int i = 0; i < 1000; i++) {
    d->A[i] = d->B[i] = 1;
  }
  int result = foo(d->A);
  (void)result; // the result is not used further in this example
  free(d);
}
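The solution above still hard-codes the array length inside `foo`. As a possible refinement (an assumption made for illustration, not part of the original check), the length can be passed as a second argument so that every input of the function appears in its signature, much as the Fortran solution queries the extent of its assumed-shape argument. The names `sum_array` and `sketch_usage` are hypothetical:

```c
// sketch.c (illustrative variant; sum_array is a hypothetical name)
#include <assert.h>
#include <stddef.h>

// Sums the first n elements of A; both inputs appear in the signature
int sum_array(const int *A, size_t n) {
  int result = 0;
  for (size_t i = 0; i < n; i++) {
    result += A[i];
  }
  return result;
}

void sketch_usage(void) {
  int values[4] = {1, 2, 3, 4};
  // 1 + 2 + 3 + 4 = 10
  assert(sum_array(values, 4) == 10);
}
```

With the struct from the example, the call would become `sum_array(d->A, 1000)`.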

Fortran

In the following example, a derived type containing two arrays is passed to the foo function, which only uses one of the arrays:

! example.f90
program example

  implicit none

  type data
    integer :: a(10)
    integer :: b(10)
  end type data

contains

  ! foo receives the whole derived type but only reads component a
  pure function foo(d) result(total)
    implicit none
    type(data), intent(in) :: d
    integer :: total
    integer :: i

    total = 0
    do i = 1, 10
      total = total + d%a(i)
    end do
  end function foo

  pure subroutine bar()
    implicit none
    type(data) :: d
    integer :: i, total

    do i = 1, 10
      d%a(i) = 1
      d%b(i) = 1
    end do

    total = foo(d)
  end subroutine bar

end program example

This can be easily addressed by only passing the required array and rewriting the procedure body accordingly:

! solution.f90
program solution

  implicit none

  type data
    integer :: a(10)
    integer :: b(10)
  end type data

contains

  ! foo now receives only the array it reads
  pure function foo(a) result(total)
    implicit none
    integer, intent(in) :: a(:)
    integer :: total
    integer :: i

    total = 0
    do i = 1, size(a, 1)
      total = total + a(i)
    end do
  end function foo

  pure subroutine bar()
    implicit none
    type(data) :: d
    integer :: i, total

    do i = 1, 10
      d%a(i) = 1
      d%b(i) = 1
    end do

    total = foo(d%a)
  end subroutine bar

end program solution

References