PWR048: Replace multiplication/addition combo with an explicit call to fused multiply-add
Issue
A multiplication/addition combination can result in the compiler emitting two operations, a multiplication and an addition, instead of a single fused multiply-add (FMA) operation.
Actions
Replace a combination of multiplication and addition, a + b * c, with a call to the fma function.
Relevance
Modern hardware often provides a fused multiply-add (FMA) instruction that performs a multiplication and an addition in a single instruction. When targeting an ISA that supports FMA, compilers can fuse a multiplication and the addition that consumes its result into a single FMA operation.
Most compilers do this automatically when the proper optimization flags are provided. However, if the compiler is configured for strict IEEE 754 compliance, FMA instructions are not emitted automatically. In that case, the developer can request an FMA explicitly through the fma function declared in math.h (or std::fma declared in cmath).
Code example
Have a look at the following code:
__attribute__((const)) double example(double a, double b, double c) {
    return a + b * c;
}
In the above example, the expression a + b * c is effectively an FMA operation, and it can be replaced with a call to fma:
#include <math.h>
__attribute__((const)) double example(double a, double b, double c) {
    return fma(b, c, a);
}
Replacing a + b * c with a call to fma makes sense under the following conditions:
- The compiler is configured with strict IEEE 754 compliance (-ffp-contract=off or -ffp-contract=on on GCC and clang);
- and the underlying ISA supports FMA and the compiler is allowed to use FMA instructions, either through -mfma or through -march=ARCH, where ARCH supports FMA instructions.
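When the second condition cannot be guaranteed at build time, one possible approach is to guard the explicit call with the standard FP_FAST_FMA macro from math.h, which an implementation defines when fma is expected to be about as fast as a separate multiply and add. The sketch below is a hedged illustration: the helper name mul_add is hypothetical, and whether FP_FAST_FMA tracks flags such as -mfma on a given toolchain is an assumption to verify.

```c
#include <math.h>

/* Computes a + b * c. Calls fma only when the implementation
   advertises, through FP_FAST_FMA, that fma is generally fast;
   otherwise leaves the expression to the compiler. */
double mul_add(double a, double b, double c) {
#ifdef FP_FAST_FMA
    return fma(b, c, a);
#else
    return a + b * c;
#endif
}
```

Because the two branches round differently, results may differ in the last bits between builds, so this guard fits best where such variation is acceptable.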