Tuning Performance
This section describes several programming guidelines that can help you improve the performance of floating-point applications, including:
- Handling Floating-point Array Operations in a Loop Body
- Reducing the Impact of Subnormal Exceptions
- Avoiding Mixed Data Type Arithmetic Expressions
- Using Efficient Data Types
Floating-Point Array Operations in a Loop Body
Following the guidelines below improves the chances that the compiler auto-vectorizes a loop.
- Statements within the loop body may contain float or double operations (typically on arrays). The following arithmetic operations are supported: addition, subtraction, multiplication, division, negation, square root, MAX, MIN, and mathematical functions such as SIN and COS. Note that if fp-model is set to precise or strict, leaving math errno enabled decreases the chances that a loop will be vectorized.
- Writing to a single-precision scalar/array and a double scalar/array within the same loop decreases the chance of auto-vectorization due to the differences in the vector length (that is, the number of elements in the vector register) between float and double types. If auto-vectorization fails, try to avoid using mixed data types.
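The two guidelines above can be sketched as a loop that uses one floating-point type throughout and only operations from the supported list. This is a minimal illustration, not a guarantee: the program name, array names, and array size are hypothetical, and whether the loop is actually vectorized depends on the compiler, its options, and the target.

```fortran
program vector_loop
  implicit none
  integer, parameter :: n = 1024          ! hypothetical array length
  real :: a(n), b(n), c(n)                ! every array is single precision
  integer :: i

  ! Fill the inputs with sample data.
  do i = 1, n
    a(i) = real(i)
    b(i) = 2.0 * real(i)
  end do

  ! Candidate loop: single-precision operands only, using addition,
  ! multiplication, and square root from the supported operations.
  do i = 1, n
    c(i) = sqrt(a(i) * a(i) + b(i) * b(i))
  end do

  print *, c(1), c(n)
end program vector_loop
```

Declaring one of the arrays as DOUBLE PRECISION and writing to it in the same loop would introduce the mixed vector lengths that the second guideline warns about.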
Reduce the Impact of Subnormal Exceptions
Subnormal floating-point values are those that are too small to be represented in the normal manner; that is, the mantissa cannot be left-justified. Subnormal values require hardware or operating system intervention to handle the computation, so floating-point computations that produce subnormal values can degrade performance.
There are several ways to handle subnormals to increase the performance of your application:
- Scale the values into the normalized range
- Use a higher precision data type with a larger range
- Flush subnormals to zero
For example, you can translate subnormal values to normalized numbers by multiplying them by a large scalar, doing the remaining computations in the normalized range, and then scaling the result back down to the subnormal range. Consider using this method when the small subnormal values benefit the program design.
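The scaling approach might look like the following sketch. The scale factor, array names, and sample inputs are assumptions chosen for illustration; a power of two is used here because multiplying or dividing by it changes only the exponent as long as the result stays in the normalized range.

```fortran
program scale_subnormals
  implicit none
  integer, parameter :: n = 4
  real, parameter :: scale = 2.0**24      ! hypothetical power-of-two scale factor
  real :: x(n), y(n), total
  integer :: i

  ! Sample inputs below TINY(1.0), the smallest normalized single-precision
  ! value, so they lie in the subnormal range.
  x = tiny(1.0) * [0.25, 0.5, 0.125, 0.0625]

  ! Scale up so the working values are normalized.
  y = x * scale

  ! Do the remaining computation (here, a sum) on normalized values.
  total = 0.0
  do i = 1, n
    total = total + y(i)
  end do

  ! Scale the result back down to the original range.
  total = total / scale
  print *, total
end program scale_subnormals
```

Only the first multiplication and the final division touch subnormal values; the loop itself operates entirely on normalized numbers.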
If you use a higher precision data type and change the type declaration of a variable, you might also need to change associated library calls, unless these are generic. You should verify that the gain in performance from eliminating subnormals is greater than the overhead of using a data type with higher precision and greater dynamic range.
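As a rough sketch of the higher-precision approach, the same magnitude can be subnormal in single precision yet normalized in double precision. The variable names and the sample value are hypothetical; the program uses the generic SQRT intrinsic to show a call that does not need to change when the declaration changes.

```fortran
program widen_type
  implicit none
  real :: xs
  double precision :: xd

  ! A magnitude 100 times smaller than the smallest normalized
  ! single-precision value.
  xs = tiny(1.0) / 100.0
  xd = dble(tiny(1.0)) / 100.0d0

  ! Compare each value against the smallest normalized value of its own kind.
  print *, 'single below normalized range: ', abs(xs) < tiny(xs)
  print *, 'double below normalized range: ', abs(xd) < tiny(xd)

  ! SQRT is generic, so the same call works for either kind.
  print *, sqrt(xd)
end program widen_type
```

A kind-specific routine called on the variable would have to be replaced with its double-precision counterpart, which is the extra work the paragraph above refers to.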
In many cases, subnormal numbers can be treated safely as zero without adverse effects on program results. Depending on the target architecture, use flush-to-zero (FTZ) options.
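To make the effect of flushing concrete, the sketch below replaces subnormal values with zero in software. This is only an illustration of the resulting values, not the compiler's FTZ mechanism and not a substitute for it; the module and function names are hypothetical.

```fortran
module flush_mod
  implicit none
contains
  ! Return zero for any value whose magnitude is below the smallest
  ! normalized value of its kind; return other values unchanged.
  elemental function flush_to_zero(x) result(y)
    real, intent(in) :: x
    real :: y
    if (abs(x) < tiny(x)) then
      y = 0.0
    else
      y = x
    end if
  end function flush_to_zero
end module flush_mod

program flush_demo
  use flush_mod
  implicit none
  real :: v(3)

  ! One normalized value and two values in the subnormal range.
  v = [1.0, tiny(1.0) / 8.0, -tiny(1.0) / 2.0]
  v = flush_to_zero(v)
  print *, v
end program flush_demo
```

If a program's results are unchanged after such a replacement, that suggests the flush-to-zero options can be used safely for it.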
Avoid Mixed Data Type Arithmetic Expressions
Avoid mixing integer and floating-point (REAL) data in the same computation. Expressing all numbers in a floating-point arithmetic expression (assignment statement) as floating-point values eliminates the need to convert data between fixed and floating-point formats. Expressing all numbers in an integer arithmetic expression as integer values also achieves this. This improves runtime performance.
For example, assuming that I and J are both INTEGER variables, expressing a constant number (2.0) as an integer value (2) eliminates the need to convert the data. The following examples demonstrate inefficient and efficient code.
Inefficient code:
INTEGER I, J
I = J / 2.0
Efficient code:
INTEGER I, J
I = J / 2
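The same guideline applies in both directions, as the following sketch suggests. The variable names are hypothetical, and the explicit REAL() call is one possible way to make a conversion visible when mixing types cannot be avoided.

```fortran
program consistent_types
  implicit none
  integer :: j, half
  real :: x, y

  ! Integer expression: integer variable and integer constant.
  j = 7
  half = j / 2

  ! Floating-point expression: real variable and real constant.
  x = 3.0
  y = x / 2.0

  ! When an integer must enter a floating-point expression, an explicit
  ! REAL() keeps the single conversion visible in the source.
  y = y + real(j)

  print *, half, y
end program consistent_types
```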
Use Efficient Data Types
In cases where more than one data type can be used for a variable, consider selecting the data types based on the following hierarchy, listed from most to least efficient:
- Integer
- Single-precision real, expressed explicitly as REAL, REAL (KIND=4), or REAL*4
- Double-precision real, expressed explicitly as DOUBLE PRECISION, REAL (KIND=8), or REAL*8
- Extended-precision real, expressed explicitly as REAL (KIND=16) or REAL*16
In an arithmetic expression, you should avoid mixing integer and floating-point data.
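To see what the types in the hierarchy above provide on a given compiler, a small inquiry program such as the following sketch can help. The program and variable names are hypothetical, and the request for 30 decimal digits is an assumed stand-in for extended precision; kind numbers and availability vary between compilers.

```fortran
program show_kinds
  implicit none
  integer :: i
  real :: s
  double precision :: d
  ! SELECTED_REAL_KIND returns a negative value if no kind offers
  ! the requested 30 decimal digits.
  integer, parameter :: qk = selected_real_kind(30)

  print *, 'integer kind:', kind(i), ' decimal range:', range(i)
  print *, 'single  kind:', kind(s), ' decimal digits:', precision(s)
  print *, 'double  kind:', kind(d), ' decimal digits:', precision(d)

  if (qk > 0) then
    print *, 'extended kind available:', qk
  else
    print *, 'no real kind with 30 decimal digits on this compiler'
  end if
end program show_kinds
```

Comparing the reported precision with what the computation actually needs is one way to decide how far down the hierarchy a variable has to go.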