Developer Guide and Reference

ID 767253
Date 10/31/2024
Public
Document Table of Contents

Inline Expansion of Functions

Inline function expansion does not require that the applications meet the criteria for whole program analysis normally required by IPO; so this optimization is one of the most important optimizations done in Interprocedural Optimization (IPO). For function calls that the compiler believes are frequently executed, the compiler often decides to replace the instructions of the call with code for the function itself.

In the compiler, inline function expansion is performed on relatively small user functions more often than on functions that are relatively large. This optimization improves application performance by performing the following:

  • Removing the need to set up parameters for a function call

  • Eliminating the function call branch

  • Propagating constants

Function inlining can improve execution time by removing the runtime overhead of function calls; however, function inlining can increase code size, code complexity, and compile times. In general, when you instruct the compiler to perform function inlining, the compiler can examine the source code in a much larger context, and the compiler can find more opportunities to apply optimizations.

CAUTION:

Using the Windows* options can, in some cases, significantly increase compile time and code size.

Select Routines for Inlining

The compiler attempts to select the routines whose inline expansions provide the greatest benefit to program performance. The selection is done using default heuristics.

When you use PGO with [Q]ipo, the compiler uses the following guidelines for applying heuristics:

  • The default heuristic focuses on the most frequently executed call sites, based on the profile information gathered for the program.

  • The default heuristic always inlines very small functions that meet the minimum inline criteria.

Use IPO with PGO

Combining IPO and PGO typically produces better results than using IPO alone. PGO produces dynamic profiling information that can usually provide better optimization opportunities than the static profiling information used in IPO.

The compiler uses characteristics of the source code to estimate which function calls are executed most frequently. It applies these estimates to the PGO-based guidelines described above. The estimation of frequency, based on static characteristics of the source, is not always accurate.

Inline Expansion of Library Functions

By default, the compiler automatically inlines (expands) a number of standard and math library functions at the point of the call to that function, which usually results in faster computation.

Many routines in the libirc, libm, or the svml library are more highly optimized for Intel microprocessors than for non-Intel microprocessors.

The -fno-builtin (Linux*) or the /Qno-builtin-<name> and /Oi- (Windows*) options disable inlining for intrinsic functions and disable the by-name recognition support of intrinsic functions and the resulting optimizations. The /Qno-builtin-<name> option provides the ability to disable inlining for intrinsic functions, fine-tuning the functionality of the /Oi- option, which disables almost all intrinsic functions when used. Use these options if you redefine standard library routines with your own version and your version of the routine has the same name as the standard library routine.

Inlining and Function Preemption on Linux

You must specify fpic to use function preemption. By default the compiler does not generate the position-independent code needed for preemption.

Developer-Directed Inline Expansion of User Functions

In addition to the options that support compiler directed inline expansion of user functions, the compiler also provides compiler options and pragmas that allow you to more precisely direct when and if inline function expansion should occur.

The compiler measures the relative size of a routine in an abstract value of intermediate language units, which is approximately equivalent to the number of instructions that will be generated. The compiler uses the intermediate language unit estimates to classify routines and functions as relatively small, medium, or large functions. The compiler then uses the estimates to determine when to inline a function; if the minimum criteria for inlining is met and all other things are equal, the compiler has an affinity for inlining relatively small functions and not inlining relative large functions.

Typically, the compiler targets functions that have been marked for inlining based on the following:

  • Inlining keywords: Tells the compiler to inline the specified function. For example, __inline, __forceinline.
  • Procedure-specific inlining pragmas: Tells the compiler to inline calls within the targeted procedure if it is legal to do so. For example,#pragma inline or #pragma forceinline .
  • GCC function attributes for inlining: Tells the compiler to inline the function even when no optimization level is specified. For example, __attribute__((always_inline)).

See Also