Performance and Large Program Considerations
IPO-Related Performance Issues
There are some general optimization guidelines for using IPO that you should keep in mind:
Using IPO on very large programs might trigger internal limits of other compiler optimization phases.
Applications where the compiler does not have sufficient intermediate representation (IR) coverage to do whole program analysis might not perform as well as those where IR information is complete.
In addition to these general guidelines, there are some practices to avoid while using IPO. The following list summarizes the activities to avoid:
Do not run the link phase of an IPO compilation on mock object files produced for your application by a different compiler. Intel® compilers cannot inspect mock object files generated by other compilers for optimization opportunities.
Do not call the default system linker from makefiles or scripts when using IPO. Update makefiles to call the appropriate Intel linker instead: on Linux and macOS, replace all instances of ld with xild; on Windows, replace all instances of link with xilink.
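Before switching linkers, it can help to locate every place a makefile or build script calls the system linker directly. The following is a minimal sketch for Linux and macOS, not part of the Intel tooling: find_ld_calls is a hypothetical helper name, and the pattern only reports ld when it appears as a standalone word, so names such as xild or build are skipped.

```shell
# Hypothetical helper (not an Intel tool): list lines of a makefile or
# script that invoke "ld" as a standalone word.
# Usage: find_ld_calls <file>
find_ld_calls() {
    # -n prints line numbers; -E enables extended regular expressions.
    # The bracket expressions reject letters, digits, "_", "." and "-"
    # on either side, so "xild", "build" and "a.ld" do not match.
    grep -nE '(^|[^[:alnum:]_.-])ld([^[:alnum:]_.-]|$)' "$1"
}
```

Running find_ld_calls Makefile prints each matching line with its line number; those are the candidate lines to change to xild. The helper prints nothing and returns a non-zero status when no direct ld call is found.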
IPO for Large Programs
In most cases, IPO generates a single true object file for the link-time compilation. This behavior is not optimal for very large programs, and might even make it impossible to use the [Q]ipo compiler option on the application.
The compiler provides two methods to avoid this problem. The first method is an automatic size-based heuristic, which causes the compiler to generate multiple true object files for large link-time compilations. The second method is to manually instruct the compiler to perform multi-object IPO, using one of the following options:
Use the [Q]ipoN compiler option and pass an integer value in the place of N.
Use the [Q]ipo-separate compiler option.
The number of true object files generated by the link-time compilation is invisible to you unless the [Q]ipo-c or [Q]ipo-S compiler option is used.
Regardless of the method used, start with the compiler defaults and examine the results. If the defaults do not provide the desired results, experiment with generating a different number of object files.
You can use the [Q]ipo-jobs compiler option to control the number of processes, or jobs, executed during parallel IPO builds.
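As a rough illustration of a parallel IPO build on Linux or macOS, the sketch below assembles a command line with a job count and runs it only when the compiler and sources are actually present. It assumes the Linux and macOS spelling -ipo-jobsN (the Windows form is assumed to be /Qipo-jobs:N); the job count of 4, the source file names, and the output name app are placeholders, so check the option reference for your compiler version.

```shell
# Sketch only: JOBS, the source names, and "app" are placeholders.
JOBS=4

# Assumed spelling: -ipo-jobsN on Linux and macOS.
IPO_CMD="ifort -ipo -ipo-jobs${JOBS} a.f90 b.f90 c.f90 -o app"

# Show the command that would be run.
echo "$IPO_CMD"

# Run it only if the compiler is on PATH and the sources exist.
if command -v ifort >/dev/null 2>&1 && [ -f a.f90 ] && [ -f b.f90 ] && [ -f c.f90 ]; then
    $IPO_CMD
fi
```

As with the number of object files, it is reasonable to try the default first and raise the job count only if link-time compilation is the bottleneck.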
Use [Q]ipoN to Create Multiple Object Files
If you specify [Q]ipo0, which is the same as not specifying a value, the compiler uses heuristics to determine whether to create one or more object files based on the expected size of the application. The compiler generates one object file for small applications, and two or more object files for large applications. If you specify any value greater than 0, the compiler generates that number of object files, unless the value you pass exceeds the number of source files. In that case, the compiler creates one object file for each source file, then stops generating object files. The generated object files follow OS-specific naming conventions.
The following example commands demonstrate how to use the [Q]ipo2 option to compile large programs.
Linux and macOS
ifort -ipo2 -c a.f90 b.f90
The resulting object files are ipo_out.o, ipo_out1.o, and ipo_out2.o.
Windows
ifort /Qipo2 /c a.f90 b.f90
The resulting object files are ipo_out.obj, ipo_out1.obj, and ipo_out2.obj.
Link the resulting object files as shown in Use Interprocedural Optimization or Linking Tools and Options.
Create the Maximum Number of Object Files
The [Q]ipo-separate option forces the compiler to generate the maximum number of true object files that it supports during multiple-object compilation. The maximum number of true object files is equal to the number of mock object files passed on the link line.
For example, you can use commands similar to the following:
Linux and macOS
ifort a.o b.o c.o -ipo-separate -ipo-c
Windows
ifort a.obj b.obj c.obj /Qipo-separate /Qipo-c
The compiler generates multiple object files that use the same naming convention discussed above.
Link the resulting object files as shown in Using IPO or Linking Tools and Options.
Code Layout and Multi-Object IPO
One of the optimizations performed during an IPO compilation is code layout. The analysis performed by the compiler during multi-file IPO determines a layout order for all of the routines for which it has intermediate representation (IR) information. For a multi-object IPO compilation, the compiler must tell the linker about the desired order.
The compiler first puts each routine in a named text section that varies depending on the operating system:
Linux
The first routine is placed in .text00001, the second is placed in .text00002, and so on.
Windows
The first routine is placed in .text$00001, the second is placed in .text$00002, and so on.
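The naming pattern can be reproduced with a short shell sketch. It assumes only what the examples above show, a five-digit zero-padded counter appended to .text on Linux and to .text$ on Windows; the variable names are illustrative, and the three routines are an arbitrary sample.

```shell
# Print the section names for the first three routines in layout order.
# printf reuses the format string for each argument.
linux_sections=$(printf '.text%05d\n' 1 2 3)
windows_sections=$(printf '.text$%05d\n' 1 2 3)

echo "$linux_sections"
echo "$windows_sections"

# Because the counter is zero-padded, plain string sorting keeps the
# names in numeric order; sort -c succeeds only if input is sorted.
if printf '%s\n' "$linux_sections" | LC_ALL=C sort -c; then
    echo "string order matches layout order"
fi
```

A side effect of the zero padding is that sorting the names as plain strings preserves the intended numeric order, which would not hold for unpadded names such as .text2 and .text10.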