Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference

ID 767253
Date 3/22/2024
Public

High-Level Optimization

High-level optimizations (HLO) exploit the properties of source code constructs (for example, loops and arrays) in applications developed in high-level programming languages. The default optimization level, option O2, performs some high-level optimizations, but specifying the O3 option gives the compiler the best opportunity to apply loop transformations that optimize memory accesses.
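As an illustration of what "loop transformations to optimize memory accesses" means, the sketch below applies a loop interchange by hand to a simple scaling loop over a row-major matrix. The function names `scale_column_order` and `scale_row_order` are hypothetical and exist only for this example. This is a hand-written equivalent of the transformation, not the compiler's actual output, and whether the compiler performs the interchange on a given loop nest depends on its own analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Doubles every element of a row-major rows x cols matrix.
// The column index is in the outer loop, so consecutive inner
// iterations touch elements that are `cols` apart (strided access).
std::vector<double> scale_column_order(const std::vector<double>& a,
                                       std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    std::vector<double> b(a.size());
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            b[i * cols + j] = 2.0 * a[i * cols + j];
    return b;
}

// Same computation after a hand-applied loop interchange.
// The inner loop now walks contiguous memory (unit stride).
// The iterations are independent, so swapping the loops is legal.
std::vector<double> scale_row_order(const std::vector<double>& a,
                                    std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    std::vector<double> b(a.size());
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            b[i * cols + j] = 2.0 * a[i * cols + j];
    return b;
}
```

Both functions produce identical results; only the order in which memory is visited differs.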

NOTE:

Loop optimizations may result in calls to library routines that can yield a greater performance gain on Intel® microprocessors than on non-Intel microprocessors. More HLO transformations may be performed for Intel® microprocessors than for non-Intel microprocessors.

Within HLO, loop transformation techniques include:

  • Loop Permutation or Interchange

  • Loop Distribution

  • Loop Fusion

  • Loop Unrolling

  • Data Prefetching

  • Scalar Replacement

  • Unroll and Jam

  • Loop Blocking or Tiling

  • Partial-Sum Optimization

  • Predicate Optimization

  • Loop Reversal

  • Profile-Guided Loop Unrolling

  • Loop Peeling

  • Data Transformation: Malloc Combining and Memset Combining, Memory Layout Change

  • Loop Rerolling

  • Memset and Memcpy Recognition

  • Statement Sinking for Creating Perfect Loopnests

  • Multiversioning: Checks include Dependency of Memory References and Trip Counts

  • Loop Collapsing
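To make two of the techniques listed above more concrete, the following sketch shows hand-written before and after forms of loop fusion and loop blocking (tiling). All function names and the default tile size of 32 are assumptions chosen for illustration. The code shows the general shape of each transformation and is not a description of how the compiler implements it or of when it chooses to apply it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Before fusion: two passes over the same array.
void scale_then_offset_separate(std::vector<int>& x, int scale, int offset) {
    for (std::size_t i = 0; i < x.size(); ++i) x[i] *= scale;
    for (std::size_t i = 0; i < x.size(); ++i) x[i] += offset;
}

// After hand-applied loop fusion: one pass, each element is loaded once.
void scale_then_offset_fused(std::vector<int>& x, int scale, int offset) {
    for (std::size_t i = 0; i < x.size(); ++i) x[i] = x[i] * scale + offset;
}

// Plain transpose of a row-major rows x cols matrix.
// Writes to `t` are strided by `rows`.
std::vector<int> transpose_plain(const std::vector<int>& a,
                                 std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    std::vector<int> t(a.size());
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            t[j * rows + i] = a[i * cols + j];
    return t;
}

// Hand-applied loop blocking: the iteration space is split into
// tile x tile blocks so that the reads and writes of one block stay
// within a small working set. std::min handles partial edge tiles.
std::vector<int> transpose_blocked(const std::vector<int>& a,
                                   std::size_t rows, std::size_t cols,
                                   std::size_t tile = 32) {
    assert(a.size() == rows * cols && tile > 0);
    std::vector<int> t(a.size());
    for (std::size_t ii = 0; ii < rows; ii += tile) {
        for (std::size_t jj = 0; jj < cols; jj += tile) {
            const std::size_t i_end = std::min(ii + tile, rows);
            const std::size_t j_end = std::min(jj + tile, cols);
            for (std::size_t i = ii; i < i_end; ++i)
                for (std::size_t j = jj; j < j_end; ++j)
                    t[j * rows + i] = a[i * cols + j];
        }
    }
    return t;
}
```

In each pair the two versions compute the same result. The transformed version changes only how often, and in what order, memory is touched.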