Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference

ID 767253
Date 11/07/2023
Public

unroll_and_jam/nounroll_and_jam

Enables or disables loop unrolling and jamming. These pragmas can only be applied to iterative for loops.

Syntax

#pragma unroll_and_jam

#pragma unroll_and_jam (n)

#pragma nounroll_and_jam

Arguments

n

The unrolling factor, which is the number of times to unroll the loop. It must be an integer constant from 0 through 255.

Description

The unroll_and_jam pragma partially unrolls one or more loops higher in the nest than the innermost loop and fuses (jams) the resulting loops back together. This transformation creates more opportunities for data reuse in the loop.
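The transformation can be illustrated by applying it by hand. The following is a minimal sketch, not the compiler's actual output: the function names, the size N, and the factor of 2 are all hypothetical. The outer loop is unrolled by 2, and the two resulting copies of the inner loop are jammed into one, so each c[j] value is loaded once and reused for two rows.

```c
#include <string.h>

#define N 8  /* hypothetical size; even, so a factor of 2 leaves no remainder */

/* Original nest: c[j] is read again for every (i, j) pair. */
void matvec_plain(int a[N], int b[N][N], int c[N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i] += b[i][j] * c[j];
}

/* Outer loop unrolled by 2, then the two inner-loop copies jammed into one. */
void matvec_jammed(int a[N], int b[N][N], int c[N])
{
    for (int i = 0; i < N; i += 2) {
        for (int j = 0; j < N; j++) {
            int cj = c[j];               /* one load, reused by both rows */
            a[i]     += b[i][j]     * cj;
            a[i + 1] += b[i + 1][j] * cj;
        }
    }
}

/* Returns 1 when both versions produce identical results. */
int jammed_matches_plain(void)
{
    int b[N][N], c[N], a1[N] = {0}, a2[N] = {0};
    for (int i = 0; i < N; i++) {
        c[i] = i - 3;
        for (int j = 0; j < N; j++)
            b[i][j] = i + 2 * j + 1;
    }
    matvec_plain(a1, b, c);
    matvec_jammed(a2, b, c);
    return memcmp(a1, a2, sizeof a1) == 0;
}
```

Calling jammed_matches_plain() confirms that the hand-transformed nest computes the same values as the original. Integer data is used so that the comparison is exact.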

This pragma is not effective on innermost loops. Ensure that the loop immediately following the pragma is not the innermost loop once any compiler-initiated loop interchanges have been applied.

Specifying this pragma is a hint to the compiler that the unroll and jam sequence is legal and profitable. The compiler enables this transformation whenever possible.

The unroll_and_jam pragma must precede the for statement for each for loop it affects. If n is specified, the optimizer unrolls the loop n times. If n is omitted or is outside the allowed range, the optimizer chooses the number of times to unroll the loop. The compiler compares n with the loop count to make sure the generated code remains correct.
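One common way to keep an unrolled loop correct when the loop count is not a multiple of the factor is to follow the unrolled part with a remainder loop. The sketch below shows that general technique applied by hand. It is an assumption about how such code can be structured, not a description of what this compiler emits, and the names rowdot_plain, rowdot_jammed, remainder_demo, COLS, MAX_ROWS, and UJ are hypothetical.

```c
#include <string.h>

#define COLS 5      /* hypothetical inner trip count */
#define MAX_ROWS 16 /* hypothetical upper bound used by the demo */
#define UJ 4        /* unroll-and-jam factor applied by hand */

/* Reference version without any unrolling. */
void rowdot_plain(int rows, int out[], int m[][COLS], int v[COLS])
{
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < COLS; j++)
            out[i] += m[i][j] * v[j];
}

void rowdot_jammed(int rows, int out[], int m[][COLS], int v[COLS])
{
    int i = 0;
    /* Main part: runs only while UJ full outer iterations remain. */
    for (; i + UJ <= rows; i += UJ) {
        for (int j = 0; j < COLS; j++) {
            int vj = v[j];
            out[i]     += m[i][j]     * vj;
            out[i + 1] += m[i + 1][j] * vj;
            out[i + 2] += m[i + 2][j] * vj;
            out[i + 3] += m[i + 3][j] * vj;
        }
    }
    /* Remainder: the leftover rows % UJ iterations, not unrolled. */
    for (; i < rows; i++)
        for (int j = 0; j < COLS; j++)
            out[i] += m[i][j] * v[j];
}

/* Returns 1 when both versions agree for the given row count (0..MAX_ROWS). */
int remainder_demo(int rows)
{
    int m[MAX_ROWS][COLS], v[COLS], o1[MAX_ROWS] = {0}, o2[MAX_ROWS] = {0};
    for (int j = 0; j < COLS; j++)
        v[j] = j + 1;
    for (int i = 0; i < MAX_ROWS; i++)
        for (int j = 0; j < COLS; j++)
            m[i][j] = i * COLS + j;
    rowdot_plain(rows, o1, m, v);
    rowdot_jammed(rows, o2, m, v);
    return memcmp(o1, o2, sizeof o1) == 0;
}
```

With rows set to 7, the main part handles rows 0 through 3 and the remainder loop handles rows 4 through 6. With rows set to 3, the main part does not execute at all.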

This pragma is supported only when compiler option O3 is set. The unroll_and_jam pragma overrides any setting of loop unrolling from the command line.

When unrolling a loop increases register pressure and code size, it may be necessary to prevent unrolling of a nested loop or an imperfectly nested loop. In such cases, use the nounroll_and_jam pragma. The nounroll_and_jam pragma hints to the compiler not to unroll a specified loop.
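As a minimal sketch of the nounroll_and_jam pragma on an imperfectly nested loop, consider the following. The function column_sum3 and the size M are hypothetical and chosen only for illustration.

```c
#define M 16  /* hypothetical array size */

/* Imperfect nest: the outer loop body has a statement outside the inner loop. */
void column_sum3(int dst[M][M], int src[M][M])
{
    /* Hint that the outer loop should not be unrolled and jammed. A compiler
       that does not recognize the pragma is expected to ignore it, possibly
       with a warning. */
    #pragma nounroll_and_jam
    for (int i = 1; i < M - 1; i++) {
        dst[i][0] = src[i][0];                    /* runs before the inner loop */
        for (int j = 1; j < M; j++)
            dst[i][j] = src[i - 1][j] + src[i][j] + src[i + 1][j];
    }
}
```

The pragma is placed directly before the for statement of the outer loop, which is the loop that would otherwise be the candidate for the transformation.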

The unroll_and_jam and nounroll_and_jam pragmas are supported in host code only.

Examples

Use the unroll_and_jam pragma:

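int a[10][10];
int b[10][10];
int c[10][10];
int d[10][10];

/* n must not exceed 10, the extent of each array dimension. */
void unroll(int n) {
    int i, j, k;
    /* Unroll and jam the i loop by a factor of 6. */
    #pragma unroll_and_jam (6)
    for (i = 1; i < n; i++) {
        /* Unroll and jam the j loop as well; the innermost k loop carries no pragma. */
        #pragma unroll_and_jam (6)
        for (j = 1; j < n; j++) {
            for (k = 1; k < n; k++) {
                a[i][j] += b[i][k] * c[k][j];
            }
        }
    }
}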