13.7. Intel® HLS Compiler Pro Edition Loop Pragmas

Intel® High Level Synthesis Compiler Pro Edition: Reference Manual

ID 683349

Date 10/02/2023

Use the Intel® HLS Compiler loop pragmas to control how the compiler pipelines the loops in your component.

Table 43. Intel® HLS Compiler Pro Edition Loop Pragmas Summary
Pragma	Description
`disable_loop_pipelining`	Prevents the compiler from pipelining a loop.
`ii`	Forces a loop to have a loop initiation interval (II) of a specified value.
`ivdep`	Ignores memory dependencies between iterations of this loop.
`loop_coalesce`	Tries to fuse all loops nested within this loop into a single loop.
`loop_fuse`	Directs the compiler to try and fuse pairs of adjacent loops.
`max_concurrency`	Limits the number of iterations of a loop that can simultaneously execute at any time.
`max_interleaving`	Controls whether iterations of a pipelined inner loop in a loop nest from one invocation of the inner loop can be interleaved in the component data pipeline with iterations from other invocations of the inner loop.
`nofusion`	Prevents the annotated loop from being fused with adjacent loops.
`speculated_iterations`	Specifies the number of clock cycles that a loop exit condition can take to compute.
`unroll`	Unrolls the loop completely or by a number of times.

`disable_loop_pipelining` Loop Pragma

Syntax

#pragma disable_loop_pipelining

Description

Tells the compiler to not pipeline this loop.

Disable loop pipelining for a loop when the loop-carried dependencies cause the loop iterations to effectively execute sequentially. With loop pipelining disabled, the Intel® HLS Compiler can generate a simpler datapath and reduce the FPGA area utilization of your component.

Example:

#pragma disable_loop_pipelining
for (int i = 1; i < N; i++) {
    int j = a[i-1];
    // Memory dependency induces a high-latency loop feedback path
    a[i] = foo(j);
}

`ii` Loop Pragma

Syntax

#pragma ii N

Description

Forces the loop to which you apply this pragma to have a loop initiation interval (II) of <N>, where <N> is a positive integer value.

Forcing a loop II value can have an adverse effect on the f_MAX of your component because using this pragma to get a lower loop II combines pipeline stages together and creates logic with a long propagation delay.

Using this pragma with a larger loop II inserts more pipeline stages and can give you a better component f_MAX value.

Example:

#pragma ii 2
for (int i = 0; i < 8; i++) {
 // Loop body
}

`ivdep` Loop Pragma

Syntax

#pragma ivdep safelen(N) array(array_name)

Description

Tells the compiler to ignore memory dependencies between iterations of this loop.

It can accept an optional argument that specifies the name of the array. If array is not specified, all component memory dependencies are ignored. If there are loop-carried dependencies, your generated RTL produces incorrect results.

The safelen parameter specifies the dependency distance. The dependency distance is the number of iterations between successive loads and stores that depend on each other. It is safe to omit safelen only when the dependency distance is infinite (that is, there are no real dependencies).

Example:

#pragma ivdep safelen(2)
for (int i = 0; i < 8; i++) {
 // Loop body
}

To learn more, review the tutorial: <quartus_installdir>/hls/examples/tutorials/best_practices/loop_memory_dependency.
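As a rough illustration of dependency distance, the following sketch uses a loop in which each iteration reads the element written two iterations earlier, so the distance is 2 and safelen(2) matches it. The function name `stride_two_accumulate` and the constant `kSize` are hypothetical, and the function is plain C++ rather than a complete component. A standard C++ compiler ignores the pragma (possibly with a warning), so the code only demonstrates the access pattern, not the generated hardware.

```cpp
#include <cstddef>

constexpr std::size_t kSize = 16;  // hypothetical array size

// Each iteration reads the value stored two iterations earlier,
// so the loop-carried dependency distance is 2.
void stride_two_accumulate(int (&a)[kSize]) {
  #pragma ivdep safelen(2)
  for (std::size_t i = 2; i < kSize; i++) {
    a[i] = a[i - 2] + 1;
  }
}
```

Omitting safelen here would assert that no dependency exists at all, which is not true for this loop.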

`loop_coalesce` Loop Pragma

Syntax

#pragma loop_coalesce N

Description

Tells the compiler to try to fuse all loops nested within this loop into a single loop. This pragma accepts an optional value N which indicates the number of levels of loops to coalesce together.

Example:

#pragma loop_coalesce 2
for (int i = 0; i < 8; i++) {
 for (int j = 0; j < 8; j++) {
 // Loop body 
 } 
}
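One way to picture coalescing is to compare the annotated loop nest with a hand-written single loop that visits the same iterations in the same order. The sketch below is a conceptual model only, not the hardware the compiler generates. The names `fill_nested`, `fill_coalesced`, `kRows`, `kCols`, and `Grid` are hypothetical, and a standard C++ compiler ignores the pragma.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kRows = 8;  // hypothetical loop bounds
constexpr std::size_t kCols = 8;
using Grid = std::array<int, kRows * kCols>;

// Nested form, annotated in the same way as the manual's example.
Grid fill_nested() {
  Grid out{};
  #pragma loop_coalesce 2
  for (std::size_t i = 0; i < kRows; i++) {
    for (std::size_t j = 0; j < kCols; j++) {
      out[i * kCols + j] = static_cast<int>(i * 10 + j);
    }
  }
  return out;
}

// Hand-written single loop that walks the same (i, j) pairs in order.
Grid fill_coalesced() {
  Grid out{};
  std::size_t i = 0;
  std::size_t j = 0;
  for (std::size_t k = 0; k < kRows * kCols; k++) {
    out[i * kCols + j] = static_cast<int>(i * 10 + j);
    // Advance the inner index and wrap into the outer index.
    if (++j == kCols) {
      j = 0;
      ++i;
    }
  }
  return out;
}
```

Both functions produce the same array, which is the property that makes coalescing a legal transformation for this nest.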

`loop_fuse` Block-Scope Loop Pragma

Syntax

#pragma loop_fuse [depth(N)] [independent]

Description

Apply this pragma to a block of code to indicate to the compiler that adjacent loops in the code block should be fused when safe, overriding the compiler profitability analysis of the fusion.

The depth(N) clause sets the number of nesting depths the compiler should consider when fusing adjacent loops. Specifying depth(1) is equivalent to indicating that only adjacent top-level loops should be considered for fusing.

The independent clause overrides the safety checks. If you specify the independent option, you are guaranteeing to the compiler that fusing pairs of loops affected by the loop_fuse pragma is safe. If it is not safe, you might get functional errors in your component.

For details of the safety checks, see the Fusion Criteria section of Loop Fusion.

Example:

#pragma loop_fuse
{
 for (int j=0; j < N; ++j){
   data[j] += Q;
 }
 for (int i = 0; i < N; ++i){
   output[i] = Q * data[i];
 }
 }
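To make the effect of fusion concrete, the sketch below places the two adjacent loops next to a hand-fused version and shows that both compute the same result. It is an illustration under assumptions: `scale_two_loops`, `scale_fused`, `kN`, `kQ`, and `Vec` are hypothetical names, and a standard C++ compiler ignores the pragma, so only the functional equivalence is demonstrated.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kN = 8;  // hypothetical trip count
constexpr int kQ = 3;          // hypothetical constant
using Vec = std::array<int, kN>;

// Two adjacent loops inside a block annotated for fusion.
Vec scale_two_loops(Vec data) {
  Vec output{};
  #pragma loop_fuse
  {
    for (std::size_t j = 0; j < kN; ++j) {
      data[j] += kQ;
    }
    for (std::size_t i = 0; i < kN; ++i) {
      output[i] = kQ * data[i];
    }
  }
  return output;
}

// Hand-fused equivalent: one loop performs both bodies per iteration.
Vec scale_fused(Vec data) {
  Vec output{};
  for (std::size_t i = 0; i < kN; ++i) {
    data[i] += kQ;
    output[i] = kQ * data[i];
  }
  return output;
}
```

Fusion is safe in this case because iteration i of the second loop reads only data[i], which iteration i of the first loop has already updated.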

`max_concurrency` Loop Pragma

Syntax

#pragma max_concurrency N

Description

This pragma limits the number of iterations of a loop that can simultaneously execute at any time.

This pragma is useful mainly when private copies of a component memory are created to improve the throughput of the loop. The creation of private copies is reported in the details pane for the loop in the Loop Analysis pane and in the Bank view of the Function Memory Viewer of the high-level design report (report.html).

This can occur only when the scope of a component memory (through its declaration or access pattern) is limited to this loop. Adding this pragma can be used to reduce the area that the loop consumes at the cost of some throughput.

Example:

// Without this pragma, the compiler might create
// multiple private copies of the array "arr"
#pragma max_concurrency 1
for (int i = 0; i < 8; i++) {
 int arr[1024];
 // Loop body
}

`max_interleaving` Loop Pragma

Syntax

#pragma max_interleaving <option>

Description

This pragma controls whether iterations of a pipelined inner loop in a loop nest from one invocation of the inner loop can be interleaved in the component data pipeline with iterations from other invocations of the inner loop.

By default, the Intel® HLS Compiler tries to interleave a number of simultaneous invocations of the inner loop equal to the loop initiation interval (II) of the inner loop. For example, an inner loop with an II of 2 can have iterations from two invocations in the pipeline at a time.

In cases where the interleaving of loop iterations from different loop invocations does not yield a performance benefit, limiting or restricting the amount of interleaving can result in reduced FPGA area utilization.

Supported values for <option>:

1
The compiler restricts the annotated (inner) loop to be invoked only once per outer loop iteration. That is, all iterations of the inner loop travel the pipeline before the next invocation of the inner loop can occur.
0
Use the default interleaving behavior.

Example:

// Loop j is pipelined with ii=1
for (int j = 0; j < M; j++) {
  int a[N];
  // Loop i is pipelined with ii=2 
  #pragma max_interleaving 1
  for (int i = 1; i < N; i++) {
      a[i] = foo(i);
  }
  …
}

`nofusion` Loop Pragma

Syntax

#pragma nofusion

Description

This pragma directs the compiler to not fuse the annotated loop with any adjacent loops.

Example:

#pragma nofusion
L1: for (int j=0; j < N; ++j){
 data[j] += Q;
}
L2: for (int i = 0; i < N; ++i) {
 output[i] = Q * data[i];
}

`speculated_iterations` Loop Pragma

Syntax

#pragma speculated_iterations N

Description

This pragma specifies the number of loop iterations to wait before considering a loop exit condition. That is, you estimate that a loop takes at least N loop iterations before the exit condition is met.

If you specify a value that is too low, then the loop II increases to accommodate the iterations required to determine whether the loop exit condition is met.

Example:

component int loop_speculate (int N) {
    int m = 0;
    // The exit condition needs 2 multiplies and a compare,
    // making it the most critical part of the loop feedback path
    #pragma speculated_iterations 2
    while (m*m*m < N) {
      m += 1;
    }
    return m;
}

`unroll` Loop Pragma

Syntax

#pragma unroll N

Description

This pragma unrolls the loop completely or by <N> times, where <N> is optional and is a positive integer value.

Important: Unrolling nested loops with large bounds might generate a large number of instructions that could result in very long compile times for your component.

Example:

#pragma unroll 8
for (int i = 0; i < 8; i++) {
 // Loop body
}

To learn more, review the tutorial: <quartus_installdir>/hls/examples/best_practices/resource_sharing_filter.
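Partial unrolling can be pictured by writing out the replicated loop body by hand. The sketch below compares a loop annotated with an unroll factor of 4 against a manually unrolled version that does four additions per iteration. It is a conceptual model, not the compiler's output, and `sum_rolled`, `sum_unrolled_by_4`, `kLen`, and `Vec8` are hypothetical names. A standard C++ compiler may ignore the pragma or treat it only as an optimization hint.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLen = 8;  // hypothetical trip count, a multiple of 4
using Vec8 = std::array<int, kLen>;

// Rolled loop with a requested unroll factor of 4.
int sum_rolled(const Vec8& v) {
  int sum = 0;
  #pragma unroll 4
  for (std::size_t i = 0; i < kLen; i++) {
    sum += v[i];
  }
  return sum;
}

// Manual equivalent: the body is replicated 4 times and the
// loop runs kLen / 4 iterations.
int sum_unrolled_by_4(const Vec8& v) {
  int sum = 0;
  for (std::size_t i = 0; i < kLen; i += 4) {
    sum += v[i];
    sum += v[i + 1];
    sum += v[i + 2];
    sum += v[i + 3];
  }
  return sum;
}
```

The manual version works without a remainder loop only because the trip count is a multiple of the unroll factor.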
