6.8. Loop Interleaving Control (max_interleaving Pragma)
- Terminology Reminder
- A loop iteration is the single execution of a loop body. A loop invocation is the start of pipelined execution of loop iterations.
Interleaving Example 2
// Loop j is pipelined with ii=1
for (int j = 0; j < M; j++) {
  int a[N];
  // Loop i is pipelined with ii=2
  for (int i = 1; i < N; i++) {
    a[i] = foo(i);
  }
}
In this example, the inner loop i is pipelined with II=2. Under normal pipelining, this II means that the inner loop hardware only achieves 50% utilization, since one iteration of the i loop is initiated every other cycle. To take advantage of these idle cycles, the compiler interleaves a second invocation of the i loop from the next iteration of the outer j loop.
Because the i loop resides inside the j loop, and the j loop has a trip count of M, the i loop is invoked M times. The j loop is the outermost loop and is invoked once.
The following table shows the difference between normal pipelined execution of the i loop versus interleaved execution for this example for N=5.
Cycle | Pipelined Loop Iterations (j loop, i loop) | Interleaved Loop Iterations (j loop, i loop) |
---|---|---|
0 | (0,0) | (0,0) |
1 | --- | (1,0) |
2 | (0,1) | (0,1) |
3 | --- | (1,1) |
4 | (0,2) | (0,2) |
5 | --- | (1,2) |
6 | (0,3) | (0,3) |
7 | --- | (1,3) |
8 | (0,4) | (0,4) |
9 | --- | (1,4) |
10 | (1,0) | (2,0) |
11 | --- | (3,0) |
12 | (1,1) | (2,1) |
13 | --- | (3,1) |
14 | (1,2) | (2,2) |
15 | --- | (3,2) |
16 | (1,3) | (2,3) |
17 | --- | (3,3) |
18 | (1,4) | (2,4) |
19 | --- | (3,4) |
Each table entry gives the (j,i) values of the inner loop iteration that is initiated in that cycle. At cycle 0, both modes of execution initiate iteration (0,0) of the i loop. Under normal pipelined execution, no i loop iteration is initiated at cycle 1. Under interleaved execution, cycle 1 initiates iteration (1,0), that is, the first iteration of the next (j=1) invocation of the i loop. By cycle 10, interleaved execution has initiated all of the iterations of both the j=0 and the j=1 invocations of the i loop, while normal pipelined execution has initiated only the iterations of the j=0 invocation. This represents twice the efficiency of the normal pipelined execution.
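The schedule in the table can be reproduced with a small software model. The following is a minimal sketch, not the compiler's scheduling algorithm: the `Slot` type and the `schedule` function are hypothetical names introduced for illustration, the model follows the table's numbering (i runs from 0 to N-1), and it assumes that the degree of interleaving never exceeds the II of the inner loop.

```cpp
#include <cassert>
#include <vector>

// One pipeline cycle: the (j, i) inner-loop iteration initiated in that
// cycle, or j == -1 when no iteration is initiated (an idle cycle).
struct Slot {
    int j;
    int i;
};

// Simplified model (illustrative only).
//   outer_trips : trip count of the j loop (M in the text)
//   inner_trips : iterations per i-loop invocation (5 in the table)
//   ii          : initiation interval of the i loop
//   degree      : invocations of the i loop sharing the pipeline
//                 (1 = normal pipelining, ii = full interleaving)
std::vector<Slot> schedule(int outer_trips, int inner_trips, int ii, int degree) {
    assert(ii >= 1 && degree >= 1 && degree <= ii);
    std::vector<Slot> cycles;
    // Each group holds up to `degree` consecutive invocations of the i loop.
    for (int base = 0; base < outer_trips; base += degree) {
        for (int i = 0; i < inner_trips; ++i) {
            // Every inner iteration index occupies ii cycles; the first
            // `degree` of them are filled by different j invocations.
            for (int k = 0; k < ii; ++k) {
                int j = base + k;
                if (k < degree && j < outer_trips) {
                    cycles.push_back({j, i});
                } else {
                    cycles.push_back({-1, -1});
                }
            }
        }
    }
    return cycles;
}
```

Under these assumptions, `schedule(4, 5, 2, 1)` corresponds to the pipelined column and `schedule(4, 5, 2, 2)` to the interleaved column: the vector index is the cycle number, and the interleaved schedule needs half as many cycles to initiate the same iterations.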
Sometimes you might determine that interleaving does not give you enough of a performance benefit to justify the additional FPGA area needed to enable it. In these cases, you can limit the amount of interleaving to reduce FPGA area utilization.
Using the max_interleaving Pragma
To limit the number of interleaved invocations of an inner loop that can be executed simultaneously, annotate the inner loop with the max_interleaving pragma. The annotated loop must be contained inside another pipelined loop.
The required parameter (n) specifies an upper bound on the degree of interleaving allowed. That is, how many invocations of the containing loop can execute the annotated loop at a given time.
- #pragma max_interleaving 1
The compiler restricts the annotated (inner) loop to be invoked only once per outer loop iteration. That is, all iterations of the inner loop travel the pipeline before the next invocation of the inner loop can occur.
- #pragma max_interleaving 0
The compiler allows the pipeline to contain a number of simultaneous invocations of the inner loop equal to the loop initiation interval (II) of the inner loop. For example, an inner loop with an II of 2 can have iterations from two invocations in the pipeline at a time.
This behavior is the default behavior for the compiler if you do not specify the max_interleaving pragma.
// Loop j is pipelined with ii=1
for (int j = 0; j < M; j++) {
  int a[N];
  // Loop i is pipelined with ii=2
  #pragma max_interleaving 1
  for (int i = 1; i < N; i++) {
    a[i] = foo(i);
  }
  …
}
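To get a rough feel for the throughput cost of restricting interleaving, the pragma values described above can be plugged into a back-of-the-envelope estimate. This is a sketch under stated assumptions, not a description of how the compiler computes anything: `effective_interleaving` and `initiation_cycles` are hypothetical helpers, the estimate counts only the cycles needed to initiate inner loop iterations (it ignores pipeline latency and all other outer loop work), and it assumes that a value of n larger than the II behaves like the default.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: number of inner-loop invocations that may share
// the pipeline for a given pragma argument n and inner-loop II.
//   n == 0 : default behavior, up to II simultaneous invocations
//   n >= 1 : at most n invocations (capped at II; an assumption of this model)
int effective_interleaving(int n, int ii) {
    assert(n >= 0 && ii >= 1);
    return n == 0 ? ii : std::min(n, ii);
}

// Estimated cycles to initiate every inner-loop iteration.
//   outer_trips : trip count of the containing loop (M in the text)
//   inner_trips : iterations per invocation of the annotated loop
long initiation_cycles(int outer_trips, int inner_trips, int ii, int n) {
    int degree = effective_interleaving(n, ii);
    // Invocations are processed in groups of `degree`; ceil(outer / degree).
    long groups = (outer_trips + degree - 1) / degree;
    return groups * inner_trips * ii;
}
```

For the table's parameters (4 outer iterations, 5 inner iterations, II of 2), this model gives 40 initiation cycles with `#pragma max_interleaving 1` and 20 with the default setting, which is the trade-off to weigh against the FPGA area saved.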
<quartus_installdir>/hls/examples/tutorials/loop_controls/max_interleaving