6.9.1. Loop Fusion Control (loop_fuse Pragma)
Use the loop_fuse pragma to tell the compiler to try to fuse two adjacent loops without affecting the functionality of either loop, overriding the compiler profitability analysis of fusing the loops.
Fusing adjacent loops reduces the loop control overhead in your component, which can reduce the FPGA area used and can increase performance by executing both loops as one (fused) loop.
#pragma loop_fuse [depth(N)] [independent]
{
...
}
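As a minimal sketch of the syntax, the hypothetical function below wraps two adjacent loops with equal trip counts in a loop_fuse block. The function name scale_and_offset, the trip count N, and the arithmetic are assumptions chosen for illustration only. A compiler that does not recognize the pragma ignores it, so the results are the same whether or not the loops are fused.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 8;  // hypothetical trip count shared by both loops

// Hypothetical example: two adjacent loops that do not depend on each other,
// placed inside a loop_fuse block so the compiler may treat them as a
// fusion candidate pair.
void scale_and_offset(const int in[N], int scaled[N], int shifted[N]) {
#pragma loop_fuse
  {
    for (std::size_t i = 0; i < N; ++i) {
      scaled[i] = in[i] * 2;
    }
    for (std::size_t i = 0; i < N; ++i) {
      shifted[i] = in[i] + 1;
    }
  }
}
```

Because neither loop reads what the other writes, the outputs do not depend on whether the loops run one after the other or as a single fused loop.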
By default (or with depth(1)), only loops L1 and L2 are initially considered for fusing.
With depth(2), the following loop pairs are initially considered for fusing:
With depth(3), the following loop pairs are initially considered for fusing:
The compiler automatically considers fusing adjacent loops with equal trip counts when the loops meet the criteria. You can also use the loop_fuse pragma to tell the compiler to consider fusing adjacent loops with different trip counts.
With the loop_fuse pragma applied to a block of code, the compiler always tries to fuse adjacent loops (with equal or different trip counts) in the block whenever the compiler determines that it is safe to fuse the loops. Two loops are considered safe to fuse if they meet the fusion criteria described in the Fusion Criteria section of Loop Fusion.
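The effect of fusing two loops with different trip counts can be pictured with a hand-written equivalent. The sketch below is an illustration only, not the transformation the compiler actually performs: the names unfused_pair and fused_by_hand, the trip counts kN and kM, and the loop bodies are all assumptions. The fused version iterates up to the larger trip count and guards each original body with its own bound.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kN = 4;  // hypothetical trip count of the first loop
constexpr std::size_t kM = 6;  // hypothetical trip count of the second loop

// Two adjacent loops with different trip counts, executed one after the other.
void unfused_pair(int a[kN], int b[kM]) {
  for (std::size_t i = 0; i < kN; ++i) {
    a[i] = static_cast<int>(i) * 2;
  }
  for (std::size_t j = 0; j < kM; ++j) {
    b[j] = static_cast<int>(j) + 10;
  }
}

// Hand-fused equivalent: one loop over the larger trip count, with each
// original loop body guarded by its own bound.
void fused_by_hand(int a[kN], int b[kM]) {
  constexpr std::size_t kMax = (kN > kM) ? kN : kM;
  for (std::size_t i = 0; i < kMax; ++i) {
    if (i < kN) {
      a[i] = static_cast<int>(i) * 2;
    }
    if (i < kM) {
      b[i] = static_cast<int>(i) + 10;
    }
  }
}
```

Both functions fill the arrays with the same values; the fused form needs only one set of loop control logic.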
#pragma loop_fuse
{
  L1: for(...) {}
  L2: for(...) {
    L3: for(...) {}
  }
  L4: for(...) {}
}
Use the independent option to override the dependency safety checks. If you specify the independent option, you are guaranteeing to the compiler that fusing pairs of loops affected by the loop_fuse pragma is safe. That is, there are no negative-distance dependencies in the fused loop. If it is not safe, you might get functional errors in your component.
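To see why a negative-distance dependency makes fusion unsafe, consider the hypothetical pair of loops below, written once as separate loops and once fused by hand. The names unfused_dep and fused_dep, the length kLen, and the loop bodies are assumptions for illustration. The second loop reads a[i + 1], which the first loop writes one iteration later, so the fused loop reads the value before it is produced.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kLen = 5;  // hypothetical array length

// Unfused: the first loop finishes writing a[] before the second loop reads it.
void unfused_dep(int a[kLen], int c[kLen - 1]) {
  for (std::size_t i = 0; i < kLen; ++i) {
    a[i] = 10 * static_cast<int>(i + 1);
  }
  for (std::size_t i = 0; i < kLen - 1; ++i) {
    c[i] = a[i + 1];
  }
}

// Fused by hand: iteration i reads a[i + 1] before iteration i + 1 writes it,
// so c[] receives stale values. This is the kind of negative-distance
// dependency that makes fusion unsafe.
void fused_dep(int a[kLen], int c[kLen - 1]) {
  for (std::size_t i = 0; i < kLen; ++i) {
    a[i] = 10 * static_cast<int>(i + 1);
    if (i < kLen - 1) {
      c[i] = a[i + 1];  // a[i + 1] still holds its old value here
    }
  }
}
```

With zero-initialized arrays, the unfused version stores the newly written values in c, while the fused version stores the old zeros. Specifying independent on loops like these would amount to asserting that no such dependency exists.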
Function Calls In loop_fuse Code Blocks
If a function call occurs in a code block annotated with the loop_fuse pragma and the called function contains a loop, the loop that results from inlining the call can be a candidate for loop fusion.
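The following sketch shows the shape of code this applies to. The helper fill_squares, the caller compute, and the size kSize are hypothetical names chosen for illustration, and whether the call is actually inlined and then fused is up to the compiler.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kSize = 4;  // hypothetical trip count

// Hypothetical helper whose body is a single loop.
inline void fill_squares(int out[kSize]) {
  for (std::size_t i = 0; i < kSize; ++i) {
    out[i] = static_cast<int>(i * i);
  }
}

// If the call below is inlined, the helper's loop becomes adjacent to the
// loop that follows it inside the loop_fuse block.
void compute(int squares[kSize], int doubles[kSize]) {
#pragma loop_fuse
  {
    fill_squares(squares);
    for (std::size_t i = 0; i < kSize; ++i) {
      doubles[i] = 2 * static_cast<int>(i);
    }
  }
}
```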
Nested depth(N) Clauses
When you nest loop_fuse pragmas, you might create overlapping sets of candidate loops.
#pragma loop_fuse depth(2) independent
{
  L1: for(...) {}
  L2: for(...) {
    #pragma loop_fuse depth(2)
    {
      L3: for(...) {}
      L4: for(...) {
        L5: for(...) {}
        L6: for(...) {}
      }
    }
  }
}
In this example, the compiler considers the following loop pairs for fusion: L1/L2, L3/L4, and L5/L6. In addition, the compiler overrides its negative-distance dependency analysis for the following loop pairs: L1/L2 and L3/L4.
Related tutorial: <quartus_installdir>/hls/examples/tutorials/best_practices/loop_fusion