Intel® FPGA SDK for OpenCL™ Standard Edition: Programming Guide
5.2.1. Unrolling a Loop
Loop unrolling involves replicating a loop body multiple times, and reducing the trip count of a loop. Unroll loops to reduce or eliminate loop control overhead on the FPGA. In cases where there are no loop-carried dependencies and the offline compiler can perform loop iterations in parallel, unrolling loops can also reduce latency and overhead on the FPGA.
The Intel® FPGA SDK for OpenCL™ Offline Compiler might unroll simple loops even if they are not annotated by a pragma.
- Provide an unroll factor whenever possible. To specify an unroll factor N, insert the #pragma unroll <N> directive before a loop in your kernel code.
The offline compiler attempts to unroll the loop at most <N> times.
Consider the code fragment below. By assigning a value of 2 as the unroll factor, you direct the offline compiler to unroll the loop twice.
#pragma unroll 2
for(size_t k = 0; k < 4; k++)
{
    mac += data_in[(gid * 4) + k] * coeff[k];
}
- To unroll a loop fully, you may omit the unroll factor by simply inserting the #pragma unroll directive before a loop in your kernel code.
The offline compiler attempts to unroll the loop fully if it understands the trip count. The offline compiler issues a warning if it cannot execute the unroll request.
- To prevent a loop from unrolling, specify an unroll factor of 1 (that is, #pragma unroll 1).
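To make the effect of the unroll factor concrete, the following host-side C sketch models the multiply-accumulate loop from the fragment above in three forms: rolled, unrolled by a factor of 2, and fully unrolled. It is only an illustration of the transformation, not output of the offline compiler. The function names, the TAPS macro, and the float element type are assumptions introduced for this example; the original fragment does not state the types of mac, data_in, or coeff.

```c
#include <assert.h>
#include <stddef.h>

#define TAPS 4 /* trip count of the original loop (assumed name) */

/* Rolled form: the loop as written in the kernel source.
 * Corresponds to "#pragma unroll 1", which keeps the loop rolled. */
float mac_rolled(const float *data_in, const float *coeff, size_t gid)
{
    assert(data_in != NULL && coeff != NULL);
    float mac = 0.0f;
    for (size_t k = 0; k < TAPS; k++) {
        mac += data_in[(gid * TAPS) + k] * coeff[k];
    }
    return mac;
}

/* Roughly what "#pragma unroll 2" requests: the body is replicated
 * twice and the trip count drops from 4 to 2. */
float mac_unrolled_by_2(const float *data_in, const float *coeff, size_t gid)
{
    assert(data_in != NULL && coeff != NULL);
    float mac = 0.0f;
    for (size_t k = 0; k < TAPS; k += 2) {
        mac += data_in[(gid * TAPS) + k] * coeff[k];
        mac += data_in[(gid * TAPS) + k + 1] * coeff[k + 1];
    }
    return mac;
}

/* Roughly what "#pragma unroll" with no factor requests when the trip
 * count is known: no loop control remains at all. */
float mac_fully_unrolled(const float *data_in, const float *coeff, size_t gid)
{
    assert(data_in != NULL && coeff != NULL);
    const float *d = data_in + (gid * TAPS);
    float mac = 0.0f;
    mac += d[0] * coeff[0];
    mac += d[1] * coeff[1];
    mac += d[2] * coeff[2];
    mac += d[3] * coeff[3];
    return mac;
}
```

All three functions perform the same additions in the same order, so they return the same value for a given input. What changes is the amount of loop control: the rolled form tests and increments k four times, the factor-2 form twice, and the fully unrolled form not at all. On the FPGA, each replicated body also consumes additional hardware resources, so a larger unroll factor trades area for reduced loop overhead.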