Pipeline Loops in Non-task Kernels (-Xsauto-pipeline)
To direct the Intel® oneAPI DPC++/C++ Compiler to pipeline loops in non-task (parallel_for) kernels when it compiles your design, include the -Xsauto-pipeline option in your icpx command. The host program invokes non-task kernels through one of the kernel execution functions parallel_for, parallel_for_work_item, or parallel_for_work_group.
Example
icpx -fintelfpga -Xshardware -Xsauto-pipeline <source_file>.cpp
With the -Xsauto-pipeline option, the compiler attempts to pipeline the loops in your design, but pipelining is not guaranteed. If you do not include the -Xsauto-pipeline option, the compiler does not pipeline the loops in parallel_for kernels; instead, it executes different work items in parallel.
The -Xsauto-pipeline option might improve or degrade performance depending on the memory access pattern in your design.
- If auto-pipelining is successful, the Loop Analysis report displays the message Auto-pipelined parallel_for and parallel_for rewritten as a pipelined single_task (Details pane). The compiler-generated loops appear marked as Compiler generated auto-pipeline loop in the report.
- If the compiler chooses not to auto-pipeline the loops, the Loop Analysis report displays a message for the kernel. The reason for not auto-pipelining can be one of the following:
  - A barrier in the function is not at the top-level function scope.
  - The kernel uses local or private memory.
  - The kernel uses volatile or atomic memory, or channels.
If you want the compiler to auto-pipeline most loops but leave some infrequently used loops unpipelined, apply the [[intel::disable_loop_pipelining]] loop directive to those specific loops when using the -Xsauto-pipeline option. The directive disables pipelining only for the loop it annotates.
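The placement of the directive inside a parallel_for kernel can be sketched as follows. This is a minimal illustration, not code from the compiler documentation: the names process_item and run_kernel are hypothetical, the queue uses the default device rather than an FPGA selector, and the serial fallback exists only so that the sketch also builds without a SYCL toolchain (where the attribute is unknown and is ignored, possibly with a warning).

```cpp
#include <cstddef>
#include <vector>

#if __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp>
#define HAS_SYCL 1
#else
#define HAS_SYCL 0
#endif

// Hypothetical per-work-item computation called from the kernel body.
inline int process_item(int x) {
  int acc = 0;

  // Frequently executed loop: left as a candidate for auto-pipelining.
  for (int k = 1; k <= 4; ++k) {
    acc += x * k;
  }

  // Rarely executed fix-up loop: opt it out of pipelining.
  if (acc < 0) {
    [[intel::disable_loop_pipelining]]
    for (int k = 0; k < 3; ++k) {
      acc += 1;
    }
  }
  return acc;
}

// Hypothetical host wrapper that applies process_item to every element.
inline std::vector<int> run_kernel(const std::vector<int>& in) {
  std::vector<int> out(in.size());
  if (in.empty()) return out;

#if HAS_SYCL
  sycl::queue q;  // assumption: default device; a real design selects the FPGA
  {
    sycl::buffer<int, 1> in_buf(in.data(), sycl::range<1>(in.size()));
    sycl::buffer<int, 1> out_buf(out.data(), sycl::range<1>(out.size()));
    q.submit([&](sycl::handler& h) {
      sycl::accessor a(in_buf, h, sycl::read_only);
      sycl::accessor b(out_buf, h, sycl::write_only, sycl::no_init);
      // Non-task kernel: one work item per element.
      h.parallel_for(sycl::range<1>(in.size()), [=](sycl::id<1> i) {
        b[i] = process_item(a[i]);
      });
    });
  }  // buffers go out of scope here and results are copied back to 'out'
#else
  // Serial host fallback so the sketch runs without SYCL.
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = process_item(in[i]);
  }
#endif
  return out;
}
```

Calling run_kernel({2, -1, 0}) exercises both loops: the negative input takes the fix-up path that carries the directive. Whether the remaining loops are actually pipelined still depends on the compiler, so check the Loop Analysis report after compiling with -Xsauto-pipeline.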