6.1. Addressing Single Work-Item Kernel Dependencies Based on Optimization Report Feedback
The following flowchart outlines the approach you can take to iterate on your design and optimize your single work-item kernel. For usage information on the Intel® FPGA SDK for OpenCL™ Emulator and the Profiler, refer to the Emulating and Debugging Your OpenCL Kernel and Profiling Your OpenCL Kernel sections of the Intel® FPGA SDK for OpenCL™ Programming Guide, respectively. For information on the Intel® FPGA dynamic profiler for OpenCL™ GUI and profiling information, refer to the Profile Your Kernel to Identify Performance Bottlenecks section.
Intel® recommends the following optimization options to address single work-item kernel loop-carried dependencies, in order of applicability: removal, relaxation, simplification, and transfer to local memory.
- Removing Loop-Carried Dependency
Based on the feedback from the optimization report, you can remove a loop-carried dependency by implementing a simpler memory access pattern.
- Relaxing Loop-Carried Dependency
Based on the feedback from the optimization report, you can relax a loop-carried dependency by increasing the dependence distance.
- Transferring Loop-Carried Dependency to Local Memory
For a loop-carried dependency that you cannot remove, improve the II by moving the array with the loop-carried dependency from global memory to local memory.
- Relaxing Loop-Carried Dependency by Inferring Shift Registers
To enable the Intel® FPGA SDK for OpenCL™ Offline Compiler to handle single work-item kernels that carry out double precision floating-point operations efficiently, remove loop-carried dependencies by inferring a shift register.
- Removing Loop-Carried Dependencies Caused by Accesses to Memory Arrays
Include the ivdep pragma in your single work-item kernel to assert that accesses to memory arrays do not cause loop-carried dependencies.
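As an illustration of the shift register approach listed above, the following minimal sketch contrasts a naive double precision accumulation with a version that spreads the running sum across a small array of partial sums. It is written in plain C so that it can be compiled and checked on a host. The function names `sum_naive` and `sum_shift_register` and the value of `II_CYCLES` are assumptions made for this example, not values taken from an optimization report. In an actual single work-item kernel, the same loop body would sit inside a `__kernel` function, and the two inner loops would typically be fully unrolled (for example with `#pragma unroll`) so that the offline compiler can infer a shift register.

```c
#include <stddef.h>

/* Hypothetical depth of the shift register. In practice this would be
 * chosen to cover the latency of the floating-point adder. */
#define II_CYCLES 8

/* Naive accumulation: every iteration needs the sum produced by the
 * previous iteration, so the dependence distance is 1. */
double sum_naive(const double *arr, size_t n)
{
    double temp_sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        temp_sum += arr[i];
    }
    return temp_sum;
}

/* Shift register accumulation: each iteration reads a partial sum that
 * was written II_CYCLES iterations earlier, which relaxes the
 * loop-carried dependency on the accumulator. */
double sum_shift_register(const double *arr, size_t n)
{
    double shift_reg[II_CYCLES + 1];

    /* Initialize all partial sums to zero. */
    for (size_t j = 0; j < II_CYCLES + 1; ++j) {
        shift_reg[j] = 0.0;
    }

    for (size_t i = 0; i < n; ++i) {
        /* Add the new element to the oldest partial sum and store the
         * result at the tail of the register. */
        shift_reg[II_CYCLES] = shift_reg[0] + arr[i];

        /* Shift every entry one position toward the head.
         * In a kernel this loop would be fully unrolled. */
        for (size_t j = 0; j < II_CYCLES; ++j) {
            shift_reg[j] = shift_reg[j + 1];
        }
    }

    /* Combine the partial sums. The tail entry is skipped because the
     * final shift already copied it into position II_CYCLES - 1. */
    double temp_sum = 0.0;
    for (size_t j = 0; j < II_CYCLES; ++j) {
        temp_sum += shift_reg[j];
    }
    return temp_sum;
}
```

Both functions add the same set of values, but the second one changes the order of the floating-point additions, so results can differ in the last bits for inputs that are not exactly representable. The right register depth depends on the target device and on what the optimization report indicates for the loop.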