Visible to Intel only — GUID: GUID-861FE267-E2BF-455F-A1C5-E33C4A498934
NDRange Kernels
If your program is naturally expressed as multiple concurrent threads operating in a data-parallel manner, write your kernel to execute as parallel instances over a work-item index space (NDRange).
Avoid Work-Item ID-Dependent Backward Branching
The Intel® oneAPI DPC++/C++ Compiler collapses conditional statements into single bits that indicate when a particular functional unit becomes active. It also eliminates simple control flow paths that do not involve looping structures, which results in a flat control structure and more efficient hardware use.
Avoid including any work-item ID-dependent backward branching (that is, branching that occurs in a loop) in your kernel because it degrades performance.
For example, the following code fragment illustrates branching that depends on a work-item ID, such as the value returned by get_global_id or get_local_id:
// The loop bound depends on the work-item ID, so the trip count differs per work-item.
for (size_t i = 0; i < get_global_id(0); i++)
{
    // statements
}
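One possible way to restructure such a loop is to give every work-item the same fixed trip count and move the ID-dependent test into a simple condition inside the loop body, which is the kind of non-looping control flow the compiler can flatten. The following is a minimal host-side sketch in plain C++, not device code. The names kMaxWorkItems, sum_id_dependent, and sum_fixed_bound are hypothetical, the gid parameter stands in for the value a kernel would obtain from get_global_id(0), and the sketch assumes an upper bound on the work-item ID is known at compile time. Whether this rewrite helps a particular design is something to confirm in the compiler reports.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compile-time upper bound on the work-item ID (an assumption of this sketch).
constexpr std::size_t kMaxWorkItems = 8;

// Pattern to avoid: the loop trip count depends on the work-item ID.
// `gid` stands in for the result of get_global_id(0).
int sum_id_dependent(std::size_t gid, const std::vector<int>& data)
{
    int acc = 0;
    for (std::size_t i = 0; i < gid; i++)
    {
        acc += data[i];
    }
    return acc;
}

// Possible alternative: a fixed trip count for every work-item, with the
// ID-dependent test reduced to a simple condition inside the loop body.
int sum_fixed_bound(std::size_t gid, const std::vector<int>& data)
{
    assert(gid <= kMaxWorkItems); // the rewrite is only equivalent within the bound
    int acc = 0;
    for (std::size_t i = 0; i < kMaxWorkItems; i++)
    {
        if (i < gid)
        {
            acc += data[i];
        }
    }
    return acc;
}
```

Both functions return the same result for any gid up to kMaxWorkItems. The trade-off is that work-items with small IDs now execute iterations that do no useful work, so the fixed bound should be as tight as the application allows.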