2.8.1. Kernels
For more information on built-in work-item functions, refer to section 6.11.1: Work-Item Functions of the OpenCL Specification version 1.0.
For single work-item kernels, the offline compiler attempts to pipeline every loop in the kernel to allow multiple loop iterations to execute concurrently. Kernel performance might degrade if the compiler cannot pipeline some of the loops effectively, or if it cannot pipeline the loops at all.
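The effect of pipelining on a single loop can be approximated with a simple cycle-count model. The sketch below is an illustration only, not output from the offline compiler: the function names and the `latency` and `ii` (initiation interval, the number of cycles between the starts of successive iterations) parameters are assumptions introduced here.

```c
#include <assert.h>

/* Hypothetical cost model for illustration; not taken from compiler reports.
 * latency: cycles for one iteration to pass through the loop body.
 * ii:      cycles between the starts of successive iterations. */
unsigned long pipelined_cycles(unsigned long iterations,
                               unsigned long latency,
                               unsigned long ii)
{
    assert(latency > 0 && ii > 0);
    if (iterations == 0)
        return 0;
    /* The first iteration pays the full latency; each later one starts ii cycles after the previous. */
    return latency + (iterations - 1) * ii;
}

/* Same loop with no overlap: each iteration finishes before the next starts. */
unsigned long serial_cycles(unsigned long iterations, unsigned long latency)
{
    assert(latency > 0);
    return iterations * latency;
}
```

Under these assumptions, a 1000-iteration loop with a 10-cycle body takes 1009 cycles when a new iteration starts every cycle, compared with 10000 cycles when iterations cannot overlap. A loop that is pipelined poorly (a large `ii`) falls between those two figures.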
The offline compiler cannot pipeline loops in NDRange kernels. However, these loops can accept multiple work-items simultaneously. A kernel might have multiple loops, each with nested loops. If you tabulate the total number of iterations of the nested loops for each outer loop, kernel throughput is usually reduced by a factor approximately equal to the largest total that you have tabulated.
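The tabulation described above can be sketched as follows. This example assumes that every loop has a fixed trip count and that the total for an outer loop is the product of its own trip count and those of the loops nested inside it. The `loop_nest` structure, `MAX_DEPTH`, and both function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 4

/* One outer loop and the loops nested inside it, outermost first.
 * Hypothetical representation; trip counts are assumed to be fixed. */
struct loop_nest {
    size_t depth;
    unsigned long trip_counts[MAX_DEPTH];
};

/* Total iterations for one outer loop: the product of all trip counts in the nest. */
unsigned long total_iterations(const struct loop_nest *nest)
{
    assert(nest->depth >= 1 && nest->depth <= MAX_DEPTH);
    unsigned long total = 1;
    for (size_t d = 0; d < nest->depth; d++)
        total *= nest->trip_counts[d];
    return total;
}

/* Largest tabulated total across all outer loops in the kernel. */
unsigned long largest_total(const struct loop_nest *nests, size_t count)
{
    unsigned long largest = 0;
    for (size_t i = 0; i < count; i++) {
        unsigned long t = total_iterations(&nests[i]);
        if (t > largest)
            largest = t;
    }
    return largest;
}
```

For a kernel with three outer loops whose trip counts are 16 x 8, 100, and 4 x 4 x 4, the tabulated totals are 128, 100, and 64. The largest total, 128, is the value to compare against when estimating the throughput reduction.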
To execute an NDRange kernel efficiently, you usually need a large number of work-items (threads).