Kernel Attributes
This topic lists the kernel attributes and kernel argument type qualifiers used in OpenCL and their closest equivalents in SYCL*:
OpenCL Syntax | SYCL Syntax | Description |
---|---|---|
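__attribute__((max_work_group_size(X, Y, Z))) | [[intel::max_work_group_size(Z, Y, X)]] [[cl::reqd_work_group_size(Z, Y, X)]] | Specifies a maximum or the required work-group size whenever possible. The SYCL attributes list the dimensions in the reverse order (Z, Y, X) of the OpenCL attribute (X, Y, Z). |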
__attribute__((max_global_work_dim(0))) | [[intel::max_global_work_dim(0)]] | Instructs the compiler to omit logic that generates and dispatches global, local, and group IDs into the compiled kernel. |
__attribute__((num_simd_work_items(N))) | [[intel::num_simd_work_items(N)]] | Specifies the number of single instruction multiple data (SIMD) work items in a work group. |
__attribute__((uses_global_work_offset(0))) | [[intel::no_global_work_offset(1)]] | Instructs the compiler to omit hardware required to support a non-zero or non-NULL global_work_offset argument in NDRange kernels. |
__attribute__((scheduler_target_fmax_mhz(__x))) | [[intel::scheduler_target_fmax_mhz(N)]] | Determines the pipelining effort the scheduler attempts during the scheduling process. |
__attribute__((num_compute_units(6))) | Use C++ meta-programming to replicate your kernel. See the compute_units tutorial on GitHub. | Replicates your single work-item kernel. In many situations where you might use num_compute_units in OpenCL, you can express the same algorithm in SYCL using templated functions, because each unique template instantiation of a kernel results in a different physical copy of the hardware. |
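The template-based replication described in the last row can be sketched in plain C++. This is a minimal illustration, not the code from the compute_units tutorial: the names `ComputeUnit`, `launch_one`, `launch_all`, and `submit_compute_units` are hypothetical, and the SYCL kernel submission is replaced by a stand-in that only records which copy was instantiated, so the sketch compiles without a SYCL toolchain.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical kernel name type: each ID yields a distinct type, which is
// what makes every instantiation a separate kernel in SYCL.
template <std::size_t ID>
struct ComputeUnit;

// Stand-in for one kernel copy. In a SYCL program, this is where the kernel
// named ComputeUnit<ID> would be submitted to a queue (assumption: a
// single_task submission). Here it only records that copy ID was created.
template <std::size_t ID, std::size_t N>
void launch_one(std::array<int, N>& launched) {
  launched[ID] = static_cast<int>(ID) + 1;
}

// Expands the index pack into launch_one<0>, launch_one<1>, ..., launch_one<N-1>.
template <std::size_t N, std::size_t... IDs>
void launch_all(std::array<int, N>& launched, std::index_sequence<IDs...>) {
  int expand[] = {0, (launch_one<IDs>(launched), 0)...};
  (void)expand;
}

// Creates N distinct instantiations, analogous to num_compute_units(N).
template <std::size_t N>
std::array<int, N> submit_compute_units() {
  std::array<int, N> launched{};
  launch_all(launched, std::make_index_sequence<N>{});
  return launched;
}
```

Calling `submit_compute_units<6>()` instantiates `launch_one` six times, once per ID. In a SYCL design, each of those instantiations would correspond to one physical copy of the kernel hardware.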
Kernel Argument Type Qualifiers
In SYCL, kernel arguments need not be explicitly listed as they are in OpenCL kernels. Information that was conveyed through kernel argument attributes in OpenCL is instead conveyed through properties in SYCL as described in the following table:
OpenCL | SYCL | Description |
---|---|---|
buffer_location<index> | intel::buffer_location<N> | Instructs the host to allocate a buffer to a specific global memory type. |
volatile | Not supported | The volatile keyword tells the compiler that the pointer or variable data can be changed by code outside the scope of the current block of code. If you want to work cooperatively on some memory across two different kernels or work-items in SYCL, you must use synchronization primitives such as atomics, pipes, fences, or barriers. |
restrict | Use ext::oneapi::no_alias. Consider the following example: ext::oneapi::accessor_property_list PL{ext::oneapi::no_alias}; accessor acc(buffer, cgh, PL); | In SYCL, no_alias notifies the Intel® oneAPI DPC++/C++ Compiler that all modifications to the memory locations accessed (directly or indirectly) by an accessor during kernel execution are made through the same accessor (directly or indirectly) and not by any other accessor or USM pointer in the kernel. |
autorun | Submit the kernel to a queue from within the constructor of a global variable (at global scope). | Launches the kernel before main(). For more information about how to create an equivalent of OpenCL autorun kernels in oneAPI, refer to the oneAPI-specific Autorun Kernels tutorial on GitHub*. |
local_mem_size<N> | Use function scope local memory. | Specifies a local memory size other than the default of 16 kilobytes (kB). |
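The autorun row relies on a standard C++ guarantee: objects at global scope are constructed before main() starts. The following sketch shows only that mechanism. The names `AutorunLauncher`, `launch_log`, and `my_autorun_kernel` are hypothetical, and the kernel submission itself is replaced by a log entry, so this is not the code from the Autorun Kernels tutorial.

```cpp
#include <string>
#include <vector>

// Records launch events. A function-local static guarantees the log exists
// before any global object's constructor writes to it.
inline std::vector<std::string>& launch_log() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical launcher type. Its constructor is where a SYCL program would
// submit the kernel to a queue (assumption); here it only logs the name.
struct AutorunLauncher {
  explicit AutorunLauncher(const char* kernel_name) {
    launch_log().push_back(kernel_name);
  }
};

// Global-scope object: constructed during static initialization, so the
// "launch" happens before main() begins.
static AutorunLauncher my_autorun{"my_autorun_kernel"};
```

By the time main() runs, `launch_log()` already holds one entry for `my_autorun_kernel`. The function-local static is a deliberate choice: it avoids depending on the initialization order of two separate global objects.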