FPGA Kernel Attributes

Developer Guide

Intel oneAPI FPGA Handbook

ID 785441

Date 2/07/2024

The following table summarizes kernel attributes:

FPGA Kernel Attributes
Attribute	Description	Example
`[[intel::scheduler_target_fmax_mhz(N)]]`	Specifies the target fMAX, in MHz, which determines the pipelining effort the scheduler attempts during the scheduling process.	`[[intel::scheduler_target_fmax_mhz(SCHEDULER_TARGET_FMAX)]] { for (unsigned i = 0; i < SIZE; i++) { accessorRes[0] += accessorIdx[i] * 2; } }`
`[[intel::max_work_group_size(Z, Y, X)]]`	Specifies the maximum work-group size so that the compiler can optimize hardware use for the SYCL kernel without generating excess logic.	`[[intel::max_work_group_size(1,1,MAX_WG_SIZE)]] { accessorRes[wiID] = accessorIdx[wiID] * 2; }`
`[[intel::max_global_work_dim(0)]]`	Omits the logic that generates and dispatches global, local, and group IDs in the compiled kernel. This attribute is deprecated: the compiler automatically applies it to any single-task kernel, so specifying it explicitly is no longer required.	`[[intel::max_global_work_dim(0)]] { for (unsigned i = 0; i < SIZE; i++) { accessorRes[i] = accessorIdx[i] * 2; } }`
`[[intel::num_simd_work_items(N)]]`	Specifies the number of work items within a work group that the compiler executes in a SIMD or vectorized manner.	`[[intel::num_simd_work_items(NUM_SIMD_WORK_ITEMS), sycl::reqd_work_group_size(1,1,REQD_WORK_GROUP_SIZE)]] { accessorRes[wiID] = sqrt(accessorIdx[wiID]); }`
`[[intel::no_global_work_offset(1)]]`	Omits the hardware required to support global work offsets.	`[[intel::no_global_work_offset(1)]] { accessorRes[wiID] = accessorIdx[wiID] * 2; }`
`[[intel::kernel_args_restrict]]`	Tells the compiler to ignore potential dependencies (aliasing) between accessor arguments in a SYCL kernel.	`[[intel::kernel_args_restrict]] { for (unsigned i = 0; i < size; i++) { out_accessor[i] = in_accessor[i]; } }`
`[[intel::use_stall_enable_clusters]]`	Directs the compiler to use stall-enable clusters for the kernel, which can reduce its area and latency.	`h.single_task<class KernelComputeStallFree>([=]() [[intel::use_stall_enable_clusters]] { /* The computations in this device kernel use stall-enable clusters. */ Work(accessor_vec_a, accessor_vec_b, accessor_res); });`
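
The `[[intel::kernel_args_restrict]]` row refers to dependencies between accessor arguments. As a rough illustration of what that means, the following is a minimal host-side sketch in plain C++, not SYCL device code. The helper names `copy_elements`, `copy_disjoint`, and `copy_overlapping` are hypothetical and exist only for this example. It runs the same copy loop once on separate buffers and once on overlapping ranges, the situation a compiler has to allow for unless it is told that the arguments do not alias.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a kernel body that reads `in` and writes `out`.
// Plain C++ on the host; not part of SYCL or the oneAPI headers.
void copy_elements(const int* in, int* out, std::size_t n) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i];
    }
}

// Case 1: input and output are separate buffers.
// No iteration reads a value written by another iteration.
std::vector<int> copy_disjoint() {
    std::vector<int> src{1, 2, 3, 4};
    std::vector<int> dst(src.size(), 0);
    copy_elements(src.data(), dst.data(), src.size());
    return dst;
}

// Case 2: the output range overlaps the input range, shifted by one element.
// Iteration i reads the element that iteration i - 1 just wrote, so the
// iterations cannot be reordered or overlapped without changing the result.
std::vector<int> copy_overlapping() {
    std::vector<int> buf{1, 2, 3, 4, 0};
    copy_elements(buf.data(), buf.data() + 1, 4);
    return buf;
}
```

Here `copy_disjoint()` returns `{1, 2, 3, 4}`, while `copy_overlapping()` returns `{1, 1, 1, 1, 1}` because each write feeds the next read. Applying `[[intel::kernel_args_restrict]]` is therefore only appropriate when the kernel's accessor arguments refer to non-overlapping memory, as in the first case.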
