Developer Guide

Intel oneAPI FPGA Handbook

ID 785441
Date 2/07/2024
Public

FPGA Kernel Attributes

The following entries summarize the FPGA kernel attributes. Each entry lists the attribute, its description, and an example:

[[intel::scheduler_target_fmax_mhz(N)]]

Sets the target fMAX, in MHz, which determines the pipelining effort the scheduler attempts during the scheduling process.

[[intel::scheduler_target_fmax_mhz(SCHEDULER_TARGET_FMAX)]]
 {
  for (unsigned i = 0; i < SIZE; i++) {
    accessorRes[0] += accessorIdx[i] * 2;
  }
});
[[intel::max_work_group_size(Z, Y, X)]]

Specifies the maximum or required work-group size so that the compiler can optimize hardware use for the SYCL kernel without generating excess logic.

[[intel::max_work_group_size(1,1,MAX_WG_SIZE)]]
 {
  accessorRes[wiID] = accessorIdx[wiID] * 2;
});
[[intel::max_global_work_dim(0)]]

Omits the logic that generates and dispatches global, local, and group IDs into the compiled kernel.

This attribute is deprecated. The compiler adds it automatically to every single-task kernel, so you no longer need to specify it explicitly.

[[intel::max_global_work_dim(0)]]
 {
  for (unsigned i = 0; i < SIZE; i++) {
    accessorRes[i] = accessorIdx[i] * 2;
  }
});
[[intel::num_simd_work_items(N)]]

Specifies the number of work items within a work group that the compiler executes in a SIMD or vectorized manner.

[[intel::num_simd_work_items(NUM_SIMD_WORK_ITEMS),
  sycl::reqd_work_group_size(1,1,REQD_WORK_GROUP_SIZE)]]
 {
  accessorRes[wiID] = sqrt(accessorIdx[wiID]);
});
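
The example above pairs num_simd_work_items with a required work-group size. The following is a minimal host-side sketch in plain C++ (no SYCL headers), assuming the constraint that the required work-group size must be evenly divisible by the SIMD width. The helper name simd_width_divides_work_group and the constants kNumSimdWorkItems and kReqdWorkGroupSize are hypothetical stand-ins for illustration, not part of any oneAPI API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a oneAPI function): returns true when the SIMD
// width is nonzero and evenly divides the required work-group size.
constexpr bool simd_width_divides_work_group(std::size_t simd_width,
                                             std::size_t work_group_size) {
  return simd_width != 0 && work_group_size % simd_width == 0;
}

// Illustrative values standing in for NUM_SIMD_WORK_ITEMS and
// REQD_WORK_GROUP_SIZE in the attribute example; both are assumptions.
constexpr std::size_t kNumSimdWorkItems = 4;
constexpr std::size_t kReqdWorkGroupSize = 64;

// Compile-time check: the build fails if the two values are incompatible.
static_assert(
    simd_width_divides_work_group(kNumSimdWorkItems, kReqdWorkGroupSize),
    "work-group size must be a multiple of the SIMD width");
```

Placing a check like this next to the macro definitions may catch an incompatible pair of values before a long FPGA compile starts.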
[[intel::no_global_work_offset(1)]]

Omits the hardware required to support global work offsets.

[[intel::no_global_work_offset(1)]]
 {
  accessorRes[wiID] = accessorIdx[wiID] * 2;
});
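
To illustrate what a global work offset contributes, here is a rough model in plain C++ of how a one-dimensional global ID can be formed from a group ID, a local ID, and an offset. This is a sketch under the usual OpenCL-style ID arithmetic, not a description of the generated hardware; the function names global_id and global_id_no_offset are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: global ID with an explicit global work offset.
constexpr std::size_t global_id(std::size_t group_id, std::size_t local_size,
                                std::size_t local_id, std::size_t offset) {
  return group_id * local_size + local_id + offset;
}

// Same model when the offset is known to be zero: the extra addition
// (and the argument carrying the offset) is no longer needed.
constexpr std::size_t global_id_no_offset(std::size_t group_id,
                                          std::size_t local_size,
                                          std::size_t local_id) {
  return group_id * local_size + local_id;
}

// With a zero offset both forms agree, so dropping offset support
// does not change the IDs a kernel observes.
static_assert(global_id(2, 8, 3, 0) == global_id_no_offset(2, 8, 3),
              "zero offset must match the offset-free form");
```

In this model, a kernel that is never launched with a nonzero offset loses nothing when offset support is omitted.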
[[intel::kernel_args_restrict]]

Directs the compiler to ignore dependencies between accessor arguments in a SYCL kernel.

[[intel::kernel_args_restrict]]
 {
  for (unsigned i = 0; i < size; i++) {
    out_accessor[i] = in_accessor[i];
  }
});
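
Ignoring dependencies between arguments is only safe when the arguments do not overlap in memory. The following plain C++ sketch (no SYCL headers) shows why: the same copy loop as in the example above gives a different result when the output range overlaps the input range. The helpers ranges_are_disjoint and copy_forward are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>  // std::less gives a total order over pointers

// Hypothetical helper: true when [a, a + n) and [b, b + n) share no element.
template <typename T>
bool ranges_are_disjoint(const T* a, const T* b, std::size_t n) {
  std::less<const T*> before;
  // Disjoint when one range ends at or before the start of the other.
  return !before(a, b + n) || !before(b, a + n);
}

// Copies n elements front to back, mirroring the kernel loop
// out_accessor[i] = in_accessor[i].
inline void copy_forward(const int* in, int* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = in[i];
  }
}
```

If the two ranges overlap, for example when out starts one element after in within the same array, each write feeds a later read and the first element is smeared across the whole output. A check such as ranges_are_disjoint on the host side is one way to confirm the no-overlap assumption before relying on the attribute.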
[[intel::use_stall_enable_clusters]]

Directs the compiler to use stall-enable clusters, which can reduce the area and latency of your kernel, possibly at some cost in fMAX.

h.single_task<class KernelComputeStallFree>([=]()
 [[intel::use_stall_enable_clusters]] {
  // The computations in this device kernel
  // use stall-enable clusters.
  Work(accessor_vec_a, accessor_vec_b, accessor_res);
});