5.2.15. Specifying the use_stall_enable_clusters Cluster-control Attribute
__kernel void __attribute__((use_stall_enable_clusters))
example(__global int * restrict input,
        __global int * restrict output, int size){
    for(int i = 0; i < size; ++i){
        output[i] = input[i];
    }
    ...
}
The Intel® FPGA SDK for OpenCL™ Offline Compiler typically groups related operations into clusters. In several scenarios, these clusters are stall-free. The operations inside a stall-free cluster execute without stalling, and a FIFO at the end of the cluster holds the results when the cluster output is stalled. This FIFO adds area and latency to the kernel, but might allow a higher fMAX and increased throughput.
If you prefer lower FPGA area use and lower latency over higher throughput, use the __attribute__((use_stall_enable_clusters)) attribute to bias the compiler toward stall-enabled clusters. A stall-enabled cluster has no exit FIFO to buffer data when the whole cluster is stalled. Omitting the FIFO reduces area and latency, but stall signals must then be passed to the operations inside the cluster, which might reduce fMAX.
Not all operations support stalling, and operations that do not cannot be placed in a stall-enabled cluster. The compiler generates a warning if some operations cannot be placed into a stall-enabled cluster.
By default, the compiler uses stall-free clusters for kernels because they are generally more beneficial. This attribute requests that the compiler form stall-enabled clusters where possible.
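Because the attribute trades throughput for area and latency, one way to judge its effect on a particular design is to compile the same kernel body with and without it and compare the resulting area, latency, and fMAX figures. The following is a minimal sketch of that setup, not a recommended implementation. The kernel names copy_default and copy_stall_enable and the STALL_ENABLE macro are hypothetical, chosen only for illustration. The #else branch is an assumed host-side shim that strips the OpenCL qualifiers so the file also builds with an ordinary C compiler for a quick functional check; it relies on __OPENCL_VERSION__ being predefined when the file is compiled as OpenCL C.

```c
/* Hypothetical convenience macro: expands to the attribute when compiled as
   OpenCL C, and to nothing when compiled as plain C on the host. */
#ifdef __OPENCL_VERSION__
#define STALL_ENABLE __attribute__((use_stall_enable_clusters))
#else
/* Assumed host-side shim, not part of the SDK: remove OpenCL qualifiers. */
#define __kernel
#define __global
#define STALL_ENABLE
#endif

/* Baseline: no attribute, so the compiler applies its default clustering. */
__kernel void
copy_default(__global const int * restrict input,
             __global int * restrict output, int size) {
    for (int i = 0; i < size; ++i) {
        output[i] = input[i];
    }
}

/* Same body, but requests stall-enabled clusters where possible. */
__kernel void STALL_ENABLE
copy_stall_enable(__global const int * restrict input,
                  __global int * restrict output, int size) {
    for (int i = 0; i < size; ++i) {
        output[i] = input[i];
    }
}
```

Both kernels compute the same result. Any difference between them should appear only in the generated hardware, and only if the compiler is able to honor the request.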