5.10. Inferring a Register
The offline compiler can infer a private array as registers, implemented either as a single value or in a piecewise fashion. Piecewise implementation results in very efficient hardware; however, it requires that the offline compiler be able to determine every data access statically. To facilitate piecewise implementation, hardcode the access points into the array. You can also facilitate register inference by unrolling loops that access the array.
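As a minimal sketch of hardcoded access points, the hypothetical helper below touches a four-element private array only through constant indices. The name sum_taps and the array size are assumptions made for illustration, and the code is written as plain C so that it can be checked on a host. Whether the offline compiler actually implements the array piecewise in registers depends on the compiler version and on the surrounding kernel.

```c
/* Hypothetical example: every index into taps[] is a compile-time
   constant, so each access can be resolved statically. */
#define TAPS 4

int sum_taps(int a, int b, int c, int d)
{
    int taps[TAPS];

    /* Hardcoded access points: no index depends on a run-time value. */
    taps[0] = a;
    taps[1] = b;
    taps[2] = c;
    taps[3] = d;

    return taps[0] + taps[1] + taps[2] + taps[3];
}
```

By contrast, an access such as taps[k], where k is known only at run time, is not inferable statically.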
If array accesses are not inferable statically, the offline compiler might still infer the array as registers, implemented as a single value. However, for single work-item kernels, the offline compiler limits the size of such arrays to 64 bytes. There is effectively no size limit for kernels with multiple work-items.
Consider the following code example:
int array[SIZE];
for (int j = 0; j < N; ++j)
{
    for (int i = 0; i < SIZE - 1; ++i)
    {
        array[i] = array[i + 1];
    }
}
The indexing into array[i] is not inferable statically because the inner loop is not unrolled, so the value of i is not known at compile time. For single work-item kernels, if array occupies 64 bytes or less, the offline compiler implements it in registers as a single value; if array occupies more than 64 bytes, the offline compiler implements the entire array in block RAMs. For multiple work-item kernels, the offline compiler implements array in registers as a single value as long as its size is less than 1 kilobyte (KB).
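One way to make the accesses in a loop like this statically inferable is to unroll the inner loop fully, which the following sketch illustrates. The function name shifted_head, the value of SIZE, and the initialization of the array are assumptions added so that the example is self-contained. It is written as plain C that can be checked on a host, where the compiler may simply ignore the unroll pragma. Treat it as an illustration under those assumptions rather than a guarantee of how the offline compiler implements the array.

```c
#define SIZE 8 /* assumed size: 8 ints, typically 32 bytes */

/* Hypothetical helper: fills a private array with 0..SIZE-1, shifts it
   left by one position n times, and returns the first element. */
int shifted_head(int n)
{
    int array[SIZE];

    /* Full unroll: each index becomes a compile-time constant. */
    #pragma unroll
    for (int i = 0; i < SIZE; ++i)
    {
        array[i] = i;
    }

    for (int j = 0; j < n; ++j)
    {
        /* Unrolling the inner loop turns array[i] and array[i + 1]
           into accesses with constant indices. */
        #pragma unroll
        for (int i = 0; i < SIZE - 1; ++i)
        {
            array[i] = array[i + 1];
        }
    }

    return array[0];
}
```

The outer loop can stay rolled because its counter j never appears in an array index; only the inner loop determines which elements are accessed.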