Visible to Intel only — GUID: GUID-6A6003F1-BAD5-4648-AB10-A8959D4A8B29
Use Preprocessor for Constants
Consider the following kernel:
__kernel void exponentor(__global int* data, const uint exponent)
{
    int tid = get_global_id(0);
    int base = data[tid];
    for (int i = 1; i < exponent; ++i) {
        data[tid] *= base;
    }
}
The number of iterations of the for loop is determined at run time, after the kernel is issued for execution. However, you can use the OpenCL™ dynamic compilation feature to ensure that the exponent is known at kernel compile time, which happens while the host program runs. In this case, the kernel appears as follows:
__kernel void exponentor(__global int* data)
{
    int tid = get_global_id(0);
    int base = data[tid];
    for (int i = 1; i < EXPONENT; ++i) {
        data[tid] *= base;
    }
}
The capitalization indicates that EXPONENT is a preprocessor macro.
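To make the effect of the macro concrete, here is a minimal plain-C sketch, not OpenCL code, that mimics a single work-item of the kernel. The function name exponentor_item, the tid parameter standing in for get_global_id(0), and the fallback value of 3 are all assumptions made for illustration; they do not appear in the original kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed fallback so the code still compiles when no -DEXPONENT=... is given. */
#ifndef EXPONENT
#define EXPONENT 3
#endif

/* Hypothetical plain-C stand-in for one work-item of the kernel;
 * tid plays the role of get_global_id(0). */
static void exponentor_item(int *data, size_t tid)
{
    assert(data != NULL);
    int base = data[tid];
    /* EXPONENT is a literal after preprocessing, so the trip count is fixed. */
    for (int i = 1; i < EXPONENT; ++i) {
        data[tid] *= base;
    }
}
```

With the fallback value of 3, the loop body runs twice, so an input element of 2 becomes 8. Because the bound is a literal once the preprocessor has run, a compiler is free to unroll the loop completely.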
The original version of the host code passes exponent_val through kernel arguments as follows:
clSetKernelArg(kernel, 1, sizeof(cl_uint), &exponent_val);
The updated version uses a compilation step:
sprintf(buildOptions, "-DEXPONENT=%u", exponent_val);
clBuildProgram(program, <...>, buildOptions, <...>);
Thus, the value of EXPONENT is passed during preprocessing of the kernel code. Besides saving stack space used by the kernel, this also enables the compiler to perform optimizations such as loop unrolling or loop elimination. The trade-off is that the program must be rebuilt whenever the value changes.
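The option string can be built with a bounds check so that it is never passed on truncated. The following is a minimal sketch under stated assumptions: format_exponent_option is a hypothetical helper name, and the use of snprintf instead of sprintf is a substitution made here for safety, not something the original host code does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "-DEXPONENT=<value>" into buf.
 * Returns 1 on success, 0 if the buffer is too small. */
static int format_exponent_option(char *buf, size_t size, unsigned int exponent_val)
{
    assert(buf != NULL && size > 0);
    /* snprintf returns the length the full string needs, excluding the terminator. */
    int written = snprintf(buf, size, "-DEXPONENT=%u", exponent_val);
    return written > 0 && (size_t)written < size;
}
```

When the helper reports success, the buffer can be passed as the options argument of clBuildProgram in the same way as buildOptions above. When it reports failure, the safer course is to stop rather than build with a truncated option.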