Intel® FPGA SDK for OpenCL™ Pro Edition: Best Practices Guide
11.2.1. Double Buffered Host Application Utilizing Kernel Invocation Queue
To use the hardware kernel invocation queue while double buffering, structure your host code as shown in the following code snippet:
int main() {
  …
  cl_event dependencies[2];

  for (int i = 0; i < MAX_ITERATIONS; i++) {
    if (i < 2) {
      // First two iterations: neither buffer pair is in use yet, so the
      // write has no dependencies and the kernel waits only on its write.
      clEnqueueWriteBuffer(writeQ, inputBufferD[i%2], CL_FALSE, …, inputBufferH[i], 0, NULL, &writeEvent[i]);
      clFlush(writeQ);

      clSetKernelArg(kernel, 0, sizeof(cl_mem), &inputBufferD[i%2]);
      clSetKernelArg(kernel, 1, sizeof(cl_mem), &outputBufferD[i%2]);
      clEnqueueNDRangeKernel(kernelQ, kernel, …, 1, &writeEvent[i], &kernelEvent[i]);
      clFlush(kernelQ);
    } else {
      // The input buffer is reused: wait until kernel i-2 has consumed it.
      clEnqueueWriteBuffer(writeQ, inputBufferD[i%2], CL_FALSE, …, inputBufferH[i], 1, &kernelEvent[i-2], &writeEvent[i]);
      clFlush(writeQ);

      // The kernel needs its new input (write i) and a free output
      // buffer (read i-2 must have drained it).
      dependencies[0] = writeEvent[i];
      dependencies[1] = readEvent[i-2];
      clSetKernelArg(kernel, 0, sizeof(cl_mem), &inputBufferD[i%2]);
      clSetKernelArg(kernel, 1, sizeof(cl_mem), &outputBufferD[i%2]);
      clEnqueueNDRangeKernel(kernelQ, kernel, …, 2, dependencies, &kernelEvent[i]);
      clFlush(kernelQ);
    }

    // Read the results back once kernel i has finished.
    clEnqueueReadBuffer(readQ, outputBufferD[i%2], CL_FALSE, …, outputBufferH[i], 1, &kernelEvent[i], &readEvent[i]);
    clFlush(readQ);
  }
  …
}
The following diagram visualizes the event dependencies among the write, kernel, and read commands:
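The dependencies set up by the host code can also be written out programmatically. The following is a minimal sketch in plain C that makes no OpenCL calls; the `Deps` struct and the `deps_for_iteration` and `print_deps` helpers are hypothetical names introduced here for illustration only. A value of -1 means "no dependency".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper type for illustration; not part of the OpenCL API.
 * Each field holds the iteration index of the event that must complete
 * first, or -1 when there is no such dependency. */
typedef struct {
    int write_waits_on_kernel; /* write i  waits on kernel i-2 (input buffer reuse)  */
    int kernel_waits_on_write; /* kernel i waits on write i    (input data ready)    */
    int kernel_waits_on_read;  /* kernel i waits on read i-2   (output buffer reuse) */
    int read_waits_on_kernel;  /* read i   waits on kernel i   (results ready)       */
} Deps;

/* Mirrors the wait lists passed to the clEnqueue* calls in the host code. */
Deps deps_for_iteration(int i) {
    Deps d;
    assert(i >= 0);
    d.write_waits_on_kernel = (i < 2) ? -1 : i - 2;
    d.kernel_waits_on_write = i;
    d.kernel_waits_on_read  = (i < 2) ? -1 : i - 2;
    d.read_waits_on_kernel  = i;
    return d;
}

/* Prints one line per command describing what it waits on. */
void print_deps(int iterations) {
    for (int i = 0; i < iterations; i++) {
        Deps d = deps_for_iteration(i);

        printf("write[%d]  waits on:", i);
        if (d.write_waits_on_kernel >= 0)
            printf(" kernel[%d]", d.write_waits_on_kernel);
        else
            printf(" nothing");
        printf("\n");

        printf("kernel[%d] waits on: write[%d]", i, d.kernel_waits_on_write);
        if (d.kernel_waits_on_read >= 0)
            printf(", read[%d]", d.kernel_waits_on_read);
        printf("\n");

        printf("read[%d]   waits on: kernel[%d]\n", i, d.read_waits_on_kernel);
    }
}
```

Calling `print_deps(4)` from a `main` function lists the dependencies for the first four iterations. Note that no command ever waits on the kernel of the immediately preceding iteration, which is what allows the next kernel invocation to be queued while the current one is still running.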
The following figure illustrates the order in which the commands execute on the device, assuming that kernel execution takes longer than the reads and writes and that the device supports concurrent reads and writes:
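To get a feel for the resulting schedule, the execution order can be approximated with a small simulation. The sketch below is plain C with no OpenCL calls and rests on several assumptions that are not stated in the text above: each of the three queues executes its commands in order, a command starts as soon as its queue is free and its dependencies have completed, and every write, kernel, and read takes a fixed number of abstract time units. The `Slot` struct and `simulate_schedule` function are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of when each command of one iteration runs. */
typedef struct {
    int write_start, write_end;
    int kernel_start, kernel_end;
    int read_start, read_end;
} Slot;

static int max2(int a, int b) { return a > b ? a : b; }

/* Fills s[0..n-1] with start and end times under the assumptions above.
 * w, k, and r are the assumed durations of a write, a kernel, and a read. */
void simulate_schedule(int n, int w, int k, int r, Slot *s) {
    assert(n > 0 && s != NULL);
    for (int i = 0; i < n; i++) {
        /* Write i: after write i-1 (in-order queue) and kernel i-2. */
        int ready = (i >= 1) ? s[i - 1].write_end : 0;
        if (i >= 2) ready = max2(ready, s[i - 2].kernel_end);
        s[i].write_start = ready;
        s[i].write_end = ready + w;

        /* Kernel i: after write i, kernel i-1 (in-order queue), read i-2. */
        ready = s[i].write_end;
        if (i >= 1) ready = max2(ready, s[i - 1].kernel_end);
        if (i >= 2) ready = max2(ready, s[i - 2].read_end);
        s[i].kernel_start = ready;
        s[i].kernel_end = ready + k;

        /* Read i: after kernel i and read i-1 (in-order queue). */
        ready = s[i].kernel_end;
        if (i >= 1) ready = max2(ready, s[i - 1].read_end);
        s[i].read_start = ready;
        s[i].read_end = ready + r;
    }
}
```

For example, with assumed durations of 1 unit per write, 4 units per kernel, and 1 unit per read, four iterations give kernel start times of 1, 5, 9, and 13. Each kernel starts exactly when the previous one ends, because the writes and reads for neighboring iterations complete while a kernel is still running.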