Intel® FPGA SDK for OpenCL™ Pro Edition: Best Practices Guide

ID 683521

Date 3/28/2022

Public

A newer version of this document is available.

11.2.1. Double Buffered Host Application Utilizing Kernel Invocation Queue

Double buffering in an OpenCL host application allows the OpenCL runtime environment to overlap memory transfers with kernel execution.

To utilize the hardware kernel invocation queue while double buffering, write your host code as shown in the following code snippet:

int main()
{
  …
  cl_event dependencies[2];
  for (int i = 0; i < MAX_ITERATIONS; i++) {
    if (i < 2) {
      // First use of each device buffer: the write has no dependencies.
      clEnqueueWriteBuffer(writeQ, inputBufferD[i%2], CL_FALSE, …, inputBufferH[i], 0, NULL, &writeEvent[i]);
      clFlush(writeQ);
      // A buffer argument is passed as a cl_mem, so its size is sizeof(cl_mem).
      clSetKernelArg(kernel, 0, sizeof(cl_mem), &inputBufferD[i%2]);
      clSetKernelArg(kernel, 1, sizeof(cl_mem), &outputBufferD[i%2]);
      // The kernel waits only for its own input write.
      clEnqueueNDRangeKernel(kernelQ, kernel, …, 1, &writeEvent[i], &kernelEvent[i]);
      clFlush(kernelQ);
    } else {
      // The input buffer is reused: wait for the kernel that last read it.
      clEnqueueWriteBuffer(writeQ, inputBufferD[i%2], CL_FALSE, …, inputBufferH[i], 1, &kernelEvent[i-2], &writeEvent[i]);
      clFlush(writeQ);
      // The kernel waits for its input write and for the read that drained
      // the output buffer it is about to overwrite.
      dependencies[0] = writeEvent[i];
      dependencies[1] = readEvent[i-2];
      clSetKernelArg(kernel, 0, sizeof(cl_mem), &inputBufferD[i%2]);
      clSetKernelArg(kernel, 1, sizeof(cl_mem), &outputBufferD[i%2]);
      clEnqueueNDRangeKernel(kernelQ, kernel, …, 2, dependencies, &kernelEvent[i]);
      clFlush(kernelQ);
    }
    // Read the results back once kernel i has finished.
    clEnqueueReadBuffer(readQ, outputBufferD[i%2], CL_FALSE, …, outputBufferH[i], 1, &kernelEvent[i], &readEvent[i]);
    clFlush(readQ);
  }
  …
}

The following diagram helps you visualize the event dependencies:

Note: Arrows represent the source of an event in the event wait list.

Figure 93. Event Dependency Graph
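
The dependency graph follows directly from the wait lists passed in the loop above, so it can also be written out as a small lookup. The following is a minimal sketch in plain C that models only the wait lists and makes no OpenCL calls. The names cmd_kind, cmd_ref, wait_list, and print_dependencies are hypothetical and are introduced here for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model: a command is identified by its kind and iteration. */
typedef enum { CMD_WRITE, CMD_KERNEL, CMD_READ } cmd_kind;

typedef struct {
    cmd_kind kind;
    int iter;
} cmd_ref;

/* Fills deps[] with the commands whose events command (kind, i) waits on,
 * mirroring the wait lists in the host loop. Returns the count (0 to 2). */
int wait_list(cmd_kind kind, int i, cmd_ref deps[2])
{
    int n = 0;
    assert(i >= 0);
    switch (kind) {
    case CMD_WRITE:
        /* From iteration 2 on, the input buffer is reused, so the write
         * waits for the kernel that last read that buffer. */
        if (i >= 2) deps[n++] = (cmd_ref){ CMD_KERNEL, i - 2 };
        break;
    case CMD_KERNEL:
        /* The kernel always waits for its own input write. */
        deps[n++] = (cmd_ref){ CMD_WRITE, i };
        /* From iteration 2 on, it also waits for the read that drained
         * the output buffer it is about to overwrite. */
        if (i >= 2) deps[n++] = (cmd_ref){ CMD_READ, i - 2 };
        break;
    case CMD_READ:
        /* The read waits for the kernel that produced the data. */
        deps[n++] = (cmd_ref){ CMD_KERNEL, i };
        break;
    }
    return n;
}

static const char *kind_name(cmd_kind k)
{
    return k == CMD_WRITE ? "write" : k == CMD_KERNEL ? "kernel" : "read";
}

/* Prints one line per command listing the events in its wait list. */
void print_dependencies(int iterations)
{
    for (int i = 0; i < iterations; i++) {
        for (int k = CMD_WRITE; k <= CMD_READ; k++) {
            cmd_ref deps[2];
            int n = wait_list((cmd_kind)k, i, deps);
            printf("%s[%d] waits on:", kind_name((cmd_kind)k), i);
            for (int d = 0; d < n; d++)
                printf(" %s[%d]", kind_name(deps[d].kind), deps[d].iter);
            printf("%s\n", n == 0 ? " (nothing)" : "");
        }
    }
}
```

Calling print_dependencies with a small iteration count lists every edge of the graph in text form. Only the first two writes have an empty wait list, because they fill device buffers that no earlier command has touched.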

The following figure illustrates the order in which the commands are executed on the device, assuming that kernel execution takes longer than reads and writes and that the device supports concurrent reads and writes:

Figure 94. Order of Event Execution
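
The execution order can be approximated with an idealized timing model. The sketch below is not a measurement and makes no OpenCL calls. It assumes that writeQ, kernelQ, and readQ are each in-order queues, that every command of a given type takes a fixed number of time units, that launch and transfer overheads are zero, and that writes and reads can run concurrently. The names timeline, simulate, and MODEL_MAX_ITERATIONS are hypothetical.

```c
#include <assert.h>

#define MODEL_MAX_ITERATIONS 64

/* Start time of every command, in arbitrary time units. */
typedef struct {
    int write_start[MODEL_MAX_ITERATIONS];
    int kernel_start[MODEL_MAX_ITERATIONS];
    int read_start[MODEL_MAX_ITERATIONS];
} timeline;

static int max2(int a, int b) { return a > b ? a : b; }

/* Computes idealized start times. A command starts once the previous
 * command in its own (in-order) queue has finished and every event in
 * its wait list has completed. */
void simulate(int iterations, int t_write, int t_kernel, int t_read,
              timeline *t)
{
    assert(iterations >= 0 && iterations <= MODEL_MAX_ITERATIONS);
    for (int i = 0; i < iterations; i++) {
        /* Write i: after write i-1 and, once buffers are reused,
         * after kernel i-2. */
        int w = (i > 0) ? t->write_start[i - 1] + t_write : 0;
        if (i >= 2) w = max2(w, t->kernel_start[i - 2] + t_kernel);
        t->write_start[i] = w;

        /* Kernel i: after write i, kernel i-1, and read i-2. */
        int k = w + t_write;
        if (i > 0) k = max2(k, t->kernel_start[i - 1] + t_kernel);
        if (i >= 2) k = max2(k, t->read_start[i - 2] + t_read);
        t->kernel_start[i] = k;

        /* Read i: after kernel i and read i-1. */
        int r = k + t_kernel;
        if (i > 0) r = max2(r, t->read_start[i - 1] + t_read);
        t->read_start[i] = r;
    }
}
```

With t_write = 1, t_kernel = 4, and t_read = 1, this model starts the kernels at times 1, 5, 9, and 13, that is, back to back with no idle gap between them. Write 2 and read 0 both start at time 5, while kernel 1 is running, which is the overlap that double buffering is meant to provide.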
