Intel® FPGA SDK for OpenCL™ Pro Edition: Best Practices Guide

ID 683521
Date 12/13/2021
Public

4.1. Transferring Data Via Intel® FPGA SDK for OpenCL™ Channels or OpenCL Pipes

To increase data transfer efficiency between kernels, implement the Intel® FPGA SDK for OpenCL™ channels extension in your kernel programs. If you want the capabilities of channels but also need your kernel program to run with other SDKs, implement OpenCL pipes instead.

Sometimes, FPGA-to-global memory bandwidth constrains the data transfer efficiency between kernels. The theoretical maximum FPGA-to-global memory bandwidth varies depending on the number of global memory banks available in the targeted Custom Platform and board. To determine the theoretical maximum bandwidth for your board, refer to your board vendor's documentation.

In practice, a kernel does not achieve 100% utilization of the maximum global memory bandwidth available. The level of utilization depends on the access pattern of the algorithm.
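
As a rough illustration of the two preceding paragraphs, the following C sketch estimates the bandwidth a kernel might sustain. It assumes, as a simplification that this guide does not state, that the theoretical maximum scales linearly with the number of global memory banks. The function names, the per-bank bandwidth, and the utilization fraction are hypothetical. Take real values from your board vendor's documentation and from profiling.

```c
#include <stddef.h>

/* Hypothetical model: theoretical peak = banks * per-bank bandwidth.
 * Assumes every bank contributes equally; a real board may differ. */
static double theoretical_bandwidth_gbs(size_t banks, double per_bank_gbs) {
    return (double)banks * per_bank_gbs;
}

/* Sustained bandwidth for a kernel that reaches only a fraction
 * (0.0 to 1.0) of the peak because of its memory access pattern. */
static double effective_bandwidth_gbs(double peak_gbs, double utilization) {
    return peak_gbs * utilization;
}
```

For example, with two banks at an assumed 12.5 GB/s each and an assumed utilization of 0.5, this model gives a peak of 25 GB/s and a sustained bandwidth of 12.5 GB/s.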

If global memory bandwidth is a performance constraint for your OpenCL kernel, first try to break down the algorithm into multiple smaller kernels. Then, as shown in the figure below, eliminate some of the global memory accesses by using the SDK's channels or OpenCL pipes to transfer data between kernels.

Figure 68. Difference in Global Memory Access Pattern as a Result of Channels or Pipes Implementation
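
The reduction in global memory accesses can be approximated with a small counting model, sketched here in C. The model assumes a linear chain of kernels in which each kernel consumes and produces n elements, with one global memory access per element loaded or stored. The function names are hypothetical and are not part of the SDK. The sketch counts accesses only and does not model channel depth or stalls.

```c
#include <stddef.h>

/* Without channels or pipes, every kernel loads its input from global
 * memory and stores its output back: 2 * n accesses per kernel. */
static size_t accesses_global_only(size_t kernels, size_t n) {
    return kernels * 2 * n;
}

/* With adjacent kernels linked by channels or pipes, only the first
 * kernel loads from global memory and only the last kernel stores to it. */
static size_t accesses_with_channels(size_t kernels, size_t n) {
    return kernels == 0 ? 0 : 2 * n;
}

/* Global memory accesses removed by linking the kernels. */
static size_t accesses_saved(size_t kernels, size_t n) {
    return accesses_global_only(kernels, n) - accesses_with_channels(kernels, n);
}
```

Under these assumptions, a chain of three kernels processing 1024 elements performs 6144 global memory accesses without channels and 2048 with them. A single kernel saves nothing, which is why splitting the algorithm into multiple kernels comes first.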

For more information about the usage of channels, refer to the Implementing Intel® FPGA SDK for OpenCL™ Channels Extension section of the Intel® FPGA SDK for OpenCL™ Programming Guide.

For more information about the usage of pipes, refer to the Implementing OpenCL Pipes section of the Intel® FPGA SDK for OpenCL™ Programming Guide.