Developer Guide

Intel oneAPI DPC++/C++ Compiler Handbook for Intel FPGAs

ID 785441
Date 5/08/2024
Public

A newer version of this document is available. Customers should click here to go to the newest version.

Document Table of Contents

Additional Recommendations

Optimizing memory accesses in your kernels can improve overall kernel performance. Consider implementing the following techniques for optimizing memory accesses:

  • Avoid designing systems where one kernel writes an intermediate result to global memory and another kernel reads this data back from global memory. Instead, implement a SYCL* pipe (described in Pipes) between the producer and consumer kernels for direct data transfer. Alternatively, you can merge both kernels into a single larger kernel and use helper functions to logically separate the two original kernels.
  • The Intel® oneAPI DPC++/C++ Compiler implements local memory in FPGAs differently than in GPUs. If your kernel contains code to avoid GPU-specific local memory bank conflicts, remove that code because the compiler generates hardware that avoids local memory bank conflicts automatically whenever possible.