Intel® FPGA SDK for OpenCL™ Pro Edition: Best Practices Guide

ID 683521
Date 12/13/2021
Public

A newer version of this document is available. Customers should click here to go to the newest version.

Document Table of Contents

8.3.1. Constant Cache Memory

Constant memory resides in global memory, but the kernel loads it into an on-chip cache shared by all work-groups at runtime. For example, if you have read-only data that all work-groups use, and the data size of the constant buffer fits into the constant cache, allocate the data to the constant memory. The constant cache is most appropriate for high-bandwidth table lookups that are constant across several invocations of a kernel. The constant cache is optimized for high cache hit performance.

By default, the constant cache size is 16 kB. You can specify the constant cache size by including the -const-cache-bytes=<N> option in your aoc command, where <N> is the constant cache size in bytes.

Unlike global memory accesses that have extra hardware for tolerating long memory latencies, the constant cache suffers large performance penalties for cache misses. If the __constant arguments in your OpenCL™ kernel code cannot fit in the cache, you might achieve better performance with __global const arguments instead. If the host application writes to constant memory that is already loaded into the constant cache, the cached data is discarded (that is, invalidated) from the constant cache.

For more information about the -const-cache-bytes=<N> option, refer to the Configuring Constant Memory Cache Size (-const-cache-bytes=<N>) section of the Intel® FPGA SDK for OpenCL™ Programming Guide.