Visible to Intel only — GUID: GUID-93ACED2B-B2C8-4F2C-AD64-146791ABCC6A
Memory Attributes
The following table lists attributes that give you fine-grained control over how the compiler implements a variable (usually an array) in on-chip memory. Place the attribute immediately before the variable declaration.
Attribute | Syntax | Description |
---|---|---|
bank_bits | [[intel::bank_bits(b0,b1,...,bn)]] | Specifies that the local memory addresses should use bits (b0,b1,...,bn) for bank selection, where (b0,b1,...,bn) are expressed in terms of word addressing, not byte addressing. As a result, the number of banks is equal to 2^(number of bank bits). The bits of the local memory address not included in (b0,b1,...,bn) are used for word selection in each bank. |
bankwidth | [[intel::bankwidth(N)]] | Specifies that the memory system implementing the local variable must have banks that are N bytes wide, where N is a power-of-2 integer value greater than zero. |
doublepump | [[intel::doublepump]] | Specifies that the memory system implementing the local variable must operate at twice the clock frequency of the kernel accessing it. This allows twice as many memory accesses per kernel clock cycle but may reduce the maximum kernel clock frequency. |
force_pow2_depth | [[intel::force_pow2_depth(N)]] | Specifies that the memory implementing the variable or array has a power-of-2 depth. This attribute is enabled if N is 1 and disabled if N is 0. |
max_replicates | [[intel::max_replicates(N)]] | Specifies that the memory implementing the local variable or array has no more than the specified number of replicates to enable simultaneous reads from the datapath. |
fpga_memory | [[intel::fpga_memory(<impl_type>)]] | Specifies that the compiler must implement the local variable in a memory system. You may pass an optional string argument to specify the memory implementation type. Specify <impl_type> as either BLOCK_RAM or MLAB to implement the memory using memory blocks (for example, M20K) or memory logic array blocks (MLABs), respectively. |
merge | [[intel::merge(<key>, <direction>)]] | Merges two or more variables or arrays defined in the same scope in a width-wise or depth-wise manner. All variables with the same key string are merged into the same memory system. The string <direction> can be either width or depth. |
numbanks | [[intel::numbanks(N)]] | Specifies that the memory system implementing the local variable must have N banks, where N is a power-of-2 integer value greater than zero. |
private_copies | [[intel::private_copies(N)]] | Specifies that the memory has a defined number of copies to allow simultaneous iterations of a loop at any given time. When you declare an array in the scope of a loop body, the array is private to that iteration of the loop. However, concurrency is limited if all iterations of the loop must share the same physical memory. Specifying private_copies(N) allows N iterations of the loop to execute concurrently, each with its own private copy of the array. A larger value of N may expose more opportunities for parallel execution, at the cost of higher on-chip memory utilization. private_copies(N) interacts with the max_concurrency attribute applied to the loop. For more information, refer to the max_concurrency Attribute and the FPGA tutorial sample "Private Copies" on GitHub. |
fpga_register | [[intel::fpga_register]] | Specifies that the variable must be carried through the pipeline in registers. The compiler may implement a register variable either exclusively in flip-flops (FFs), or in a combination of FFs and RAM-based FIFOs. You can also apply this attribute to device_global variables. However, the device_global variables must be used in a single kernel and they must not have any host accesses. The fpga_register memory attribute is functionally equivalent to the fpga_datapath class template. RESTRICTION: You cannot apply this attribute to device_global struct/class member variables. The compiler ignores the fpga_register attribute when it is applied to device_global struct/class member variables. |
simple_dual_port | [[intel::simple_dual_port]] | Specifies that the memory implementing the variable or array should have read-only and write-only ports rather than read or write ports. Specifying simple_dual_port forces the compiler to configure the memory's underlying hardware resources in simple dual port mode, which can have area benefits in some corner cases. |
singlepump | [[intel::singlepump]] | Specifies that the memory system implementing the local variable must operate at the same clock frequency as the kernel accessing it. |
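
The bank_bits description can be made concrete with a small host-side model of the addressing rule. The following is a minimal sketch in plain C++, not compiler code: the names BankedAddress, bank_count, and split_address are hypothetical, and the ordering of bits inside the bank index (first listed bit is least significant) is an assumption of the sketch rather than something the table states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type for this sketch; not part of any Intel API.
struct BankedAddress {
  std::uint32_t bank;  // index of the selected bank
  std::uint32_t word;  // word offset inside that bank
};

// Number of banks implied by a bank_bits list: 2^(number of bank bits).
std::uint32_t bank_count(const std::vector<unsigned> &bank_bits) {
  assert(bank_bits.size() < 32);
  return 1u << bank_bits.size();
}

// Splits a word address into (bank, word-in-bank).
// Assumption: bank_bits[0] becomes the least significant bit of the bank index.
BankedAddress split_address(std::uint32_t word_addr,
                            const std::vector<unsigned> &bank_bits,
                            unsigned addr_width) {
  assert(addr_width <= 32 && bank_bits.size() < 32);
  BankedAddress out{0, 0};
  unsigned next_word_bit = 0;
  for (unsigned bit = 0; bit < addr_width; ++bit) {
    const std::uint32_t value = (word_addr >> bit) & 1u;
    bool is_bank_bit = false;
    for (std::size_t i = 0; i < bank_bits.size(); ++i) {
      if (bank_bits[i] == bit) {
        out.bank |= value << i;  // this address bit selects the bank
        is_bank_bit = true;
      }
    }
    if (!is_bank_bit) {
      out.word |= value << next_word_bit;  // remaining bits select the word
      ++next_word_bit;
    }
  }
  return out;
}
```

Under these assumptions, a 4-bit word address of 5 with bank bits (0,1) maps to bank 1, word 1, and the memory has four banks. Choosing bank bits so that accesses issued in the same cycle land in different banks is the usual reason to set them by hand.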
For additional information, refer to the FPGA tutorial sample "Memory Attributes" on GitHub.
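
To illustrate the difference between the two merge directions, the sketch below models the shape of the memory that results from merging two arrays. It is a simplified model in plain C++, not a description of what the compiler actually generates: MemoryShape, MergeDirection, and merged_shape are hypothetical names, and the model assumes that a width-wise merge concatenates words while a depth-wise merge stacks the arrays in one address space.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical description of one array's memory shape (not an Intel type).
struct MemoryShape {
  std::size_t depth_words;  // number of addressable words
  std::size_t width_bits;   // bits per word
};

enum class MergeDirection { Width, Depth };

// Simplified model of the merged memory shape.
// Width: words sit side by side, so widths add and the deeper array sets depth.
// Depth: arrays are stacked, so depths add and the wider array sets width.
MemoryShape merged_shape(MemoryShape a, MemoryShape b, MergeDirection dir) {
  assert(a.depth_words > 0 && b.depth_words > 0);
  if (dir == MergeDirection::Width) {
    return {std::max(a.depth_words, b.depth_words),
            a.width_bits + b.width_bits};
  }
  return {a.depth_words + b.depth_words,
          std::max(a.width_bits, b.width_bits)};
}
```

In this model, merging two 128-word, 16-bit arrays width-wise gives a 128-word, 32-bit memory, while merging them depth-wise gives a 256-word, 16-bit memory.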
Struct Data Types and Memory Attributes
You can apply memory attributes to the member variables of a struct within the struct declaration. If you also apply memory attributes to an object instantiated from that struct, the attributes on the instantiation override the attributes from the declaration.
Consider the following code example where memory attributes are applied to both a declaration and instantiation:
    struct State {
      [[intel::fpga_memory]] int array[100];
      [[intel::fpga_register]] int reg[4];
    };

    cgh.single_task<class test>([=] {
      struct State S1;
      [[intel::fpga_memory]] struct State S2;
      // some uses
    });
In this example, the compiler splits S1 into two variables: S1.array[100] (implemented in memory) and S1.reg[4] (implemented in registers). However, because the S2 object has the [[intel::fpga_memory]] attribute applied to it, the compiler ignores the attributes applied at the struct declaration and does not split S2.
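
The override rule can be summarized as a small decision function. This is a sketch of the precedence described above, written in plain C++ with hypothetical names (Placement, effective_placement). The fallback to a compiler-chosen default when no attribute is present is an assumption added for completeness.

```cpp
#include <cassert>

// Hypothetical model of where a struct member ends up; not an Intel type.
enum class Placement { Unspecified, Memory, Register };

// Returns the placement that applies to one struct member.
// An attribute on the object instantiation takes precedence over an
// attribute on the member declaration. With neither present, the sketch
// falls back to whatever the compiler would choose (an assumption).
Placement effective_placement(Placement member_attr, Placement object_attr,
                              Placement compiler_default) {
  assert(compiler_default != Placement::Unspecified);
  if (object_attr != Placement::Unspecified) return object_attr;
  if (member_attr != Placement::Unspecified) return member_attr;
  return compiler_default;
}
```

Applied to the example, S1.reg resolves to Register because S1 carries no attribute of its own, while S2.reg resolves to Memory because the attribute on S2 overrides the one on the member.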