7.2. Component Gets Bad Quality of Results
This section describes some common sources of stallable arbitration nodes or excess RAM utilization.
Component Uses More FPGA Resources Than Expected
By default, the Intel® HLS Compiler Standard Edition tries to optimize your component for the best throughput by maximizing the maximum operating frequency (fMAX).
One way to reduce area consumption is to relax the fMAX requirement by setting a target fMAX value with the --clock option of the i++ command. The HLS compiler can often achieve a higher fMAX than you specify, so when you set the target fMAX to a lower value than you need, your design might still achieve an acceptable fMAX while consuming less area.
Incorrect Bank Bits
If you access parts of an array in parallel (either a single- or multidimensional array), you might need to configure the memory bank selection bits.
See Memory Architecture Best Practices for details about how to configure efficient memory systems.
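As a rough illustration of what bank selection bits do, the following plain C++ sketch models how a chosen set of address bits maps array elements to memory banks. It is a simplified model, not the compiler's implementation: it assumes a row-major 4x4 array with one element per memory word, and the names wordAddress, bankIndex, kRows, and kCols are hypothetical helpers introduced only for this example.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical array shape used only for this model.
constexpr unsigned kRows = 4;
constexpr unsigned kCols = 4;

// Word address of a[row][col] in a row-major array,
// assuming one array element per memory word.
unsigned wordAddress(unsigned row, unsigned col) {
  assert(row < kRows && col < kCols);
  return row * kCols + col;
}

// Builds a bank index from the selected address bits.
// "bits" lists address bit positions; in this model the first
// entry becomes the least significant bit of the bank index.
unsigned bankIndex(unsigned address, std::initializer_list<unsigned> bits) {
  unsigned bank = 0;
  unsigned shift = 0;
  for (unsigned bit : bits) {
    bank |= ((address >> bit) & 1u) << shift;
    ++shift;
  }
  return bank;
}
```

In this model, selecting address bits 0 and 1 places a[i][0] through a[i][3] in four different banks, which suits parallel accesses along a row. Selecting bits 2 and 3 instead places a[0][j] through a[3][j] in four different banks, which suits parallel accesses down a column. In component code, the bank bits are requested with a memory attribute (hls_bankbits); the exact geometry also depends on the bank width, which this sketch does not model.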
Conditional Operator Accessing Two Different Arrays of struct Variables
In some cases, if you try to access different arrays of struct variables with a conditional operator, the Intel® HLS Compiler merges the arrays into the same RAM block. You might see stallable arbitration in the Component Memory Viewer because there are not enough Load/Store sites on the memory system.
struct MyStruct {
  float a;
  float b;
};

MyStruct array1[64];
MyStruct array2[64];

// Conditional operator that selects between the two arrays of structs.
MyStruct value = (shouldChooseArray1) ? array1[idx] : array2[idx];
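To avoid this stallable arbitration, replace the conditional operator with an explicit if statement:
MyStruct value;
if (shouldChooseArray1)
{
  value = array1[idx];
}
else
{
  value = array2[idx];
}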
File-Scoped Static Variables
The Intel® HLS Compiler Standard Edition supports file-scoped static variables, but memory attributes that you apply to a static array take effect only if the static array is declared within the component function. Memory attributes applied to file-scoped static variables are ignored. Memory attributes are also ignored if you apply them to array members in a struct or class definition.
If you want to override the default memory settings for an array variable, ensure that the array variable is declared in the scope of the component function where the array variable is used. You can pass pointers to the static array to any subroutines that might access the static array.
This code change is shown in the following example. The code samples and high-level design report views that follow compare two implementations of a component that reads data from a stream into a local memory, then processes the data that is in that local memory.
In the first code example, the local memory is a file-scoped static variable. In the second code example, the local memory is a function-scoped static variable.
The second code example gets better QoR because you can apply memory optimization attributes to the static variable declaration. In this second example, the hls_memory and hls_numbanks(1) attributes force the static array into a single bank of on-chip RAM blocks.
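// First example: file-scoped static array (the memory attributes are ignored).
#include "HLS/hls.h"

hls_memory hls_numbanks(1) static int myStaticArray[64];

// Reads 64 values from the stream into the file-scoped array.
void loadData(ihc::stream_in<int> &intStreamIn)
{
  for (int idx = 0; idx < 64; idx++)
  {
    myStaticArray[idx] = intStreamIn.read();
  }
}

// Returns the largest value in the array (never less than 0).
int findMax()
{
  int maxVal = 0;
  for (int idx = 0; idx < 64; idx++)
  {
    int val = myStaticArray[idx];
    if (val > maxVal)
    {
      maxVal = val;
    }
  }
  return maxVal;
}

component
int dut(ihc::stream_in<int> &intStreamIn)
{
  loadData(intStreamIn);
  return findMax();
}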
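// Second example: function-scoped static array (the memory attributes are applied).
#include "HLS/hls.h"

// Reads 64 values from the stream into the array passed by the caller.
void loadData(ihc::stream_in<int> &intStreamIn, int myStaticArray[64])
{
  for (int idx = 0; idx < 64; idx++)
  {
    myStaticArray[idx] = intStreamIn.read();
  }
}

// Returns the largest value in the array (never less than 0).
int findMax(int myStaticArray[64])
{
  int maxVal = 0;
  for (int idx = 0; idx < 64; idx++)
  {
    int val = myStaticArray[idx];
    if (val > maxVal)
    {
      maxVal = val;
    }
  }
  return maxVal;
}

component
int dut(ihc::stream_in<int> &intStreamIn)
{
  // Declared inside the component function so that the attributes take effect.
  hls_memory hls_numbanks(1) static int myStaticArray[64];
  loadData(intStreamIn, myStaticArray);
  return findMax(myStaticArray);
}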
Cluster Logic
Your design might consume more RAM blocks than you expect, especially if you store many array variables in large registers. The Area Analysis of System report in the high-level design report (report.html) can help find this issue.
The three matrices are stored intentionally in RAM blocks, but the RAM blocks for the matrices account for less than half of the RAM blocks consumed by the component.
If you look further down the report, you might see that many RAM blocks are consumed by Cluster logic or State variable. You might also see that some of your array values that you intended to be stored in registers were instead stored in large numbers of RAM blocks.
Notice the number of RAM blocks that are consumed by Cluster Logic and State.
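You might be able to reduce the number of RAM blocks consumed by cluster logic and state variables with the following changes:
- Pipeline loops instead of unrolling them.
- Store local variables in local RAM blocks (hls_memory memory attribute) instead of large registers (hls_register memory attribute).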
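A minimal sketch of both suggestions follows, assuming a hypothetical component named sumBuffer and a hypothetical 64-element buffer. So that the sketch also compiles with an ordinary C++ compiler, it defines component and hls_memory as empty macros when the HLS/hls.h header is not available; that fallback is an assumption made for illustration and is not part of the HLS flow.

```cpp
#ifdef __has_include
#if __has_include("HLS/hls.h")
#include "HLS/hls.h"
#define SKETCH_HAS_HLS 1
#endif
#endif

#ifndef SKETCH_HAS_HLS
// Fallback for ordinary C++ compilers (illustration only).
#define component
#define hls_memory
#endif

constexpr int kSize = 64;  // hypothetical buffer size

// Hypothetical component: fills a local buffer, then sums it.
component int sumBuffer(int seed) {
  // hls_memory requests RAM blocks for the array instead of registers.
  hls_memory int buffer[kSize];

  // No "#pragma unroll" here: the loop is left rolled so that the
  // compiler can pipeline it instead of replicating its hardware.
  for (int idx = 0; idx < kSize; idx++) {
    buffer[idx] = seed + idx;
  }

  int total = 0;
  for (int idx = 0; idx < kSize; idx++) {
    total += buffer[idx];
  }
  return total;
}
```

Whether this reduces the RAM blocks attributed to cluster logic and state depends on the design, so compare the Area Analysis of System report before and after the change.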