Area Estimates
The <project_dir>/reports/report.html file contains information about the area use of your DPC++ system.
The report provides the following information:
- Detailed area breakdown of the whole DPC++ system, mapped to your source code where possible.
- Architectural details that give insight into the generated hardware, along with actionable suggestions to resolve potential inefficiencies.
To view the Area Estimates report, click Area Estimates.
As you can observe in the following figure, the report is divided into three levels of hierarchy:
- System area: Used by all kernels, pipes, interconnects, and board logic.
- Kernel area: Used by a specific kernel, including overheads, for example, dispatch logic.
- Block area: Used by a specific block within a kernel. A block represents a branch-free section of your source code (for example, a loop body). To view the area use information for the source code lines associated with a block, expand the report entry for that block.
The area use data are estimates that the Intel® oneAPI DPC++/C++ Compiler generates. These estimates might differ from the final area utilization results.
Messages in the Area Estimates Report
After you compile your DPC++ application, review the Area Estimates report that the Intel® oneAPI DPC++/C++ Compiler generates. In addition to summarizing the application’s resource use, the Area Estimates report offers suggestions on modifying your design to improve efficiency. Refer to the following sections that describe various messages reported in the Area Estimates report.
The Area Estimates report identifies the amount of logic that the Intel® oneAPI DPC++/C++ Compiler generates for the Custom Platform or board interface. The board interface is the static region of the device that facilitates communication with external interfaces such as PCIe®. The Custom Platform specifies the size of the board interface.
The Area Estimates report identifies the amount of logic that the Intel® oneAPI DPC++/C++ Compiler generates for dispatching kernels.
The Area Estimates report identifies the number of resources your design uses for live values and control logic, and lists them under State. To reduce the reported area consumption under State, modify your design as follows:
- Decrease the size of local variables.
- Decrease the scope of local variables by localizing them whenever possible.
- Decrease the number of nested loops in the kernel.
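The first two suggestions can be sketched in plain C++ (SYCL kernel boilerplate is omitted, and the function names are hypothetical). Both functions return the same result. The first declares a wider-than-needed temporary outside the loop, while the second narrows it and declares it inside the loop body. How much the reported State area changes, if at all, depends on the compiler and the surrounding design, so treat this as an illustration rather than a guaranteed saving.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers: plain C++ standing in for a kernel body.

// Temporary is 64 bits wide and declared outside the loop,
// so its scope spans the whole function.
inline uint32_t sum_scaled_wide(const uint8_t* in, std::size_t n) {
  uint64_t scaled = 0;  // wider than needed: values never exceed 255 * 3
  uint32_t total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    scaled = static_cast<uint64_t>(in[i]) * 3;
    total += static_cast<uint32_t>(scaled);
  }
  return total;
}

// Same computation with a 16-bit temporary scoped to the loop body.
inline uint32_t sum_scaled_narrow(const uint8_t* in, std::size_t n) {
  uint32_t total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    const uint16_t scaled = static_cast<uint16_t>(in[i] * 3);  // max 765
    total += scaled;
  }
  return total;
}
```

After recompiling, compare the State rows of the two versions in the Area Estimates report to see whether the change had any effect for your design.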
The Area Estimates report specifies, under Feedback, the resources that your design uses for loop-carried dependencies.
To reduce the reported area consumption under Feedback, decrease the number and size of loop-carried variables in your design.
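As a minimal sketch of that advice, the plain C++ below (kernel boilerplate omitted, function names hypothetical) computes an integer mean in two ways. The first loop carries two 64-bit values from one iteration to the next. The second carries a single 32-bit accumulator and takes the element count from the loop bound instead. The 32-bit width relies on an assumed input limit of 65536 elements, which the code checks with an assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two loop-carried variables, each 64 bits wide.
inline uint32_t mean_two_carried(const uint16_t* in, std::size_t n) {
  uint64_t sum = 0;    // loop-carried
  uint64_t count = 0;  // loop-carried, duplicates information in n
  for (std::size_t i = 0; i < n; ++i) {
    sum += in[i];
    count += 1;
  }
  return count ? static_cast<uint32_t>(sum / count) : 0u;
}

// One loop-carried variable, 32 bits wide.
inline uint32_t mean_one_carried(const uint16_t* in, std::size_t n) {
  // Assumption for this sketch: at most 65536 elements, so the sum of
  // 16-bit inputs cannot overflow a 32-bit accumulator.
  assert(n <= 65536);
  uint32_t sum = 0;  // the only loop-carried value
  for (std::size_t i = 0; i < n; ++i) {
    sum += in[i];
  }
  return n ? sum / static_cast<uint32_t>(n) : 0u;
}
```

Narrowing an accumulator is only safe when the value range is known, which is why the limit is stated explicitly here.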
The Area Estimates report provides information on the implementation of private memory based on your DPC++ design. For single work-item kernels, the Intel® oneAPI DPC++/C++ Compiler implements private memory differently depending on the type of variable. The compiler implements scalars and small arrays in registers of various configurations (for example, plain registers, shift registers, and barrel shifters), and implements larger arrays in block RAM.
The following table lists the messages and notes for the different private variable storage types:
Implementation of private memory | Message | Notes |
---|---|---|
On-chip block RAM | Private memory implemented in on-chip block RAM. | The block RAM implementation creates a system that is similar to local memory for NDRange kernels. |
On-chip block ROM | None | For each use of an on-chip block ROM, the Intel® oneAPI DPC++/C++ Compiler creates another instance of the same ROM. There is no explicit annotation for private variables that the Intel® oneAPI DPC++/C++ Compiler implements in on-chip block ROM. |
Registers | Implemented using registers of the following size: | Reports that the Intel® oneAPI DPC++/C++ Compiler implements a private variable in registers. The Intel® oneAPI DPC++/C++ Compiler might implement a private variable in many registers. This message provides a list of the registers with their specific widths and depths. |
Shift registers | Implemented as a shift register with <N> or fewer tap points. This is a very efficient storage type. Implemented using registers of the following sizes: | Reports that the Intel® oneAPI DPC++/C++ Compiler implements a private variable in shift registers. This message provides a list of shift registers with their specific widths and depths. The Intel® oneAPI DPC++/C++ Compiler might break a single array into several smaller shift registers depending on its tap points. NOTE: The compiler might overestimate the number of tap points. |
Barrel shifters with registers | Implemented as a barrel shifter with registers due to dynamic indexing. This is a high overhead storage type. If possible, change to compile-time known indexing. The area cost of accessing this variable is shown on the lines where the accesses occur. Implemented using registers of the following size: | Reports that the Intel® oneAPI DPC++/C++ Compiler implements a private variable in a barrel shifter with registers because of dynamic indexing. This row in the report does not specify the full area use of the private variable. The report shows additional area use information on the lines where the variable is accessed. |
- The Area Estimates report annotates memory information on the line of code that declares or uses private memory, depending on its implementation.
- When the Intel® oneAPI DPC++/C++ Compiler implements private memory in on-chip block RAM, the Area Estimates report displays the relevant local-memory-specific messages for the private memory system.
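The barrel-shifter message suggests changing dynamic indexing to compile-time known indexing. The plain C++ sketch below (kernel boilerplate omitted; `kTaps` and the function names are hypothetical) shows one way that change can look for a sliding-window sum. The first version writes into a circular buffer at a position known only at run time. The second moves the data by one position per iteration, so every array index becomes a constant once the fixed-bound inner loops are unrolled, for example with an unroll pragma in the kernel. Which storage type the compiler actually selects is for the report to confirm; this sketch only illustrates the indexing pattern.

```cpp
#include <cstddef>
#include <cstdint>

constexpr int kTaps = 4;  // hypothetical window length

// Circular buffer: the write index depends on a run-time value (head).
inline int32_t window_sum_dynamic(const int16_t* in, std::size_t n) {
  int16_t buf[kTaps] = {};
  int head = 0;
  int32_t last = 0;
  for (std::size_t i = 0; i < n; ++i) {
    buf[head] = in[i];  // dynamic index
    head = (head + 1) % kTaps;
    last = 0;
    for (int t = 0; t < kTaps; ++t) last += buf[t];
  }
  return last;  // sum of the most recent kTaps samples (zero padded)
}

// Shift-register style: the data moves, the indices stay fixed.
inline int32_t window_sum_shift(const int16_t* in, std::size_t n) {
  int16_t sr[kTaps] = {};
  int32_t last = 0;
  for (std::size_t i = 0; i < n; ++i) {
    // Fixed trip count: each index is a constant after unrolling.
    for (int t = kTaps - 1; t > 0; --t) sr[t] = sr[t - 1];
    sr[0] = in[i];
    last = 0;
    for (int t = 0; t < kTaps; ++t) last += sr[t];
  }
  return last;  // same result as window_sum_dynamic
}
```

Both functions return the sum of the last `kTaps` inputs, so the two versions can be swapped and the resulting private-memory messages compared in the report.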