5.5.4. The dla_benchmark Performance Metrics

FPGA AI Suite: PCIe-based Design Example User Guide

ID 768977

Date 7/31/2024

The `-save_run_summary` option makes the `dla_benchmark` demonstration application collect performance metrics during inference. These metrics can help you determine how efficient an architecture is at executing a model.

Note: The `dla_benchmark` application provides throughput in "frames per second". The time per frame (latency) is 1/throughput.
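
As an illustration of the note above, the following minimal sketch converts a throughput figure into a time per frame. The function name and the sample value are hypothetical; they are not part of `dla_benchmark` or its output.

```python
def time_per_frame_ms(throughput_fps: float) -> float:
    """Convert throughput (frames per second) to time per frame (milliseconds)."""
    if throughput_fps <= 0:
        raise ValueError("throughput must be a positive number of frames per second")
    # time per frame in seconds is 1/throughput; multiply by 1000 for milliseconds
    return 1000.0 / throughput_fps


# Hypothetical throughput value, not taken from a real run.
example_fps = 250.0
print(f"{example_fps} fps -> {time_per_frame_ms(example_fps):.2f} ms per frame")
```

The conversion applies to whichever throughput figure you feed it, so a value derived from system throughput includes host overhead, while one derived from IP throughput does not.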

Statistic	Description
Count	The number of times inference was performed. This is set by the `-niter` option.
System duration	The total time between when the first inference request was made and when the last request finished, as measured by the host program.
IP duration	The total time spent on inference. This is reported by the IP on the FPGA.
Latency	The median time of all inference requests made by the host. This includes any overhead from OpenVINO™ or the FPGA AI Suite runtime.
System throughput	The total throughput of the system, including any OpenVINO™ or FPGA AI Suite runtime overhead.
Number of hardware instances	The number of IP instances on the FPGA.
Number of network instances	The number of graphs that the IP processes in parallel.
IP throughput per instance	The throughput of a single IP instance. This is reported by the IP on the FPGA.
IP throughput per f_MAX per instance	The IP throughput per instance value scaled by the IP clock frequency value.
IP clock frequency	The clock frequency, as reported by the IP running on the FPGA device. The `dla_benchmark` application treats this value as the IP core f_MAX value.
Estimated IP throughput per instance	The per-IP throughput, as estimated by the `dla_compiler` command with the `--fanalyze-performance` option.
Estimated IP throughput per f_MAX per instance	The Estimated IP throughput per instance value scaled by the compiler f_MAX estimate.
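
The per-f_MAX statistics in the table can be used to compare measured and estimated efficiency independently of the clock frequency each figure was obtained at. The sketch below assumes that "scaled by" means dividing the throughput by the clock frequency in MHz; the guide does not state the exact formula or units, so treat that as an assumption. The `RunSummary` class and its field names are hypothetical and do not mirror the file that `-save_run_summary` writes.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical container for a few statistics from the table above."""
    ip_throughput_fps: float       # IP throughput per instance
    ip_clock_mhz: float            # IP clock frequency reported by the IP
    est_ip_throughput_fps: float   # Estimated IP throughput per instance
    est_fmax_mhz: float            # compiler f_MAX estimate


def throughput_per_mhz(throughput_fps: float, clock_mhz: float) -> float:
    """Normalize throughput by clock frequency (assumed: fps divided by MHz)."""
    if clock_mhz <= 0:
        raise ValueError("clock frequency must be positive")
    return throughput_fps / clock_mhz


def compare_measured_to_estimated(summary: RunSummary) -> dict:
    """Return both normalized throughputs and the measured/estimated ratio."""
    measured = throughput_per_mhz(summary.ip_throughput_fps, summary.ip_clock_mhz)
    estimated = throughput_per_mhz(summary.est_ip_throughput_fps, summary.est_fmax_mhz)
    return {
        "measured_fps_per_mhz": measured,
        "estimated_fps_per_mhz": estimated,
        "measured_over_estimated": measured / estimated,
    }


# Hypothetical values for illustration only, not results from a real run.
summary = RunSummary(
    ip_throughput_fps=100.0,
    ip_clock_mhz=200.0,
    est_ip_throughput_fps=120.0,
    est_fmax_mhz=300.0,
)
for name, value in compare_measured_to_estimated(summary).items():
    print(f"{name}: {value:.3f}")
```

Under this assumption, a ratio close to 1 suggests that the hardware performs as the compiler predicted once differences in clock frequency are factored out.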
