5.5.4.1. Interpreting System Throughput and Latency Metrics
The system throughput and latency metrics are measured by the host through the OpenVINO™ API. These measurements include any overhead incurred by both the API and the FPGA AI Suite runtime, as well as any time spent waiting to issue inference requests, which depends on the number of available IP instances.
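As a rough illustration of what "measured by the host" means, the following sketch (not the FPGA AI Suite benchmark application itself) times a single blocking inference through the OpenVINO Python API. The model path, device string, and input shape are placeholders; the point is that the timed span covers the API call and runtime overhead, not just the time the IP spends computing.

  import time
  import numpy as np
  from openvino.runtime import Core

  core = Core()
  model = core.read_model("model.xml")                      # hypothetical model path
  compiled = core.compile_model(model, "HETERO:FPGA,CPU")   # device string is an assumption
  request = compiled.create_infer_request()

  dummy_input = np.zeros((1, 3, 224, 224), dtype=np.float32)  # placeholder input shape

  start = time.perf_counter()
  request.infer({0: dummy_input})            # blocking call; includes API and runtime overhead
  latency_s = time.perf_counter() - start    # system latency as observed by the host
  print(f"System latency: {latency_s * 1000.0:.2f} ms")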
In general, the system throughput is defined as follows:

System Throughput (fps) = (Batch Size × Number of Batches) / Total Inference Time

The Batch Size and Number of Batches values are set by the --batch-size and -niter options, respectively.
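For example, with illustrative numbers only (a hypothetical run, not measured data), the formula works out as follows:

  batch_size = 4          # --batch-size
  num_batches = 1000      # -niter
  total_time_s = 8.0      # total wall-clock time measured by the host (hypothetical)

  system_throughput_fps = (batch_size * num_batches) / total_time_s
  print(system_throughput_fps)  # 500.0 frames per second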
For example, consider when -nireq=1 and there is a single IP instance. The system throughput value is approximately the same as the IP-reported throughput value because the runtime can perform only one inference at a time. However, if both the -nireq value and the number of IP instances are greater than one, the runtime can perform requests in parallel. As such, the total system throughput is greater than the individual IP throughput.
In general, the -nireq value should be twice the number of IP instances. This setting enables the FPGA AI Suite runtime to pipeline inference requests, which allows the host to prepare the data for the next request while an IP instance is processing the previous request.
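One way to apply this recommendation is through the OpenVINO AsyncInferQueue, sketched below. The model path, device string, input shape, and instance count are hypothetical; the queue is simply sized to twice the assumed number of IP instances so that requests can be pipelined.

  import numpy as np
  from openvino.runtime import Core, AsyncInferQueue

  core = Core()
  compiled = core.compile_model(core.read_model("model.xml"), "HETERO:FPGA,CPU")

  num_instances = 2                          # hypothetical number of IP instances
  nireq = 2 * num_instances                  # recommended: twice the number of IP instances
  queue = AsyncInferQueue(compiled, nireq)

  dummy_input = np.zeros((1, 3, 224, 224), dtype=np.float32)  # placeholder input shape
  for _ in range(100):
      # start_async blocks only when all nireq requests are busy, so the host can
      # prepare the next input while the IP instances work on earlier requests.
      queue.start_async({0: dummy_input})
  queue.wait_all()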