Run OpenVINO™ Benchmarking Tool
This tutorial shows how to run the benchmark application on an 11th Generation Intel® Core™ processor with an integrated GPU. The application runs in asynchronous mode and estimates deep learning inference engine performance and latency.
Start Docker* Container
Go to the AMR_containers folder:
cd <edge_insights_for_amr_path>/Edge_Insights_for_Autonomous_Mobile_Robots_<version>/AMR_containers
Start the Docker container as root:
./run_interactive_docker.sh eiforamr-full-flavour-sdk:<TAG> root
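To confirm that the container started, you can list the running containers from another terminal. This is a generic Docker check, not a command provided by the SDK:
docker ps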
Set Environment Variables
The environment variables must be set before you can compile and run OpenVINO™ applications.
Run the following script:
source /opt/intel/openvino/bin/setupvars.sh
-- or --
source <OPENVINO_INSTALL_DIR>/bin/setupvars.sh
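To verify that the environment was set, you can print one of the variables that setupvars.sh typically exports (variable names may differ between OpenVINO™ releases):
echo $INTEL_OPENVINO_DIR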
Build Benchmark Application
Change to the samples directory and build the benchmark application with the provided CMake build script, using the following commands:
cd /opt/intel/openvino/inference_engine/samples/cpp
./build_samples.sh
Once the build is successful, go to the directory containing the benchmark application:
cd /root/inference_engine_cpp_samples_build/intel64/Release
-- or --
cd <INSTALL_DIR>/inference_engine_cpp_samples_build/intel64/Release
The benchmark_app application is available inside the Release folder.
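As an optional sanity check, you can confirm that the binary exists before continuing (assuming the default build output path shown above):
ls /root/inference_engine_cpp_samples_build/intel64/Release/benchmark_app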
Input File
Select an image file or a sample video file from the following directory to provide as input to the benchmark application:
cd /root/inference_engine_cpp_samples_build/intel64/Release
Application Syntax and Options
The benchmark application syntax is as follows:
./benchmark_app [OPTION]
In this tutorial, we recommend using the following options:
./benchmark_app -m <model> -i <input> -d <device> -nireq <num_reqs> -nthreads <num_threads> -b <batch>
where:
<model> ------------ The complete path to the model .xml file
<input> ------------ The path to the folder containing the image or sample video file
<device> ----------- The device to run inference on, for example CPU or GPU
<num_reqs> --------- The number of parallel inference requests
<num_threads> ------ The number of CPU threads to use for inference (throughput mode)
<batch> ------------ The batch size
For complete details on the available options, run the following command:
./benchmark_app -h
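If you are more interested in single-request latency than throughput, the application can also run in synchronous mode through the -api option. Confirm with ./benchmark_app -h that this option is available in your build; a minimal sketch using the placeholders above:
./benchmark_app -m <model> -i <input> -d GPU -api sync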
Run the Application
The benchmark application is executed as shown below. This tutorial uses the following settings:
The benchmark application runs on the frozen_inference_graph model.
The number of parallel inference requests is set to 8.
The number of CPU threads to use for inference is set to 8.
The device type is GPU.
./benchmark_app -d GPU -i ~/<dir>/input/ -m /home/eiforamr/workspace/object_detection/src/object_detection/models/ssd_mobilenet_v2_coco/frozen_inference_graph.xml -nireq 8 -nthreads 8
./benchmark_app -d GPU -i /home/eiforamr/data_samples/media_samples/plates_720.mp4 -m /home/eiforamr/workspace/object_detection/src/object_detection/models/ssd_mobilenet_v2_coco/frozen_inference_graph.xml -nireq 8 -nthreads 8
Expected output:
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[ INFO ] Files were added: 1
[ INFO ]     /home/eiforamr/data_samples/media_samples/plates_720.mp4
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
        API version ............ 2.1
        Build .................. 2021.2.0-1877-176bdf51370-releases/2021/2
        Description ....... API
[ INFO ] Device info:
        GPU
        clDNNPlugin version ......... 2.1
        Build ........... 2021.2.0-1877-176bdf51370-releases/2021/2
[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for GPU device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 89.49 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 44714.68 ms
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'image_tensor' precision U8, dimensions (NCHW): 1 3 300 300
[ WARNING ] No supported image inputs found! Please check your file extensions: bmp, dib, jpeg, jpg, jpe, jp2, png, pbm, pgm, ppm, sr, ras, tiff, tif
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 4 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 5 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 6 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 7 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 8 inference requests using 2 streams for GPU, limits: 60000 ms duration)
[ INFO ] First inference took 10.01 ms
[Step 11/11] Dumping statistics report
Count:      9456 iterations
Duration:   60066.11 ms
Latency:    51.33 ms
Throughput: 157.43 FPS
Benchmark Report
Sample execution results using an 11th Gen Intel® Core™ i7-1185GRE processor @ 2.80 GHz:
Read network time (ms) | 89
Load network time (ms) | 44714.68
First inference time (ms) | 10.01
Total execution time (ms) | 60066.11
Total number of iterations | 9456
Latency (ms) | 51.33
Throughput (FPS) | 157.43
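The reported figures are consistent with each other. Throughput follows from the iteration count and total duration, and latency can be approximated from the number of requests kept in flight (an estimate based on Little's law, not an exact formula used by the tool):
Throughput ≈ 9456 iterations / 60.066 s ≈ 157.4 FPS
Latency ≈ 8 requests / 157.43 FPS ≈ 50.8 ms, close to the reported 51.33 ms
The load network time (about 44.7 s) is reported separately and is not part of the 60 s measurement window.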
Troubleshooting
For general robot issues, go to: Troubleshooting for Robot Tutorials.