FPGA AI Suite: Getting Started Guide

ID 768970
Date 11/25/2024
Public

6.7. Performing Inference on the PCIe-Based Example Design

Performing Inference Using JIT Mode

In JIT (just-in-time) mode, the dla_benchmark demonstration application compiles the neural network graph on demand by calling the dla_compiler command when the application runs.

If you do not have images and ground truth files, omit the optional -i and -groundtruth_loc parameters from the command that follows. Without these parameters, the dla_benchmark demonstration application generates randomized image data.

The value for $curarch must match the bitstream that you programmed in Programming the FPGA Device.

imagedir=$COREDLA_WORK/demo/sample_images
xmldir=$COREDLA_WORK/demo/models/public
$COREDLA_WORK/runtime/build_Release/dla_benchmark/dla_benchmark \
    -b=1 \
    -m $xmldir/resnet-50-tf/FP32/resnet-50-tf.xml \
    -d=HETERO:FPGA,CPU \
    -niter=8 \
    -plugins $COREDLA_WORK/runtime/plugins.xml \
    -arch_file $curarch \
    -api=async \
    -perf_est \
    -nireq=4 \
    -bgr \
    -i $imagedir \
    -groundtruth_loc $imagedir/TF_ground_truth.txt
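
The optional handling of -i and -groundtruth_loc described above can be sketched as a small wrapper function. This is a minimal sketch, not part of the FPGA AI Suite: the function name run_jit_benchmark and the DLA_BENCHMARK override variable are hypothetical, and the existence check on the architecture file only confirms that the file is present, not that it matches the programmed bitstream.

```shell
# Sketch only: run_jit_benchmark and DLA_BENCHMARK are hypothetical names,
# not part of the FPGA AI Suite.

run_jit_benchmark() {
    local arch_file="$1"   # architecture file, for example "$curarch"
    local xml_file="$2"    # model .xml file to compile in JIT mode
    local image_dir="$3"   # optional directory of input images
    local gt_file="$4"     # optional ground truth file

    # Stop early if the architecture file is unset or missing.
    if [ ! -f "$arch_file" ]; then
        echo "architecture file not found: '$arch_file'" >&2
        return 1
    fi

    # Arguments that are always passed, copied from the command above.
    set -- -b=1 -m "$xml_file" -d=HETERO:FPGA,CPU -niter=8 \
        -plugins "$COREDLA_WORK/runtime/plugins.xml" \
        -arch_file "$arch_file" -api=async -perf_est -nireq=4 -bgr

    # Add the image arguments only when both inputs exist; otherwise
    # dla_benchmark generates randomized image data.
    if [ -d "$image_dir" ] && [ -f "$gt_file" ]; then
        set -- "$@" -i "$image_dir" -groundtruth_loc "$gt_file"
    fi

    # DLA_BENCHMARK (hypothetical) lets you substitute another command,
    # such as echo, to inspect the arguments without running inference.
    "${DLA_BENCHMARK:-$COREDLA_WORK/runtime/build_Release/dla_benchmark/dla_benchmark}" "$@"
}
```

For example, run_jit_benchmark "$curarch" "$xmldir/resnet-50-tf/FP32/resnet-50-tf.xml" "$imagedir" "$imagedir/TF_ground_truth.txt" passes the same arguments as the command above when the sample images are present, and drops the two image arguments when they are not.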

Performing Inference Using AOT Mode

In AOT (ahead-of-time) mode, the dla_benchmark demonstration application uses a compiled network that the dla_compiler command produced earlier, when you followed the steps in Running the Graph Compiler.

To use AOT mode instead of JIT mode:

  1. Add the -cm argument to specify the name of the file containing the compiled network.
  2. Remove the -perf_est flag. The dla_benchmark demonstration application does not estimate performance in AOT mode.

If you omit the -i and -groundtruth_loc arguments, the dla_benchmark demonstration application generates random input data, which is useful only for performance benchmarking.

gt_file=$COREDLA_WORK/demo/sample_images/TF_ground_truth.txt
$COREDLA_WORK/runtime/build_Release/dla_benchmark/dla_benchmark \
    -b=1 \
    -cm $COREDLA_WORK/demo/RN50_Performance_b1.bin \
    -d=HETERO:FPGA,CPU \
    -niter=8 \
    -plugins $COREDLA_WORK/runtime/plugins.xml \
    -arch_file $curarch \
    -api=async \
    -nireq=4 \
    -bgr \
    -i $COREDLA_WORK/demo/sample_images/ \
    -groundtruth_loc $gt_file

The -cm argument points to the .bin file that you created in Running the Graph Compiler.
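
The two changes that turn a JIT invocation into an AOT invocation can be sketched as an argument filter. This is a minimal sketch under stated assumptions: the function name run_aot_from_jit_args and the DLA_BENCHMARK override variable are hypothetical, the model is assumed to be passed in the two-word form -m <file> as in the JIT example, and replacing -m with -cm mirrors the AOT example above rather than a documented rule.

```shell
# Sketch only: run_aot_from_jit_args and DLA_BENCHMARK are hypothetical names,
# not part of the FPGA AI Suite.

run_aot_from_jit_args() {
    local compiled_bin="$1"   # .bin file produced by dla_compiler
    shift
    local remaining=$#        # number of original JIT arguments left to visit
    local skip_next=0
    local saw_model=0
    local arg

    # Rotate through the original arguments: take one from the front and
    # append its AOT equivalent (if any) to the end of the list.
    while [ "$remaining" -gt 0 ]; do
        arg="$1"
        shift
        remaining=$((remaining - 1))
        if [ "$skip_next" -eq 1 ]; then
            skip_next=0       # drop the .xml path that followed -m
            continue
        fi
        case "$arg" in
            -perf_est)        # no performance estimation in AOT mode
                ;;
            -m)               # swap the model argument for the compiled network
                skip_next=1
                saw_model=1
                set -- "$@" -cm "$compiled_bin"
                ;;
            *)
                set -- "$@" "$arg"
                ;;
        esac
    done

    # If the JIT arguments carried no -m, still add -cm.
    if [ "$saw_model" -eq 0 ]; then
        set -- "$@" -cm "$compiled_bin"
    fi

    # DLA_BENCHMARK (hypothetical) lets you substitute another command,
    # such as echo, to inspect the arguments without running inference.
    "${DLA_BENCHMARK:-$COREDLA_WORK/runtime/build_Release/dla_benchmark/dla_benchmark}" "$@"
}
```

Called as run_aot_from_jit_args "$COREDLA_WORK/demo/RN50_Performance_b1.bin" followed by the JIT arguments, the sketch keeps every other argument in its original order, so the resulting invocation differs from the JIT one only in the -cm and -perf_est changes listed above.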

Inference APIs

The easiest way to evaluate inference with the FPGA AI Suite is to use the dla_benchmark demonstration application. It is included in the example runtime and is built as part of the steps described in Programming the FPGA Device.

The example runtime also includes instructions on how to use the OpenVINO™ Python API to run inference in the JIT style described in Performing Inference Using JIT Mode.

These instructions are located in $COREDLA_WORK/runtime/python_demos/README.md.