FPGA AI Suite: Getting Started Guide

ID 768970
Date 12/16/2024
Public

7. Running the Hostless DDR-Free Design Example

The FPGA AI Suite provides a design example to demonstrate hostless and DDR-free operation of the FPGA AI Suite IP. Graph filters, bias, and FPGA AI Suite IP configurations are stored in on-chip memory on the FPGA device instead of DDR memory on the board.

For more details about DDR-free operation, refer to "DDR-Free Operation" in the FPGA AI Suite IP Reference Manual.

Hardware Requirements

This design example requires the following hardware:

Software Requirements

This design example requires the following software:
  • FPGA AI Suite
  • Quartus® Prime Programmer (either standalone or as part of Quartus® Prime Design Suite).
  • Quartus® Prime System Console (either standalone or as part of Quartus® Prime Design Suite).

Procedure

To run the hostless DDR-free design example with a ResNet-18 PyTorch Model:
  1. Download the ResNet-18 PyTorch model and prepare it with the OpenVINO™ Model Optimizer by running the following commands:
    source ~/build-openvino-dev/openvino_env/bin/activate
    
    omz_downloader --name resnet-18-pytorch \
        --output_dir $COREDLA_WORK/demo/models/
    
    omz_converter --name resnet-18-pytorch \
        --download_dir $COREDLA_WORK/demo/models/ \
        --output_dir $COREDLA_WORK/demo/models/
    Important: The OpenVINO™ Open Model Zoo (OMZ) PyTorch models do not include a softmax operation at the end of the model.
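
Because the converted model ends without a softmax operation, its outputs are raw scores rather than probabilities. If you need probabilities, you can apply softmax yourself during postprocessing. The following is a minimal sketch in plain Python; the function name `softmax` and the sample scores are illustrative and are not part of the FPGA AI Suite.

```python
import math

def softmax(scores):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values only, not real ResNet-18 output.
probs = softmax([2.0, 1.0, 0.1])
```

Softmax preserves the ordering of the scores, so the top-ranked class is the same with or without it.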
  2. Generate the parameter ROMs as .mif files by running the FPGA AI Suite compiler with the following command:
    dlac \
      --batch-size=1 \
      --network-file <path/to/graph> \
      --march $COREDLA_ROOT/example_architectures/AGX7_Streaming_Ddrfree_Resnet18.arch \
      --foutput-format=open_vino_hetero \
      --o <compiler output .bin file name>  \
      --fplugin HETERO:FPGA \
      --dumpdir $COREDLA_ROOT/resnet-18-dlac-out/
          

    The .mif files are created in a directory called parameter_rom in the folder specified by the --dumpdir option.

    For details about creating the .mif files required for DDR-free operation, refer to "Generating Artifacts for DDR-Free Operation" in the FPGA AI Suite Compiler Reference Manual.
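
Before you build the design, it can help to confirm that the compiler actually produced the parameter ROM files. The sketch below lists the .mif files under the parameter_rom directory of a dump directory. The helper name `list_mif_files` is hypothetical, and the sketch assumes only the directory layout described in this step.

```python
from pathlib import Path

def list_mif_files(dump_dir):
    """Return the sorted .mif files found in <dump_dir>/parameter_rom."""
    rom_dir = Path(dump_dir) / "parameter_rom"
    if not rom_dir.is_dir():
        raise FileNotFoundError(f"missing directory: {rom_dir}")
    return sorted(rom_dir.glob("*.mif"))
```

For example, pass the same path that you gave to the --dumpdir option. An empty list suggests that the compiler did not generate the DDR-free artifacts.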

  3. Build the example design with the following command:
    dla_build_example_design.py \
      -n 1 \
      --arch=$COREDLA_ROOT/example_architectures/AGX7_Streaming_Ddrfree_Resnet18.arch \
      --build --build-dir=<path/to/build/dir> \
      --example-design-id=0_STREAMING \
      --seed=1 \
      --parameter-rom-dir $COREDLA_ROOT/resnet-18-dlac-out/parameter_rom/

    Building the example design creates the bitstream needed to program the FPGA device.

    For more information about the dla_build_example_design.py command, refer to "Build Script" in the FPGA AI Suite PCIe-based Design Example User Guide.

  4. Program the FPGA device with the Quartus® Prime Programmer.

    The bitstream used to program the device is <path/to/build/dir>/hw/output_files/top.sof.

    Program the FPGA device with the following command:
    quartus_pgm -c 1 -m jtag -o "p;top.sof@1"

    For more information about the Quartus® Prime Programmer, refer to Quartus® Prime Pro Edition User Guide: Programmer.
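
The programming command in this step refers to top.sof by its bare file name, so it works only when run from the output_files directory. As a sketch, the helper below builds the same quartus_pgm argument list with the full bitstream path derived from the build directory. The function name `programmer_command` and its parameters are hypothetical; the flags are the ones shown in this step.

```python
from pathlib import Path

def programmer_command(build_dir, cable=1, device_index=1):
    """Build the quartus_pgm argument list for the example design bitstream."""
    sof = Path(build_dir) / "hw" / "output_files" / "top.sof"
    # "p" selects the program operation; "@<n>" selects the device
    # position in the JTAG chain.
    operation = f"p;{sof.as_posix()}@{device_index}"
    return ["quartus_pgm", "-c", str(cable), "-m", "jtag", "-o", operation]

# Illustrative build directory only.
cmd = programmer_command("/tmp/build")
```

You could pass the resulting list to `subprocess.run(cmd, check=True)` on a system where quartus_pgm is on the PATH.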

  5. Use the Quartus® Prime System Console to run inference on the example design.

    Because this example design is hostless, operations that typically come from the host are performed through Quartus® Prime System Console instead. For more information about the Quartus® Prime System Console, refer to "Analyzing and Debugging Designs with System Console" in Quartus® Prime Pro Edition User Guide: Debug Tools.

    Use the System Console to complete the following steps:
    1. Store input features in the FPGA on-chip memory.
    2. Prime the FPGA AI Suite IP registers for inference.
    3. Configure an ingress Modular Scatter-Gather DMA (mSGDMA) core to read the input features from on-chip memory and stream data into the FPGA AI Suite IP.
    4. Configure an egress mSGDMA core to stream data from the FPGA AI Suite IP into on-chip memory.
    5. Read the inference results from on-chip memory.

    The design example provides a System Console script to automate these operations for you. You can find the script in the $COREDLA_ROOT/runtime/stream/ed0_streaming_example folder.

    To use the design example System Console script:
    1. Run the following commands:
      cd <quartus-install-path>
      
      system-console --script=system_console_script.tcl <path-to-img.bin> \
                     <#-of-inferences> <output-channels> <output-height> \
                     <output-width>

      The design example Quartus® Prime System Console script generates a file called output.bin that contains the raw inference results.

    2. (Optional) To measure the performance of the design example, run the following commands:
      cd <quartus-install-path>
      
      system-console --script=system_console_perf.tcl <path-to-img.bin>
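
The output-channels, output-height, and output-width arguments of system_console_script.tcl describe the shape of the inference result. The sketch below shows the arithmetic that relates those arguments to the number of result values. The helper name `output_element_count` is hypothetical, the 1000 x 1 x 1 shape is an assumed shape for a 1000-class classifier, and the 2-byte element size assumes FP16 results. The raw output.bin file may be larger than this estimate because it can contain bytes that are removed during postprocessing.

```python
def output_element_count(channels, height, width, inferences=1):
    """Number of result values implied by the output shape arguments."""
    for name, value in (("channels", channels), ("height", height),
                        ("width", width), ("inferences", inferences)):
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")
    return channels * height * width * inferences

# Assumed shape for a 1000-class classifier output: 1000 x 1 x 1.
count = output_element_count(1000, 1, 1)
fp16_bytes = count * 2  # each FP16 value occupies 2 bytes
```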
  6. Postprocess the raw inference output for readability with the following command:
    python3 $COREDLA_ROOT/bin/streaming_post_processing.py <path-to-output.bin>

    This script cleans the raw output binary file by removing some invalid bytes, and it stores the result in an FP16-formatted result_hw.txt file for readability.
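
To illustrate what FP16-formatted results look like at the byte level, the sketch below decodes half-precision values from a byte buffer and ranks them. It does not reproduce streaming_post_processing.py: it assumes a buffer that is already free of invalid bytes and that stores little-endian IEEE 754 half-precision values, neither of which this guide specifies for output.bin. The names `decode_fp16` and `top_k` are hypothetical.

```python
import struct

def decode_fp16(raw):
    """Decode little-endian IEEE 754 half-precision values from bytes."""
    if len(raw) % 2:
        raise ValueError("FP16 data must contain an even number of bytes")
    return list(struct.unpack(f"<{len(raw) // 2}e", raw))

def top_k(values, k=5):
    """Return (index, value) pairs for the k largest values, largest first."""
    ranked = sorted(enumerate(values), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Illustrative buffer: the FP16 encodings of 0.5, 2.0, and -1.0.
sample = struct.pack("<3e", 0.5, 2.0, -1.0)
best = top_k(decode_fp16(sample), k=2)
```

For a classification model, the index in each pair corresponds to a class ID under these assumptions.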