FPGA AI Inference Development Flow
The development flow combines the hardware and software workflows into a single end-to-end AI workflow. The steps are as follows:
1. The Model Optimizer in the OpenVINO toolkit converts a trained model into an intermediate representation (IR): a network topology file (.xml) and a weights-and-biases file (.bin). (See the first sketch after this list.)
2. The Intel FPGA AI Suite compiler is used to:
- Provide estimated area or performance metrics for a given architecture file, or produce an optimized architecture file. (Architecture refers to inference IP parameters such as the size of the PE array, precisions, activation functions, interface widths, and window sizes.)
- Compile the network files into a .bin file containing the network partitions for the FPGA, the CPU, or both, along with the weights and biases. (See the second sketch after this list.)
3. The compiled .bin file is imported by the user inference application at runtime. (See the third sketch after this list.)
- Runtime application programming interfaces (APIs) include the Inference Engine API (partitions the network between CPU and FPGA at runtime and schedules inference) and the FPGA AI Suite runtime API (manages DDR memory and the FPGA hardware blocks).
- Reference designs demonstrate the basic operations of importing the .bin file and running inference on the FPGA, with support for x86 and Arm host CPUs.
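
Step 1 can be driven from Python as well as from the `mo` command line. A minimal sketch, assuming an OpenVINO 2022.x installation; the input model name is a placeholder, while `convert_model` and `serialize` are the documented OpenVINO conversion APIs:

```python
# Step 1: produce the OpenVINO IR (.xml topology + .bin weights/biases).
# Assumes OpenVINO 2022.x; "resnet50.onnx" is a placeholder model path.
from openvino.tools.mo import convert_model
from openvino.runtime import serialize

ov_model = convert_model("resnet50.onnx")            # load and convert the trained model
serialize(ov_model, "resnet50.xml", "resnet50.bin")  # write the IR file pair
```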
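Step 2 is a command-line invocation of the Suite compiler. The sketch below shells out from Python; the tool name (`dla_compiler`) and the flags shown follow the Intel FPGA AI Suite documentation, but option spellings vary by release, so treat them as assumptions to verify against your installation:

```python
# Step 2: compile the IR against an architecture file into a runtime-loadable .bin.
# Tool name and flags are indicative only (verify against your Suite release);
# the architecture file name is an example. Area/performance estimation, the
# other compiler use described above, is requested with separate analysis flags.
import subprocess

subprocess.run(
    [
        "dla_compiler",
        "--march", "A10_Performance.arch",    # architecture file (example name)
        "--network-file", "resnet50.xml",     # IR topology from step 1
        "--foutput-format=open_vino_hetero",  # emit a .bin the runtime can import
        "--o", "resnet50_compiled.bin",       # compiled partitions + weights/biases
    ],
    check=True,  # raise if compilation fails
)
```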
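Step 3 imports the compiled .bin in the user application. A minimal sketch using the 2021/2022-era Inference Engine Python API (the reference designs make the equivalent C++ calls); the device name, file names, and input shape are placeholders:

```python
# Step 3: import the compiled network and run one inference on the FPGA;
# layers partitioned to the CPU are handled through the HETERO device.
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
exec_net = ie.import_network(model_file="resnet50_compiled.bin",
                             device_name="HETERO:FPGA,CPU")

input_name = next(iter(exec_net.input_info))          # first network input
frame = np.zeros((1, 3, 224, 224), dtype=np.float32)  # placeholder NCHW image
result = exec_net.infer({input_name: frame})          # blocking inference call
```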
Notes:
Devices supported: Agilex™ 7 FPGA, Cyclone® 10 GX FPGA, Arria® 10 FPGA
Tested networks, layers, and activation functions:
- ResNet-50, MobileNet v1/v2/v3, YOLO v3, TinyYOLO v3, U-Net, I3D
- 2D Conv, 3D Conv, Fully Connected, Softmax, BatchNorm, EltWise Mult, Clamp
- ReLU, PReLU