8. Streaming-to-Memory (S2M) Streaming Demonstration
A typical use case of the Intel® FPGA AI Suite IP is to run inferences on live input data. For instance, live data can come from a video source such as an HDMI IP core and stream to the Intel® FPGA AI Suite IP to perform image classification on each frame.
The demonstration uses two applications:
- streaming_inference_app
  This application loads and runs a network and captures the results.
- image_streaming_app
  This application loads bitmap files from a folder on the SD card and continuously sends the images to the EMIF, simulating a running video source.
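The looping behavior of image_streaming_app can be illustrated with a minimal sketch. It is not the application's source: the function names, the `send` callback (a stand-in for the write to the EMIF), and the `max_frames` bound are assumptions made so that the example terminates, whereas the real application sends images continuously.

```python
import itertools
import tempfile
from pathlib import Path


def list_bitmaps(folder):
    """Return the .bmp files in `folder`, sorted by name for a stable order."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".bmp")


def stream_frames(folder, send, max_frames):
    """Cycle through the bitmaps and pass each file's name and bytes to `send`.

    `send` is a hypothetical stand-in for the transfer to the EMIF.
    """
    files = list_bitmaps(folder)
    if not files:
        raise FileNotFoundError(f"no .bmp files in {folder}")
    # cycle() restarts from the first file after the last one, like a video loop.
    for path in itertools.islice(itertools.cycle(files), max_frames):
        send(path.name, path.read_bytes())


def demo():
    """Stream five frames from a temporary folder holding two bitmaps."""
    sent = []
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("b.bmp", "a.bmp", "notes.txt"):
            Path(tmp, name).write_bytes(b"BM" + name.encode())
        stream_frames(tmp, lambda name, data: sent.append(name), max_frames=5)
    return sent
```

Calling `demo()` shows that non-bitmap files are skipped and that the sequence wraps around once every image has been sent.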
A module called the stream controller runs on a Nios® V microcontroller and controls the scheduling of the source images to the Intel® FPGA AI Suite IP.
The streaming_inference_app application creates OpenVINO™ inference requests. Each inference request is allocated memory on the EMIF for input and output buffers. This information is sent to the stream controller when the inference requests are submitted for asynchronous execution.
In its running state, the stream controller waits for input buffers to arrive from the image_streaming_app application. When a buffer arrives, the stream controller programs the Intel® FPGA AI Suite IP with the details of the received input buffer, which triggers the Intel® FPGA AI Suite IP to run an inference.
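The scheduling described above can be modeled as a queue of submitted inference requests that is matched against arriving frames. The following is a simplified host-side model, not the Nios® V firmware: the class names, the address fields, and the `start_inference` callback are hypothetical, and dropping a frame when no request is queued is an assumption of this sketch rather than documented behavior.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class InferenceRequest:
    request_id: int
    input_addr: int   # hypothetical EMIF address of the input buffer
    output_addr: int  # hypothetical EMIF address of the output buffer


class StreamController:
    """Toy model: pairs each arriving frame with the oldest queued request."""

    def __init__(self, start_inference):
        self.pending = deque()
        # Stand-in for programming the IP with the buffer details.
        self.start_inference = start_inference

    def submit(self, request):
        """Record the buffers of a request submitted for asynchronous execution."""
        self.pending.append(request)

    def on_input_buffer(self, frame):
        """Handle one arriving frame. Returns True if an inference was started."""
        if not self.pending:
            return False  # assumption: with no request waiting, the frame is dropped
        request = self.pending.popleft()
        self.start_inference(request, frame)
        return True


def demo():
    started = []
    ctrl = StreamController(lambda req, frame: started.append((req.request_id, frame)))
    ctrl.submit(InferenceRequest(0, 0x1000, 0x2000))
    ctrl.submit(InferenceRequest(1, 0x3000, 0x4000))
    results = [ctrl.on_input_buffer(f) for f in ("frame0", "frame1", "frame2")]
    return started, results
```

In `demo()`, two requests are queued and three frames arrive, so the first two frames start inferences and the third finds no request waiting.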
When an inference is complete, a completion count register is incremented within the Intel® FPGA AI Suite IP CSRs. The currently executing inference request in the streaming_inference_app application monitors this counter, and the request is marked as complete when the increment is detected. The output buffer is then fetched from the EMIF, which completes the Intel® FPGA AI Suite IP portion of the inference.
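Detecting the increment amounts to polling the counter until it differs from the value read before the inference started. The sketch below models this with a thread-safe counter object in place of the real CSR. The class, the function names, the timeout, and the poll interval are all assumptions for illustration. The comparison uses inequality rather than "greater than" so that the check would still work if a fixed-width counter wrapped around.

```python
import threading
import time


class CompletionCounter:
    """Stand-in for the completion count register (not a real CSR access)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def increment(self):
        with self._lock:
            self._value += 1


def wait_for_completion(counter, baseline, timeout_s=1.0, poll_s=0.001):
    """Poll until the counter differs from `baseline`; False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if counter.read() != baseline:
            return True
        time.sleep(poll_s)
    return counter.read() != baseline


def demo():
    # Case 1: the "IP" finishes after 10 ms, so the wait succeeds.
    done = CompletionCounter()
    timer = threading.Timer(0.01, done.increment)
    timer.start()
    completed = wait_for_completion(done, baseline=done.read())
    timer.join()
    # Case 2: nothing increments the counter, so the wait times out.
    idle = CompletionCounter()
    timed_out = wait_for_completion(idle, baseline=0, timeout_s=0.02)
    return completed, timed_out
```

Capturing the baseline before the inference is triggered matters: reading it afterwards could miss an increment that has already happened.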
Depending on the model used, there might be further processing of the output by the OpenVINO™ HETERO plugin and OpenVINO™ Arm* CPU plugin. After the complete network has finished processing, a callback is made to the application to indicate the inference is complete.
The application postprocesses the buffer to generate the results and then resubmits the same inference request to OpenVINO™, which lets the stream controller reuse the same input and output memory block.
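The effect of resubmitting the same request is that a small, fixed pool of requests (and their buffers) serves an unbounded stream of frames. The sketch below illustrates this under stated assumptions: it does not use the OpenVINO™ API, the function names are hypothetical, and top-1 selection is only one plausible postprocessing step for an image classification network.

```python
from collections import deque


def top1(scores):
    """Index of the highest score: a simple classification postprocess."""
    return max(range(len(scores)), key=lambda i: scores[i])


def run_pool(num_requests, frame_outputs):
    """Serve every frame output with a fixed pool of request ids.

    Returns a list of (request_id, predicted_class) in completion order.
    """
    queue = deque(range(num_requests))  # request ids waiting for a frame
    results = []

    def on_complete(request_id, scores):
        # Completion callback: postprocess, then resubmit the same request
        # so that its input/output buffers are reused for a later frame.
        results.append((request_id, top1(scores)))
        queue.append(request_id)

    for scores in frame_outputs:
        request_id = queue.popleft()
        on_complete(request_id, scores)
    return results


def demo():
    outputs = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
    return run_pool(2, outputs)
```

With two requests and three frames, `demo()` shows request 0 being used again for the third frame once its first result has been processed.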