Loading a Model on a Vision Processing Unit (VPU) May Take Longer than Loading on a CPU
Content Type: Maintenance & Performance | Article ID: 000057722 | Last Reviewed: 01/29/2025
To reduce load time, load the model from a blob, which is an already-parsed graph, so that the model parsing stage is skipped.
There are two internal processes when loading a model on VPU:
During the loading process, the parsed VPU graphs are sent from the host to the hardware, stage by stage, over XLink.
Loading a model from a blob can significantly reduce load time for some models, but it may not work for all models.
Besides model size, the load time also depends on other factors, such as layer types and input data size.
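Because the benefit varies by model, it can help to measure both paths on the target device. The following is a minimal sketch, not an official procedure: it assumes the legacy Inference Engine Python API (`IECore.read_network`, `IECore.load_network`, `ExecutableNetwork.export`, `IECore.import_network`) from an OpenVINO™ toolkit release that still supports VPU devices. The helper names (`timed`, `load_from_ir`, `load_from_blob`, `compare_load_times`) and the file paths are hypothetical.

```python
import time


def timed(label, fn):
    """Run fn(), print the elapsed wall-clock time, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f} s")
    return result, elapsed


def load_from_ir(ie, xml_path, bin_path, device):
    """Read the IR files and load the network (parsing stage + loading stage)."""
    net = ie.read_network(model=xml_path, weights=bin_path)
    return ie.load_network(network=net, device_name=device)


def load_from_blob(ie, blob_path, device):
    """Import an already-parsed graph, which skips the parsing stage."""
    return ie.import_network(model_file=blob_path, device_name=device)


def compare_load_times(ie, xml_path, bin_path, blob_path, device="MYRIAD"):
    """Time an IR load, export the result as a blob, then time a blob load.

    `ie` is assumed to be an openvino.inference_engine.IECore instance.
    Returns (ir_seconds, blob_seconds).
    """
    exec_net, ir_seconds = timed(
        "Load from IR", lambda: load_from_ir(ie, xml_path, bin_path, device)
    )
    # Save the parsed graph so later runs can import it directly.
    exec_net.export(blob_path)
    _, blob_seconds = timed(
        "Load from blob", lambda: load_from_blob(ie, blob_path, device)
    )
    return ir_seconds, blob_seconds
```

With a matching toolkit installation, this could be called as `compare_load_times(IECore(), "model.xml", "model.bin", "model.blob")` after `from openvino.inference_engine import IECore`. Passing `device="HDDL"` instead of the default runs the same comparison through the HDDL plugin.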
The HDDL plugin is more efficient than the MYRIAD plugin when loading a model from a blob.
Follow these steps to enable the HDDL plugin instead of the MYRIAD plugin on the Intel® Neural Compute Stick 2: