Optimized Intel® Reference Models for Intel® Data Center GPU Flex Series

ID 814741
Updated 12/3/2024
Version Latest
Public

Description 

This document links to step-by-step instructions for using reference model Docker containers to run optimized open-source deep learning inference workloads with Intel® Extension for PyTorch* and Intel® Extension for TensorFlow* on the Intel® Data Center GPU Flex Series.

Base Containers

AI Framework | Extension | Documentation
PyTorch | Intel® Extension for PyTorch* | Intel® Extension for PyTorch Container
TensorFlow | Intel® Extension for TensorFlow* | Intel® Extension for TensorFlow Container

Optimized Workloads

The table below links to instructions for running each workload in a Docker container. The containers were validated on a host running Linux*.

Model | Framework | Mode and Documentation
DistilBERT | PyTorch | FP16 and FP32 Inference
DLRM v1 | PyTorch | FP16 Inference
EfficientNet B0, B3, B4 | TensorFlow | FP16 Inference
EfficientNet | PyTorch | FP32, FP16, BF16 Inference
FastPitch | PyTorch | FP16 Inference
Mask R-CNN | TensorFlow | FP16 Inference
ResNet50 v1.5 | PyTorch | INT8 Inference
ResNet50 v1.5 | TensorFlow | INT8 Inference
SSD-MobileNet v1 | TensorFlow | INT8 Inference
Stable Diffusion | PyTorch | FP16 Inference
Stable Diffusion | TensorFlow | FP32, FP16 Inference
UNet++ | PyTorch | FP16 Inference
Swin Transformer | PyTorch | FP16 Inference
Wide and Deep | TensorFlow | FP16 Inference
YOLO v5 | PyTorch | FP16 Inference

Note:

  • The SSD-MobileNet v1 model is supported on the older Intel® Extension for TensorFlow* v2.12 and Intel® Extension for PyTorch* 1.13.120+xpu versions.
  • The other models in the list were validated on Intel® Extension for TensorFlow* v2.14 and Intel® Extension for PyTorch* 2.1.10+xpu versions.
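
The version rule in the note above can be written as a small lookup, which may be handy when scripting which container to pull for a given model. This is only a sketch: the version strings are copied from the note, while the function and constant names (validated_extension, LEGACY_MODELS, and so on) are hypothetical and are not part of any Intel tool or API.

```python
# Hypothetical helper: maps a model and framework to the extension version
# named in the note above. All names here are illustrative assumptions.

# Models the note lists as supported on the older extension versions.
LEGACY_MODELS = {"SSD-MobileNet v1"}

# Version strings copied from the note.
LEGACY_VERSIONS = {
    "TensorFlow": "Intel Extension for TensorFlow v2.12",
    "PyTorch": "Intel Extension for PyTorch 1.13.120+xpu",
}
CURRENT_VERSIONS = {
    "TensorFlow": "Intel Extension for TensorFlow v2.14",
    "PyTorch": "Intel Extension for PyTorch 2.1.10+xpu",
}


def validated_extension(model: str, framework: str) -> str:
    """Return the extension version the note associates with this model."""
    if framework not in CURRENT_VERSIONS:
        raise ValueError(f"unknown framework: {framework!r}")
    versions = LEGACY_VERSIONS if model in LEGACY_MODELS else CURRENT_VERSIONS
    return versions[framework]


if __name__ == "__main__":
    print(validated_extension("SSD-MobileNet v1", "TensorFlow"))
    print(validated_extension("YOLO v5", "PyTorch"))
```

The lookup does not check that a model and framework pair actually appears in the workload table; it only encodes the two version groups from the note.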