Performance Data for Intel® AI Data Center Products
Find the latest AI benchmark performance data for Intel® Data Center products, including detailed hardware and software configurations.
Pretrained models, sample scripts, best practices, and tutorials:
- Intel® Developer Cloud
- Intel® AI Reference Models and Jupyter Notebooks*
- AI-Optimized CPU Containers from Intel
- AI-Optimized GPU Containers from Intel
- Open Model Zoo for OpenVINO™ toolkit
- Jupyter Notebook tutorials for OpenVINO™
- AI Performance Debugging on Intel® CPUs
Measurements were taken using the following software stacks (a minimal usage sketch follows this list):
- PyTorch* Optimizations from Intel
- TensorFlow* Optimizations from Intel
- Intel® Distribution of OpenVINO™ Toolkit
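As a rough illustration of how these stacks are exercised, the sketch below shows bf16 inference through Intel® Extension for PyTorch*. It is a minimal example, not the reference harness behind the tables below; the torchvision ResNet50 model and input shape are stand-ins.

```python
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex  # PyTorch* optimizations from Intel

# Stand-in workload: ResNet50-style image recognition at batch size 1.
model = models.resnet50(weights=None).eval()
model = ipex.optimize(model, dtype=torch.bfloat16)  # oneDNN/AMX-optimized kernels

x = torch.randn(1, 3, 224, 224)
with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    model = torch.jit.trace(model, x)   # optional TorchScript pass to cut overhead
    model = torch.jit.freeze(model)
    out = model(x)
```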
5th Generation Intel® Xeon® Scalable Processors
Intel® Xeon® Platinum 8592+ Processor (64 Cores)
Inference
Framework Version | Model | Usage | Precision | Throughput | Perf/Watt | Latency (ms) | Batch size |
---|---|---|---|---|---|---|---|
Intel PyTorch 2.1 DeepSpeed | GPT-J 6B, token size 1024/128 | Text generation, beam search, width=4 | int8 | | | 35 | 1 |
Intel PyTorch 2.1 DeepSpeed | GPT-J 6B, token size 1024/128 | Text generation, beam search, width=4 | int8 | 173 tokens/s | | 92.5 | 8 |
Intel PyTorch 2.1 DeepSpeed | GPT-J 6B, token size 1024/128 | Text generation, beam search, width=4 | bf16 | | | 52.5 | 1 |
Intel PyTorch 2.1 DeepSpeed | GPT-J 6B, token size 1024/128 | Text generation, beam search, width=4 | bf16 | | | 98.5 | 8 |
Intel PyTorch 2.2 MLPerf v4.0 | GPT-J (MLPerf v4.0, offline, 99.0% acc) | CNN-DailyMail News Text Summarization (input 13,368) | int4 | 3.61 samp/s | | | 8 |
Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B, token size 1024/128 | Text generation, beam search, width=4 | int8 | | | 41.5 | 1 |
Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B, token size 1024/128 | Text generation, beam search, width=4 | int8 | 149.5 tokens/s | | 107 | 8 |
Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B, token size 1024/128 | Text generation, beam search, width=4 | bf16 | | | 59.5 | 1 |
Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B, token size 1024/128 | Text generation, beam search, width=4 | bf16 | 142.2 tokens/s | | 112.5 | 8 |
OpenVINO 2023.2 | LLaMA2-7B, token size 32/512 | GenAI_chat | int4 | 11.3 tokens/s | | 88.44 | 1 |
OpenVINO 2023.2 | LLaMA2-7B, token size 32/512 | GenAI_chat | int8 | 13.5 tokens/s | | 73.74 | 1 |
OpenVINO 2023.2 | LLaMA2-7B, token size 32/512 | GenAI_chat | fp32 | 11.3 tokens/s | | 88.39 | 1 |
OpenVINO 2023.2 | LLaMA2-7B, token size 80/512 | GenAI_chat | int4 | 11.4 tokens/s | | 87.17 | 1 |
OpenVINO 2023.2 | LLaMA2-7B, token size 80/512 | GenAI_chat | int8 | 13.6 tokens/s | | 73.09 | 1 |
OpenVINO 2023.2 | LLaMA2-7B, token size 80/512 | GenAI_chat | fp32 | 11.2 tokens/s | | 89.00 | 1 |
OpenVINO 2023.2 | LLaMA2-7B, token size 142/512 | GenAI_chat | int4 | 11.5 tokens/s | | 86.63 | 1 |
OpenVINO 2023.2 | LLaMA2-7B, token size 142/512 | GenAI_chat | int8 | 13.3 tokens/s | | 75.15 | 1 |
OpenVINO 2023.2 | LLaMA2-7B, token size 142/512 | GenAI_chat | fp32 | 11.1 tokens/s | | 89.73 | 1 |
OpenVINO 2023.2 | Stable Diffusion 2.1, 20 steps, 64 prompts | GenAI_text_image | int8 | 0.24 img/s | | 4,160 | 1 |
MLPerf Inference v4.0 | Stable Diffusion XL (offline) | Image Generation | bf16 | 0.19 samp/s | | | 8 |
MLPerf Inference v4.0 | ResNet50 v1.5 (offline) | Image Recognition | int8 | 25,289.6 samp/s | | | 256 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | int8 | 12,862.56 img/s | 13.23 | | 1 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | int8 | 19,386.47 img/s | 19.21 | | 64 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf16 | 8,211.8 img/s | 8.13 | | 1 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf16 | 10,187.87 img/s | 10.82 | | 64 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | fp32 | 1,773.68 img/s | 1.74 | | 1 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | fp32 | 1,703.77 img/s | 1.57 | | 64 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf32 | 2,431.26 img/s | 2.40 | | 1 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf32 | 2,686.97 img/s | 2.67 | | 64 |
Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | int8 | 9,726.18 img/s | 9.67 | | 1 |
Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | int8 | 16,036.8 img/s | 17.01 | | 32 |
Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf16 | 6,782.09 img/s | 7.04 | | 1 |
Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf16 | 9,312.72 img/s | 9.40 | | 32 |
Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | fp32 | 1,560.99 img/s | 1.45 | | 1 |
Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | fp32 | 1,663.44 img/s | 1.57 | | 32 |
Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf32 | 2,013.88 img/s | 1.84 | | 1 |
Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf32 | 2,874.29 img/s | 2.73 | | 32 |
OpenVINO 2023.2 | ResNet50 v1.5 | Image Recognition | int8 | 18,674.37 img/s | 26.68 | | 1 |
OpenVINO 2023.2 | ResNet50 v1.5 | Image Recognition | bf16 | 11,537.06 img/s | 16.48 | | 1 |
OpenVINO 2023.2 | ResNet50 v1.5 | Image Recognition | fp32 | 1,721.58 img/s | 2.46 | | 1 |
MLPerf Inference v4.0 | BERT-Large (offline, 99.0% acc) | Natural Language Processing | int8 | 1,668.5 samp/s | | | 1,300 |
Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | int8 | 411.14 sent/s | 0.42 | | 1 |
Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | int8 | 455.33 sent/s | 0.45 | | 16 |
Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | bf16 | 243.89 sent/s | 0.24 | | 1 |
Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | bf16 | 278.00 sent/s | 0.25 | | 44 |
Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | fp32 | 44.56 sent/s | 0.04 | | 1 |
Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | fp32 | 50.49 sent/s | 0.05 | | 16 |
Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | bf32 | 98.49 sent/s | 0.09 | | 1 |
Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | bf32 | 96.98 sent/s | 0.09 | | 16 |
Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | int8 | 323.58 sent/s | 0.32 | | 1 |
Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | int8 | 324.56 sent/s | 0.33 | | 12 |
Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | bf16 | 224.04 sent/s | 0.22 | | 1 |
Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | bf16 | 231.37 sent/s | 0.23 | | 28 |
Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | fp32 | 55.34 sent/s | 0.05 | | 1 |
Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | fp32 | 48.46 sent/s | 0.05 | | 12 |
Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | bf32 | 101.93 sent/s | 0.10 | | 1 |
Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | bf32 | 98.81 sent/s | 0.10 | | 12 |
OpenVINO 2023.2 | BERT-Large | Natural Language Processing | int8 | 373.69 sent/s | 0.37 | | 1 |
OpenVINO 2023.2 | BERT-Large | Natural Language Processing | int8 | 388.25 sent/s | 0.39 | | 32 |
OpenVINO 2023.2 | BERT-Large | Natural Language Processing | bf16 | 244.25 sent/s | 0.24 | | 1 |
OpenVINO 2023.2 | BERT-Large | Natural Language Processing | bf16 | 281.79 sent/s | 0.27 | | 40 |
OpenVINO 2023.2 | BERT-Large | Natural Language Processing | fp32 | 57.17 sent/s | 0.06 | | 1 |
OpenVINO 2023.2 | BERT-Large | Natural Language Processing | fp32 | 55.67 sent/s | 0.05 | | 16 |
Intel PyTorch 2.1 | DLRM, Criteo Terabyte | Recommender | int8 | 23,444,587 rec/s | 23,611.92 | | 128 |
Intel PyTorch 2.1 | DLRM, Criteo Terabyte | Recommender | bf16 | 13,223,343 rec/s | 12,742.32 | | 128 |
Intel PyTorch 2.1 | DLRM, Criteo Terabyte | Recommender | fp32 | 2,742,037 rec/s | 2,615.42 | | 128 |
Intel PyTorch 2.1 | DLRM, Criteo Terabyte | Recommender | bf32 | 6,760,005 rec/s | 6,699.18 | | 128 |
MLPerf Inference v4.0 | DLRM-v2 (offline, 99.9% acc) | Recommender | int8 | 9,111.08 samp/s | | | 400 |
Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | int8 | 6,380.26 sent/s | 6.80 | | 1 |
Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | int8 | 10,701.44 sent/s | 11.39 | | 104 |
Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf16 | 4,651.69 sent/s | 4.97 | | 1 |
Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf16 | 6,864.75 sent/s | 7.23 | | 88 |
Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | fp32 | 1,121.45 sent/s | 1.12 | | 1 |
Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | fp32 | 1,205.86 sent/s | 1.27 | | 32 |
Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf32 | 2,161.93 sent/s | 2.15 | | 1 |
Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf32 | 2,584.98 sent/s | 2.63 | | 56 |
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | int8 | 77.94 sent/s | 0.07 | | 1 |
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | int8 | 334.65 sent/s | 0.31 | | 448 |
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf16 | 52 sent/s | 0.05 | | 1 |
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf16 | 367.07 sent/s | 0.35 | | 448 |
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | fp32 | 1,099.6 sent/s | 26.53 | | 1 |
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | fp32 | 137.37 sent/s | 0.12 | | 448 |
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf32 | 24.86 sent/s | 0.02 | | 1 |
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf32 | 155.04 sent/s | 0.14 | | 448 |
OpenVINO 2023.2 | 3D-UNet | Image Segmentation | int8 | 30.31 samp/s | 0.03 | | 1 |
OpenVINO 2023.2 | 3D-UNet | Image Segmentation | int8 | 27.18 samp/s | 0.02 | | 6 |
OpenVINO 2023.2 | 3D-UNet | Image Segmentation | bf16 | 15.68 samp/s | 0.01 | | 1 |
OpenVINO 2023.2 | 3D-UNet | Image Segmentation | bf16 | 3.18 samp/s | 0.00 | | 7 |
OpenVINO 2023.2 | 3D-UNet | Image Segmentation | fp32 | 3.49 samp/s | 0.00 | | 1 |
OpenVINO 2023.2 | 3D-UNet | Image Segmentation | fp32 | 14.40 samp/s | 0.01 | | 3 |
OpenVINO 2023.2 | SSD-ResNet34, COCO 2017 (1200x1200) | Object Detection | int8 | 590.23 img/s | 0.57 | | 1 |
OpenVINO 2023.2 | SSD-ResNet34, COCO 2017 (1200x1200) | Object Detection | bf16 | 297.79 img/s | 0.28 | | 1 |
OpenVINO 2023.2 | SSD-ResNet34, COCO 2017 (1200x1200) | Object Detection | fp32 | 36.92 img/s | 0.04 | | 1 |
Intel PyTorch 2.1 | ResNeXt101 32x16d, ImageNet | Image Classification | int8 | 1,679.87 fps | 1.73 | | 1 |
Intel PyTorch 2.1 | ResNeXt101 32x16d, ImageNet | Image Classification | int8 | 2,481.66 fps | 2.56 | | 58 |
Intel PyTorch 2.1 | ResNeXt101 32x16d, ImageNet | Image Classification | bf16 | 802.44 fps | 0.80 | | 1 |
Intel PyTorch 2.1 | ResNeXt101 32x16d, ImageNet | Image Classification | bf16 | 1,175.18 fps | 1.10 | | 72 |
Intel PyTorch 2.1 | ResNeXt101 32x16d, ImageNet | Image Classification | fp32 | 186.33 fps | 0.19 | | 1 |
Intel PyTorch 2.1 | ResNeXt101 32x16d, ImageNet | Image Classification | fp32 | 202.33 fps | 0.19 | | 40 |
Intel PyTorch 2.1 | ResNeXt101 32x16d, ImageNet | Image Classification | bf32 | 279.07 fps | 0.28 | | 1 |
Intel PyTorch 2.1 | ResNeXt101 32x16d, ImageNet | Image Classification | bf32 | 320.62 fps | 0.29 | | 58 |
OpenVINO 2023.2 | YOLOv8n | Object Detection | int8 | 3,513.54 img/s | | | 1 |
OpenVINO 2023.2 | YOLOv8n | Object Detection | bf16 | 3,632.55 img/s | | | 1 |
OpenVINO 2023.2 | YOLOv8n | Object Detection | fp32 | 1,249.91 img/s | | | 1 |
MLPerf Inference v4.0 | RetinaNet (offline) | Object Detection | int8 | 371.08 samp/s | | | 2 |
MLPerf Inference v4.0 | RNN-T (offline) | Speech-to-Text | int8+bf16 | 8,679.48 samp/s | | | 256 |
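For the OpenVINO™ rows, runs of this kind are typically driven through the runtime's performance hints rather than hand-tuned stream counts. A minimal sketch with the 2023.x Python API follows; model.xml is a hypothetical path to an already-converted IR model, and the input shape is illustrative.

```python
import numpy as np
import openvino as ov  # OpenVINO 2023.x Python API

core = ov.Core()
model = core.read_model("model.xml")  # hypothetical path to a converted IR model

# THROUGHPUT lets the runtime choose stream/batch settings for the CPU;
# the precision hint requests bf16 execution where Intel AMX supports it.
compiled = core.compile_model(
    model, "CPU",
    {"PERFORMANCE_HINT": "THROUGHPUT", "INFERENCE_PRECISION_HINT": "bf16"},
)

request = compiled.create_infer_request()
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = request.infer({0: x})  # inputs keyed by port index
```

The bundled benchmark_app tool exposes the same knobs through flags such as -hint throughput and -infer_precision bf16.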
Training
Transfer Learning / Fine-Tuning
Framework Version | Model | Usage | Precision | Time to Train (minutes) | Accuracy (%) | Batch size | Ranks |
---|---|---|---|---|---|---|---|
Transformers 4.31, Intel® Extension for PyTorch* 2.0.1, PEFT 0.4.0 | GPT-J 6B (GLUE MNLI dataset) | Fine-tuning, text generation | bf16 | 184.20 | 82.2 | 8 | 1 |
Transformers 4.34.1, Intel PyTorch 2.1.0, PEFT 0.5.0, Intel® oneCCL v2.1.0 | BioGPT 1.5B (PubMedQA dataset) | Response generation | bf16 | 39.80 | 79.4 | 8 | 8 |
Intel TensorFlow 2.14, Horovod 0.28, Open MPI 4.1.2, Python 3.10.0 | ResNet50 v1.5 (Colorectal histology dataset) | Colorectal cancer detection | fp32 | 6.98 | 94.1 | 32 | 64 |
Intel TensorFlow 2.14, Horovod 0.28, Open MPI 4.1.2, Python 3.10.0 | ResNet50 v1.5 (Colorectal histology dataset) | Colorectal cancer detection | bf16 | 4.08 | 94.9 | 32 | 64 |
Intel TensorFlow 2.14, Horovod 0.28, Open MPI 4.1.2, Python 3.10.0 | ResNet50 v1.5 (Colorectal histology dataset) | Colorectal cancer detection | fp32 | 5.34 | 94.1 | 32 | 128 |
Intel TensorFlow 2.14, Horovod 0.28, Open MPI 4.1.2, Python 3.10.0 | ResNet50 v1.5 (Colorectal histology dataset) | Colorectal cancer detection | bf16 | 2.90 | 94.9 | 32 | 128 |
Transformers 4.35.0, Intel PyTorch 2.0.100, Intel® oneCCL 2.0.100 | BERT-Large Uncased (IMDb dataset) | Sentiment Analysis | fp32 | 47.95 | 93.84 | 64 | 4 |
Transformers 4.35.0, Intel PyTorch 2.0.100, Intel® oneCCL 2.0.100 | BERT-Large Uncased (IMDb dataset) | Sentiment Analysis | bf16 | 15.96 | 93.8 | 64 | 4 |
Transformers 4.35.0, Intel PyTorch 2.0.100, Intel® oneCCL 2.0.100 | BERT-Large Uncased (GLUE SST-2 dataset) | Sentiment Analysis | fp32 | 10.48 | 92.2 | 256 | 4 |
Transformers 4.35.0, Intel PyTorch 2.0.100, Intel® oneCCL 2.0.100 | BERT-Large Uncased (GLUE SST-2 dataset) | Sentiment Analysis | bf16 | 2.93 | 92.09 | 256 | 4 |
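The GPT-J and BioGPT rows above rely on parameter-efficient fine-tuning through the PEFT library. A minimal sketch of that general pattern follows; the model ID, LoRA rank, and task type are illustrative defaults, not the exact recipe behind the table.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "EleutherAI/gpt-j-6b"  # illustrative; matches the first row's model family
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# LoRA trains small low-rank adapter matrices instead of all 6B base weights,
# which is what makes CPU fine-tuning times like those above practical.
config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```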
Training Throughput
Framework Version | Model/Dataset | Usage | Precision | Throughput | Perf/Watt | Batch size |
---|---|---|---|---|---|---|
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | fp32 | 175.29 img/s | 0.22 | 128 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf16 | 396.24 img/s | 0.52 | 256 |
Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf32 | 197.14 img/s | 0.25 | 128 |
Intel TensorFlow 2.14 | ResNet50 v1.5, ImageNet (224x224) | Image Recognition | fp32 | 145.93 img/s | 0.19 | 512
Intel TensorFlow 2.14 | ResNet50 v1.5, ImageNet (224x224) | Image Recognition | bf16 | 354.45 img/s | 0.46 | 512
Intel TensorFlow 2.14 | ResNet50 v1.5, ImageNet (224x224) | Image Recognition | bf32 | 166.37 img/s | 0.21 | 512
Intel PyTorch 2.1 | DLRM, Criteo Terabyte (QUAD mode) | Recommender | fp32 | 290,772.24 rec/s | 359.83 | 32,768
Intel PyTorch 2.1 | DLRM, Criteo Terabyte (QUAD mode) | Recommender | bf16 | 862,286.46 rec/s | 1,055.35 | 32,768
Intel PyTorch 2.1 | DLRM, Criteo Terabyte (QUAD mode) | Recommender | bf32 | 417,584.33 rec/s | 504.29 | 32,768
Intel TensorFlow 2.14 | SSD-ResNet34, COCO 2017 (1200x1200) | Object Detection | fp32 | 61.25 img/s | 0.09 | 448
Intel TensorFlow 2.14 | SSD-ResNet34, COCO 2017 (1200x1200) | Object Detection | bf16 | 219.77 img/s | 0.31 | 448
Intel TensorFlow 2.14 | SSD-ResNet34, COCO 2017 (1200x1200) | Object Detection | bf32 | 83.44 img/s | 0.11 | 448
Intel PyTorch 2.1 | RNN-T, LibriSpeech | Speech Recognition | fp32 | 4.35 fps | 0.01 | 64
Intel PyTorch 2.1 | RNN-T, LibriSpeech | Speech Recognition | bf16 | 35.13 fps | 0.04 | 64
Intel PyTorch 2.1 | RNN-T, LibriSpeech | Speech Recognition | bf32 | 13.65 fps | 0.02 | 32
Intel PyTorch 2.1 | Mask R-CNN, COCO 2017 | Object Detection | fp32 | 4.8 img/s | 0.01 | 128
Intel PyTorch 2.1 | Mask R-CNN, COCO 2017 | Object Detection | bf16 | 16.43 img/s | 0.02 | 128
Intel PyTorch 2.1 | Mask R-CNN, COCO 2017 | Object Detection | bf32 | 5.37 img/s | 0.01 | 96
Intel PyTorch 2.1 | BERT-Large, Wikipedia 2020/01/01, seq len 512 | Natural Language Processing | fp32 | 4.41 sent/s | 0.01 | 64
Intel PyTorch 2.1 | BERT-Large, Wikipedia 2020/01/01, seq len 512 | Natural Language Processing | bf16 | 12.53 sent/s | 0.02 | 28
Intel PyTorch 2.1 | BERT-Large, Wikipedia 2020/01/01, seq len 512 | Natural Language Processing | bf32 | 5.52 sent/s | 0.01 | 56
Intel TensorFlow 2.14 | BERT-Large, Wikipedia 2020/01/01, seq len 512 | Natural Language Processing | fp32 | 5.38 sent/s | 0.01 | 64
Intel TensorFlow 2.14 | BERT-Large, Wikipedia 2020/01/01, seq len 512 | Natural Language Processing | bf16 | 11.74 sent/s | 0.02 | 64
Intel TensorFlow 2.14 | BERT-Large, Wikipedia 2020/01/01, seq len 512 | Natural Language Processing | bf32 | 6.07 sent/s | 0.01 | 64
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | fp32 | 15,671.55 sent/s | 16.95 | 42,000
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf16 | 40,653.1 sent/s | 43.77 | 42,000
Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf32 | 15,316.08 sent/s | 15.44 | 42,000
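The bf16 rows in this table come from mixed-precision training on Intel® AMX. A hedged sketch of the single-socket PyTorch pattern follows; the toy model, data, and learning rate are placeholders rather than the benchmark harness itself.

```python
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

model = models.resnet50(weights=None).train()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Prepare both model and optimizer for bf16 training on oneDNN/AMX kernels.
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

x = torch.randn(128, 3, 224, 224)        # batch size 128, as in the fp32 row above
y = torch.randint(0, 1000, (128,))

with torch.cpu.amp.autocast(dtype=torch.bfloat16):
    loss = criterion(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```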
Hardware and software configuration (measured October 24, 2023):
Deep learning configuration:
- Hardware configuration for Intel® Xeon® Platinum 8592+ processor (formerly code-named Emerald Rapids): 2 sockets for inference, 1 socket for training, 64 cores per socket, 350 watts, 1,024 GB (16 x 64 GB) DDR5-5600 MT/s memory, operating system CentOS* Stream 9. Measured using Intel® Advanced Matrix Extensions (Intel® AMX) int8 and bf16 with Intel® oneAPI Deep Neural Network Library (oneDNN) optimized kernels integrated into Intel® Extension for PyTorch*, Intel® Extension for TensorFlow*, and Intel® Distribution of OpenVINO™ toolkit. Measurements may vary. If a dataset is not listed, a synthetic dataset was used to measure performance. A sketch for verifying that the AMX kernels are actually dispatched follows.
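Whether the AMX-backed oneDNN kernels are dispatched can be confirmed with oneDNN's verbose mode. The sketch below is a minimal check; the exact implementation names printed vary with the oneDNN version, but AMX kernels report an ISA tag such as avx512_core_amx.

```python
import os
os.environ["ONEDNN_VERBOSE"] = "1"  # must be set before the first kernel runs

import torch  # imported after the env var so verbose mode takes effect

# Any bf16 matmul prints one verbose line per oneDNN primitive; on AMX-capable
# CPUs the implementation field includes an AMX ISA tag (e.g. avx512_core_amx).
a = torch.randn(256, 256, dtype=torch.bfloat16)
b = torch.randn(256, 256, dtype=torch.bfloat16)
torch.mm(a, b)
```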
Transfer learning configuration:
- Hardware configuration for Intel® Xeon® Platinum 8592+ processor (formerly code-named Emerald Rapids): 2 sockets, 64 cores per socket, 350 watts, 16 x 64 GB DDR5-5600 MT/s memory, BIOS version 3B05.TEL4P1, operating system CentOS* Stream 8. Measured using Intel® Advanced Matrix Extensions (Intel® AMX) int8 and bf16 with Intel® oneAPI Deep Neural Network Library (oneDNN) v2.6.0 optimized kernels integrated into Intel® Extension for PyTorch* v2.0.1 and Intel® Extension for TensorFlow* v2.14; Intel® oneAPI Data Analytics Library (oneDAL) 2023.1 optimized kernels integrated into Intel® Extension for Scikit-learn* v2023.1; Intel® Distribution of Modin* v2.1.1; and Intel® oneAPI Math Kernel Library (oneMKL) v2023.1. Measurements may vary.
MLPerf* configuration:
- Hardware configuration for MLPerf* Inference v4.0 measurements on Intel® Xeon® Platinum 8592+ processor (formerly code-named Emerald Rapids): 2 sockets for inference, 64 cores per socket, 350 watts, 1,024 GB (16 x 64 GB) DDR5-5600 MT/s memory, operating system CentOS* Stream 8. Measured using Intel® Advanced Matrix Extensions (Intel® AMX) int4, int8, and bf16 with Intel® oneAPI Deep Neural Network Library (oneDNN) optimized kernels integrated into Intel® Extension for PyTorch*. Measurements may vary. The model specifications and datasets used for MLPerf workloads are specified by MLCommons and viewable at MLPerf Inference: Datacenter Benchmark Suite Results.