| Framework Version | Model | Usage | Precision | Throughput | Perf/Watt | Latency (ms) | Batch size | Config* |
|---|---|---|---|---|---|---|---|---|
| Intel PyTorch 2.1 DeepSpeed | GPT-J 6B Token size 1024/128 | text-generation, Beam Search, Width=4 | int8 | | | 40 | 1 | 1 instance per socket |
| Intel PyTorch 2.1 DeepSpeed | GPT-J 6B Token size 1024/128 | text-generation, Beam Search, Width=4 | int8 | 130.4 tokens/s | | 92 | 6 | 1 instance per socket |
| Intel PyTorch 2.1 DeepSpeed | GPT-J 6B Token size 1024/128 | text-generation, Beam Search, Width=4 | bf16 | | | 59.5 | 1 | 1 instance per socket |
| Intel PyTorch 2.1 DeepSpeed | GPT-J 6B Token size 1024/128 | text-generation, Beam Search, Width=4 | bf16 | 125 tokens/s | | 96 | 6 | 1 instance per socket |
| MLPerf Inference v3.1 | GPT-J (offline, 99.0% acc) | Large Language Model | int8 | 2.05 samp/s | | | 7 | 4 cores per instance |
| Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B Token size 1024/128 | text-generation, Beam Search, Width=4 | int8 | | | 47 | 1 | 1 instance per socket |
| Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B Token size 1024/128 | text-generation, Beam Search, Width=4 | int8 | 111.6 tokens/s | | 107.5 | 6 | 1 instance per socket |
| Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B Token size 1024/128 | text-generation, Beam Search, Width=4 | bf16 | | | 68 | 1 | 1 instance per socket |
| Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B Token size 1024/128 | text-generation, Beam Search, Width=4 | bf16 | 109.1 tokens/s | | 110 | 6 | 1 instance per socket |
| MLPerf Inference v3.1 | ResNet50 v1.5 (offline) | Image Recognition | int8 | 20,565.5 samp/s | | | 256 | 1 core per instance |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | int8 | 10,215.7 img/s | 9.98 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | int8 | 13,862.96 img/s | 14.09 | | 116 | 1 instance per socket |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf16 | 6,210.69 img/s | 6.13 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf16 | 7,295.63 img/s | 7.33 | | 116 | 1 instance per socket |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | fp32 | 1,319.52 img/s | 1.27 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | fp32 | 1,360.05 img/s | 1.28 | | 116 | 1 instance per socket |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf32 | 1,659.37 img/s | 1.65 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf32 | 1,985.26 img/s | 2.02 | | 116 | 1 instance per socket |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | int8 | 7,440.61 img/s | 7.70 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | int8 | 12,345.54 img/s | 11.80 | | 116 | 1 instance per socket |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf16 | 5,053.76 img/s | 5.01 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf16 | 6,704.17 img/s | 6.34 | | 116 | 1 instance per socket |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | fp32 | 1,282.77 img/s | 1.17 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | fp32 | 1,342.91 img/s | 1.27 | | 116 | 1 instance per socket |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf32 | 1,529.49 img/s | 1.41 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf32 | 2,017.54 img/s | 1.89 | | 116 | 1 instance per socket |
| OpenVINO 2023.2 | ResNet50 v1.5 | Image Recognition | int8 | 8,819.657 img/s | 8.81 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | ResNet50 v1.5 | Image Recognition | bf16 | 5,915.793 img/s | 5.82 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | ResNet50 v1.5 | Image Recognition | fp32 | 1,281.337 img/s | 1.25 | | 1 | 4 cores per instance |
| MLPerf Inference v3.1 | BERT-Large (offline, 99.0% acc) | Natural Language Processing | int8 | 1,357.33 samp/s | | | 1,300 | 4 cores per instance |
| Intel PyTorch 2.1 | BERTLarge | Natural Language Processing | int8 | 335.1 sent/s | 0.35 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | BERTLarge | Natural Language Processing | int8 | 378.73 sent/s | 0.36 | | 56 | 1 instance per socket |
| Intel PyTorch 2.1 | BERTLarge | Natural Language Processing | bf16 | 204.52 sent/s | 0.21 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | BERTLarge | Natural Language Processing | bf16 | 201.44 sent/s | 0.21 | | 16 | 1 instance per socket |
| Intel PyTorch 2.1 | BERTLarge | Natural Language Processing | fp32 | 35.25 sent/s | 0.03 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | BERTLarge | Natural Language Processing | fp32 | 41.05 sent/s | 0.04 | | 56 | 1 instance per socket |
| Intel PyTorch 2.1 | BERTLarge | Natural Language Processing | bf32 | 72.42 sent/s | 0.07 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | BERTLarge | Natural Language Processing | bf32 | 71.63 sent/s | 0.07 | | 16 | 1 instance per socket |
| Intel TensorFlow 2.14 | BERTLarge | Natural Language Processing | int8 | 253.27 sent/s | 0.24 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | BERTLarge | Natural Language Processing | int8 | 239.89 sent/s | 0.25 | | 16 | 1 instance per socket |
| Intel TensorFlow 2.14 | BERTLarge | Natural Language Processing | bf16 | 181.02 sent/s | 0.18 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | BERTLarge | Natural Language Processing | bf16 | 184.06 sent/s | 0.17 | | 128 | 1 instance per socket |
| Intel TensorFlow 2.14 | BERTLarge | Natural Language Processing | fp32 | 44.73 sent/s | 0.04 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | BERTLarge | Natural Language Processing | fp32 | 38.58 sent/s | 0.04 | | 16 | 1 instance per socket |
| Intel TensorFlow 2.14 | BERTLarge | Natural Language Processing | bf32 | 72.78 sent/s | 0.07 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | BERTLarge | Natural Language Processing | bf32 | 71.77 sent/s | 0.07 | | 16 | 1 instance per socket |
| OpenVINO 2023.2 | BERTLarge | Natural Language Processing | int8 | 298.44 sent/s | 0.30 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | BERTLarge | Natural Language Processing | int8 | 285.68 sent/s | 0.28 | | 48 | 1 instance per socket |
| OpenVINO 2023.2 | BERTLarge | Natural Language Processing | bf16 | 202.48 sent/s | 0.20 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | BERTLarge | Natural Language Processing | bf16 | 191.2533 sent/s | 0.19 | | 32 | 1 instance per socket |
| OpenVINO 2023.2 | BERTLarge | Natural Language Processing | fp32 | 47.33667 sent/s | 0.05 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | BERTLarge | Natural Language Processing | fp32 | 44.23333 sent/s | 0.04 | | 48 | 1 instance per socket |
| MLPerf Inference v3.1 | DLRM-v2 (offline, 99.0% acc) | Recommender | int8 | 5,367.77 samp/s | | | 300 | 1 core per instance |
| Intel PyTorch 2.1 | DLRM Criteo Terabyte | Recommender | int8 | 23,444,587 rec/s | 23611.92 | | 128 | 1 instance per socket |
| Intel PyTorch 2.1 | DLRM Criteo Terabyte | Recommender | bf16 | 10,646,560 rec/s | 10238.88 | | 128 | 1 instance per socket |
| Intel PyTorch 2.1 | DLRM Criteo Terabyte | Recommender | fp32 | 2,278,228 rec/s | 2220.37 | | 128 | 1 instance per socket |
| Intel PyTorch 2.1 | DLRM Criteo Terabyte | Recommender | bf32 | 4,530,200 rec/s | 4427.38 | | 128 | 1 instance per socket |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | int8 | 4,726.15 sent/s | 4.94 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | int8 | 7,759.25 sent/s | 8.42 | | 168 | 1 instance per socket |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf16 | 3,306.46 sent/s | 3.35 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf16 | 5,057.47 sent/s | 5.50 | | 120 | 1 instance per socket |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | fp32 | 900.58 sent/s | 0.85 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | fp32 | 1,007.05 sent/s | 1.04 | | 56 | 1 instance per socket |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf32 | 1,513.66 sent/s | 1.49 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf32 | 1,926.1 sent/s | 1.77 | | 288 | 1 instance per socket |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | int8 | 61.03 sent/s | 0.06 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | int8 | 245.66 sent/s | 0.24 | | 448 | 1 instance per socket |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf16 | 41.44 sent/s | 0.04 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf16 | 278.81 sent/s | 0.28 | | 448 | 1 instance per socket |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | fp32 | 20.27 sent/s | 0.02 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | fp32 | 102.48 sent/s | 0.10 | | 448 | 1 instance per socket |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf32 | 20.28 sent/s | 0.02 | | 1 | 4 cores per instance |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf32 | 114.08 sent/s | 0.11 | | 448 | 1 instance per socket |
| OpenVINO 2023.2 | 3D-Unet | Image Segmentation | int8 | 24.68333 samp/s | 0.02 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | 3D-Unet | Image Segmentation | int8 | 21.85667 samp/s | 0.02 | | 6 | 1 instance per socket |
| OpenVINO 2023.2 | 3D-Unet | Image Segmentation | bf16 | 13.05333 samp/s | 0.01 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | 3D-Unet | Image Segmentation | bf16 | 11.87 samp/s | 0.01 | | 6 | 1 instance per socket |
| OpenVINO 2023.2 | 3D-Unet | Image Segmentation | fp32 | 2.883333 samp/s | 0.00 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | 3D-Unet | Image Segmentation | fp32 | 2.62 samp/s | 0.00 | | 6 | 1 instance per socket |
| OpenVINO 2023.2 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | int8 | 459.3633 img/s | 0.44 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | bf16 | 218.4133 img/s | 0.20 | | 1 | 4 cores per instance |
| OpenVINO 2023.2 | SSD-ResNet34 COCO 2017 (1200 x 1200) | Object Detection | fp32 | 31.17333 img/s | 0.03 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | int8 | 1289.95 fps | 1.35 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | int8 | 1923.77 fps | 1.83 | | 116 | 1 instance per socket |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | bf16 | 648.58 fps | 0.66 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | bf16 | 867.05 fps | 0.87 | | 64 | 1 instance per socket |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | fp32 | 151.29 fps | 0.14 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | fp32 | 160.93 fps | 0.15 | | 64 | 1 instance per socket |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | bf32 | 215.11 fps | 0.21 | | 1 | 4 cores per instance |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | bf32 | 241.98 fps | 0.22 | | 116 | 1 instance per socket |
| MLPerf Inference v3.1 | RetinaNet (offline) | Object Detection | int8 | 284.75 samp/s | | | 2 | 4 cores per instance |
| MLPerf Inference v3.1 | RNN-T (offline) | Speech-to-text | int8+bf16 | 5,782.18 samp/s | | | 256 | 4 cores per instance |
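
To make the Config column concrete, the sketch below shows one plausible way a batch-1, bf16 ResNet50 v1.5 measurement could be pinned to four cores, mirroring the "4 cores per instance" rows above. It is a minimal illustration only: it assumes a stock torchvision model with random weights, a synthetic input, and plain PyTorch CPU autocast, and is not the optimized model, launch script, or measurement methodology behind the published numbers.

```python
# Minimal latency/throughput sketch for one "4 cores per instance" slot.
# Example launch (hypothetical core IDs), pinning the process to 4 cores on NUMA node 0:
#   OMP_NUM_THREADS=4 numactl -C 0-3 -m 0 python resnet50_bf16_bench.py
import time

import torch
import torchvision

model = torchvision.models.resnet50(weights=None).eval()  # random weights; placeholder model
x = torch.randn(1, 3, 224, 224)                           # batch size 1, as in the 4-core rows

with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
    for _ in range(10):                                    # warm-up iterations
        model(x)

    iters = 100
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    elapsed = time.perf_counter() - start

print(f"throughput: {iters / elapsed:.2f} img/s, latency: {1000 * elapsed / iters:.2f} ms")
```

The "1 instance per socket" rows follow the opposite pattern: a single process uses all cores of a socket with a larger batch size (for example, 116 for ResNet50 v1.5), trading per-sample latency for higher aggregate throughput.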