Performance Data for Intel® AI Data Center Products
Find the latest AI benchmark performance data for Intel Data Center products, including detailed hardware and software configurations.
Pretrained models, sample scripts, best practices, and tutorials:
- Intel® Developer Cloud
- Intel® AI Reference Models and Jupyter Notebooks*
- AI-Optimized CPU Containers from Intel
- AI-Optimized GPU Containers from Intel
- Jupyter Notebook tutorials for OpenVINO™
- AI Performance Debugging on Intel® CPUs
Measurements were taken using:
- PyTorch* Optimizations from Intel
- TensorFlow* Optimizations from Intel
- Intel® Distribution of OpenVINO™ Toolkit
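In the tables below, a model entry such as "Token Size 1024/128" denotes a 1024-token input prompt and 128 generated output tokens. Throughput figures in tokens/s of this kind are conventionally derived as generated tokens divided by wall-clock decode time; a minimal sketch of that arithmetic (a hypothetical helper, not Intel's measurement harness):

```python
# Hypothetical helper illustrating how a tokens/s figure is derived.
# This is not Intel's benchmarking harness, just the underlying arithmetic.
def decode_throughput(output_tokens: int, decode_seconds: float) -> float:
    """Generated tokens per second of wall-clock decode time."""
    if decode_seconds <= 0:
        raise ValueError("decode time must be positive")
    return output_tokens / decode_seconds

# Example: 128 generated tokens in 2.5 s of decode time.
print(round(decode_throughput(128, 2.5), 2))  # 51.2
```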
Intel® Xeon® 6 Processors
Intel® Xeon® 6980P Processor (128 Cores)
Inference
Framework Version | Model | Usage | Precision | Throughput | Batch size | Perf/Watt | Latency (ms) |
---|---|---|---|---|---|---|---|
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | avx_fp32 | 51.33 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | avx_fp32 | 49.95 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | avx_fp32 | 405.71 tokens/s | 30 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | avx_fp32 | 351.60 tokens/s | 30 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_int8 | 163.14 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_int8 | 150.00 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_int8 | 981.52 tokens/s | 30 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_int8 | 686.74 tokens/s | 30 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_bf16 | 99.17 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_bf16 | 93.67 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_bf16 | 787.75 tokens/s | 30 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_bf16 | 587.60 tokens/s | 30 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_fp16 | 101.69 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_fp16 | 97.47 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_fp16 | 964.57 tokens/s | 30 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_fp16 | 765.42 tokens/s | 30 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_bf32 | 51.38 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_bf32 | 50.02 tokens/s | 1 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_bf32 | 576.72 tokens/s | 30 | ||
Intel PyTorch 2.6.0+IPEX Inf LLMs | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_bf32 | 466.59 tokens/s | 30 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | fp32 | 52.37 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | fp32 | 51.04 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | fp32 | 351.10 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | fp32 | 272.81 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_int8 | 162.54 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_int8 | 150.34 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_int8 | 962.09 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_int8 | 542.06 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_bf16 | 93.69 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_bf16 | 88.85 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_bf16 | 830.94 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_bf16 | 480.82 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_fp16 | 100.33 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_fp16 | 95.44 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 2016/32 | Natural Language Processing | amx_fp16 | 771.99 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token Size 1024/128 | Natural Language Processing | amx_fp16 | 551.93 tokens/s | 64 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | avx_fp32 | 52.07 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | avx_fp32 | 50.30 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | avx_fp32 | 282.47 tokens/s | 17 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | avx_fp32 | 237.34 tokens/s | 17 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_int8 | 158.13 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_int8 | 146.94 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_int8 | 765.28 tokens/s | 15 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_int8 | 590.43 tokens/s | 25 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_bf16 | 99.54 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_bf16 | 93.75 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_bf16 | 673.26 tokens/s | 29 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_bf16 | 512.50 tokens/s | 31 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_fp16 | 101.03 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_fp16 | 97.47 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_fp16 | 687.17 tokens/s | 24 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_fp16 | 559.85 tokens/s | 22 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_bf32 | 52.20 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_bf32 | 50.40 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_bf32 | 360.43 tokens/s | 25 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_bf32 | 260.74 tokens/s | 17 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 2016/32 | Natural Language Processing | avx_fp32 | 53.78 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 1024/128 | Natural Language Processing | avx_fp32 | 51.81 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 2016/32 | Natural Language Processing | avx_fp32 | 281.41 tokens/s | 4 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 1024/128 | Natural Language Processing | avx_fp32 | 136.37 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_int8 | 168.86 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_int8 | 152.81 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_int8 | 746.81 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_int8 | 480.53 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_bf16 | 96.83 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_bf16 | 92.52 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_bf16 | 657.46 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_bf16 | 441.71 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_fp16 | 99.86 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_fp16 | 94.62 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 2016/32 | Natural Language Processing | amx_fp16 | 558.09 tokens/s | 8 | ||
OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token Size 1024/128 | Natural Language Processing | amx_fp16 | 347.37 tokens/s | 16 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | avx_fp32 | 24.18 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | avx_fp32 | 23.37 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | avx_fp32 | 122.01 tokens/s | 10 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | avx_fp32 | 86.45 tokens/s | 6 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_int8 | 82.13 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_int8 | 75.69 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_int8 | 436.95 tokens/s | 15 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_int8 | 283.91 tokens/s | 15 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf16 | 47.18 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf16 | 44.70 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf16 | 367.51 tokens/s | 30 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf16 | 245.02 tokens/s | 18 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_fp16 | 48.45 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_fp16 | 46.06 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_fp16 | 380.90 tokens/s | 24 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_fp16 | 288.17 tokens/s | 18 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf32 | 24.20 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf32 | 23.39 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf32 | 134.45 tokens/s | 10 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf32 | 89.23 tokens/s | 6 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | avx_fp32 | 23.40 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | avx_fp32 | 22.78 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | avx_fp32 | 132.09 tokens/s | 8 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | avx_fp32 | 103.81 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_int8 | 80.59 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_int8 | 72.12 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_int8 | 477.90 tokens/s | 8 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_int8 | 266.37 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf16 | 47.34 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf16 | 44.76 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf16 | 387.62 tokens/s | 8 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf16 | 231.78 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf32 | 47.60 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf32 | 44.36 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf32 | 395.06 tokens/s | 8 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf32 | 226.38 tokens/s | 16 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | avx_fp32 | 45.23 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | avx_fp32 | 43.55 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | avx_fp32 | 268.14 tokens/s | 23 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | avx_fp32 | 201.32 tokens/s | 17 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_int8 | 143.24 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_int8 | 132.19 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_int8 | 660.96 tokens/s | 15 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_int8 | 430.51 tokens/s | 15 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf16 | 86.96 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf16 | 82.07 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf16 | 604.93 tokens/s | 25 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf16 | 432.21 tokens/s | 25 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_fp16 | 88.89 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_fp16 | 84.11 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_fp16 | 638.30 tokens/s | 25 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_fp16 | 470.77 tokens/s | 22 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf32 | 45.26 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf32 | 43.51 tokens/s | 1 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf32 | 307.81 tokens/s | 23 | ||
Intel PyTorch 2.6.0+ IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf32 | 224.92 tokens/s | 17 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | avx_fp32 | 45.94 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | avx_fp32 | 44.23 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | avx_fp32 | 262.65 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | avx_fp32 | 190.16 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_int8 | 138.41 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_int8 | 126.22 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_int8 | 703.42 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_int8 | 481.97 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf16 | 84.21 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf16 | 79.43 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf16 | 618.63 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf16 | 424.50 tokens/s | 32 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf32 | 85.19 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf32 | 80.02 tokens/s | 1 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf32 | 613.47 tokens/s | 16 | ||
OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf32 | 439.74 tokens/s | 32 | ||
OpenVINO 2024.4.0 | Stable-Diffusion | Image Generation | fp32 | 0.09 samp/s | 1 | ||
OpenVINO 2024.4.0 | Stable-Diffusion | Image Generation | amx_int8 | 0.25 samp/s | 1 | ||
OpenVINO 2024.4.0 | Stable-Diffusion | Image Generation | amx_bf16 | 0.25 samp/s | 1 | ||
OpenVINO 2024.4.0 | Stable-Diffusion | Image Generation | amx_fp16 | 0.25 samp/s | 1 | ||
OpenVINO 2024.4.0 | BERTLarge | Natural Language Processing | fp32 | 121.65 sent/s | 1 | ||
OpenVINO 2024.4.0 | BERTLarge | Natural Language Processing | fp32 | 113.96 sent/s | 16 | ||
OpenVINO 2024.4.0 | BERTLarge | Natural Language Processing | amx_int8 | 733.89 sent/s | 1 | ||
OpenVINO 2024.4.0 | BERTLarge | Natural Language Processing | amx_int8 | 761.71 sent/s | 32 | ||
OpenVINO 2024.4.0 | BERTLarge | Natural Language Processing | amx_bf16 | 456.85 sent/s | 1 | ||
OpenVINO 2024.4.0 | BERTLarge | Natural Language Processing | amx_bf16 | 462.06 sent/s | 16 | ||
OpenVINO 2024.4.0 | BERTLarge | Natural Language Processing | amx_fp16 | 457.07 sent/s | 1 | ||
OpenVINO 2024.4.0 | BERTLarge | Natural Language Processing | amx_fp16 | 407.68 sent/s | 16 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | avx_fp32 | 113.44 sent/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | avx_fp32 | 109.88 sent/s | 40 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | amx_int8 | 829.34 sent/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | amx_int8 | 1030.82 sent/s | 64 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | amx_bf16 | 473.94 sent/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | amx_bf16 | 554.99 sent/s | 32 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | amx_fp16 | 441.32 sent/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | amx_fp16 | 460.63 sent/s | 88 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | amx_bf32 | 212.01 sent/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | BERT Large | Natural Language Processing | amx_bf32 | 212.02 sent/s | 88 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | fp32 | 108.62 sent/s | 1 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | fp32 | 99.07 sent/s | 32 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | amx_int8 | 484.98 sent/s | 1 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | amx_int8 | 569.70 sent/s | 16 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | amx_bf16 | 403.19 sent/s | 1 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | amx_bf16 | 438.23 sent/s | 32 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | amx_fp16 | 405.00 sent/s | 1 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | amx_fp16 | 432.31 sent/s | 32 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | amx_bf32 | 202.29 sent/s | 1 | ||
Intel TensorFlow 2.19.0 | BERT Large | Natural Language Processing | amx_bf32 | 190.89 sent/s | 32 | ||
Intel PyTorch 2.6.0 + IPEX | DLRM-v2 | Recommender | avx_fp32 | 844,726.49 rec/s | 128 | ||
Intel PyTorch 2.6.0 + IPEX | DLRM-v2 | Recommender | amx_int8 | 6,676,543.49 rec/s | 128 | ||
Intel PyTorch 2.6.0 + IPEX | DLRM-v2 | Recommender | amx_bf16 | 4,481,704.53 rec/s | 128 | ||
Intel PyTorch 2.6.0 + IPEX | DLRM-v2 | Recommender | amx_fp16 | 4,321,739.37 rec/s | 128 | ||
Intel PyTorch 2.6.0 + IPEX | DLRM-v2 | Recommender | amx_bf32 | 1,588,266.49 rec/s | 128 | ||
Intel PyTorch 2.6.0 + IPEX | Stable-Diffusion | Image Generation | avx_fp32 | 0.12 img/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Stable-Diffusion | Image Generation | amx_int8 | 0.41 img/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Stable-Diffusion | Image Generation | amx_bf16 | 0.35 img/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Stable-Diffusion | Image Generation | amx_fp16 | 0.37 img/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Stable-Diffusion | Image Generation | amx_bf32 | 0.15 img/s | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | avx_fp32 | 779.13 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | avx_fp32 | 807.66 fps | 160 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_int8 | 4490.99 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_int8 | 6277.16 fps | 94 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_bf16 | 2624.42 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_bf16 | 3570.05 fps | 96 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_fp16 | 2558.35 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_fp16 | 3442.89 fps | 256 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_bf32 | 1352.02 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_bf32 | 1572.22 fps | 256 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | fp32 | 744.63 fps | 1 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | fp32 | 771.33 fps | 252 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_int8 | 2876.37 fps | 1 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_int8 | 4085.75 fps | 252 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_bf16 | 2332.85 fps | 1 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_bf16 | 3143.31 fps | 159 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_fp16 | 2379.54 fps | 1 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_fp16 | 3058.30 fps | 159 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_bf32 | 1641.15 fps | 1 | ||
Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_bf32 | 1891.57 fps | 239 | ||
OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | fp32 | 812.05 fps | 1 | ||
OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | fp32 | 847.73 fps | 32 | ||
OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_int8 | 3997.38 fps | 1 | ||
OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_int8 | 4198.79 fps | 32 | ||
OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_bf16 | 2406.63 fps | 1 | ||
OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_bf16 | 2609.63 fps | 64 | ||
OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_fp16 | 2358.47 fps | 1 | ||
OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_fp16 | 2537.90 fps | 64 | ||
OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | fp32 | 3776.73 fps | 1 | ||
OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | fp32 | 3800.48 fps | 64 | ||
OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_int8 | 21,118.56 fps | 1 | ||
OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_int8 | 29,484 fps | 64 | ||
OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_bf16 | 14,487.85 fps | 1 | ||
OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_bf16 | 17,805.47 fps | 32 | ||
OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_fp16 | 14,475.74 fps | 1 | ||
OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_fp16 | 17,687.28 fps | 32 | ||
Intel PyTorch 2.6.0 + IPEX | LCM | Reasoning and Understanding | avx_fp32 | 1.78 | 1 | ||
Intel PyTorch 2.6.0 + IPEX | LCM | Reasoning and Understanding | amx_int8 | 6.43 | 1 | ||
Intel PyTorch 2.6.0 + IPEX | LCM | Reasoning and Understanding | amx_bf16 | 4.96 | 1 | ||
Intel PyTorch 2.6.0 + IPEX | LCM | Reasoning and Understanding | amx_fp16 | 5.1 | 1 | ||
Intel PyTorch 2.6.0 + IPEX | LCM | Reasoning and Understanding | amx_bf32 | 2.07 | 1 | ||
OpenVINO 2024.4.0 | LCM | Reasoning and Understanding | fp32 | 1.4 | 1 | ||
OpenVINO 2024.4.0 | LCM | Reasoning and Understanding | amx_int8 | 3.6 | 1 | ||
OpenVINO 2024.4.0 | LCM | Reasoning and Understanding | amx_bf16 | 3.7 | 1 | ||
OpenVINO 2024.4.0 | LCM | Reasoning and Understanding | amx_fp16 | 3.58 | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | avx_fp32 | 282.23 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | avx_fp32 | 283.59 fps | 21 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_int8 | 1403.46 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_int8 | 1038.72 fps | 10 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_bf16 | 1058.43 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_bf16 | 1011.21 fps | 21 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_fp16 | 994.78 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_fp16 | 961.03 fps | 21 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_bf32 | 400.53 fps | 1 | ||
Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_bf32 | 376.71 fps | 21 | ||
Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | fp32 | 1415.87 img/s | 1 | ||
Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | fp32 | 1509.53 img/s | 94 | ||
Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_bf16 | 2726.87 img/s | 1 | ||
Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_bf16 | 3986.90 img/s | 94 | ||
Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_fp16 | 2882.10 img/s | 1 | ||
Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_fp16 | 4199.40 img/s | 84 | ||
Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_bf32 | 1587.33 img/s | 1 | ||
Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_bf32 | 1879.03 img/s | 94 | ||
OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | fp32 | 1570.38 img/s | 1 | ||
OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | fp32 | 1388.07 img/s | 16 | ||
OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_int8 | 6151.55 img/s | 1 | ||
OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_int8 | 5170.74 img/s | 16 | ||
OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_bf16 | 4738.79 img/s | 1 | ||
OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_bf16 | 3825.40 img/s | 16 | ||
OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_fp16 | 4585.38 img/s | 1 | ||
OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_fp16 | 3505.09 img/s | 16 | ||
Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | fp32 | 15,749.10 | 1 | ||
Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | fp32 | 15,927.51 | 2625 | ||
Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_bf16 | 31,608.73 | 1 | ||
Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_bf16 | 40,945.58 | 2625 | ||
Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_fp16 | 29,505.19 | 1 | ||
Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_fp16 | 32,624.98 | 2625 | ||
Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_bf32 | 16,486.88 | 1 | ||
Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_bf32 | 22,650.06 | 2625 | ||
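Speedups across precisions can be read straight off the tables above. For example, for ChatGLM3-6B at batch size 1 with Token Size 1024/128, Intel PyTorch 2.6.0+IPEX moves from 51.33 tokens/s (avx_fp32) to 163.14 tokens/s (amx_int8) and 99.17 tokens/s (amx_bf16). A quick check of those ratios, with values copied from the table:

```python
# Throughputs from the ChatGLM3-6B batch-1, Token Size 1024/128 rows above.
fp32_tps = 51.33    # avx_fp32
int8_tps = 163.14   # amx_int8
bf16_tps = 99.17    # amx_bf16

print(round(int8_tps / fp32_tps, 2))  # ~3.18x, fp32 -> int8
print(round(bf16_tps / fp32_tps, 2))  # ~1.93x, fp32 -> bf16
```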
Hardware and software configuration (measured March 13, 2025):
1-node, 2x Intel® Xeon® 6980P processors, 128 cores, hyperthreading on, turbo on, 6 non-uniform memory access (NUMA) nodes.
Integrated accelerators available (used): DLB 8 [0], DSA 8 [0], IAA 8 [0], QAT 8 [0].
Total memory: 1536 GB (24 x 64 GB DDR5 8800 MT/s [8800 MT/s]), BIOS BHSDCRB1.IPC.0033.D57.2406240014, microcode 0x81000290, 1x Ethernet controller I225-LM, 1x 3.5T Intel SSDPF2KX038TZ, 1x 894.3G Micron_7450_MTFDKBG960TFR, CentOS* Stream 9 (kernel 6.6.43). Software: TensorFlow* 2.19.0 with Intel® oneAPI Deep Neural Network Library (oneDNN) e34cb13; PyTorch* 2.6.0.dev20241124+cpu with Intel® Extension for PyTorch* 2.6.0+gitc5a2330 and oneDNN v3.6.2; OpenVINO™ toolkit 2024.4.0 with oneDNN 3.5.0. Test by Intel as of March 13, 2025, 10:45:43 a.m. UTC.