| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | avx_fp32 | 22.28 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | avx_fp32 | 22.28 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | avx_fp32 | 195.41 tokens/s | | | 30 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | avx_fp32 | 172.08 tokens/s | | | 30 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | amx_int8 | 79.08 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_int8 | 73.17 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | amx_int8 | 525.55 tokens/s | | | 30 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_int8 | 352.25 tokens/s | | | 30 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | amx_bf16 | 43.24 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_bf16 | 41.56 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | amx_bf16 | 392.6 tokens/s | | | 30 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_bf16 | 280.2 tokens/s | | | 26 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | amx_bf32 | 22.88 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_bf32 | 22.88 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_bf32 | 765.42 tokens/s | | | 30 |
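
The batch-1 ChatGLM3-6B rows above make it easy to derive how much each AMX precision gains over the avx_fp32 baseline. A minimal sketch using the 1024/128, batch-1 throughputs from the table (the `speedups` helper is illustrative, not part of any benchmark harness):

```python
# Batch-1 ChatGLM3-6B 1024/128 throughputs (tokens/s), copied from the table above.
BASELINE_FP32 = 22.28  # avx_fp32

amx_rows = {
    "amx_int8": 79.08,
    "amx_bf16": 43.24,
    "amx_bf32": 22.88,
}

def speedups(rows, baseline):
    """Speedup of each precision over the baseline, rounded to 2 decimals."""
    return {precision: round(tps / baseline, 2) for precision, tps in rows.items()}

print(speedups(amx_rows, BASELINE_FP32))
# amx_int8 comes out to roughly 3.55x over avx_fp32
```
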
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | fp32 | 23.03 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | fp32 | 22.53 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | fp32 | 168.20 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | fp32 | 129.32 tokens/s | | | 32 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_int8 | 75.20 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | amx_int8 | 69.49 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_int8 | 467.34 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | amx_int8 | 260.83 tokens/s | | | 32 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_bf16 | 41.60 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | amx_bf16 | 39.77 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 2016/32 | Natural Language Processing | amx_bf16 | 389.80 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | ChatGLM3-6B Token size 1024/128 | Natural Language Processing | amx_bf16 | 222.83 tokens/s | | | 32 |
| Intel PyTorch 2.1 DeepSpeed | GPT-J 6B Token size 1024/128 | text-generation, Beam Search, Width=4 | int8 | | | 35 | 1 |
| Intel PyTorch 2.1 DeepSpeed | GPT-J 6B Token size 1024/128 | text-generation, Beam Search, Width=4 | int8 | 173 tokens/s | | 92.5 | 8 |
| Intel PyTorch 2.1 DeepSpeed | GPT-J 6B Token size 1024/128 | text-generation, Beam Search, Width=4 | bf16 | | | 52.5 | 1 |
| Intel PyTorch 2.1 DeepSpeed | GPT-J 6B Token size 1024/128 | text-generation, Beam Search, Width=4 | bf16 | | | 98.5 | 8 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 1024/128 | Natural Language Processing | avx_fp32 | 22.64 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 2016/32 | Natural Language Processing | avx_fp32 | 21.95 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 1024/128 | Natural Language Processing | avx_fp32 | 141.65 tokens/s | | | 21 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 2016/32 | Natural Language Processing | avx_fp32 | 106.92 tokens/s | | | 15 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_int8 | 78.05 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_int8 | 72.73 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_int8 | 430.84 tokens/s | | | 25 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_int8 | 304.49 tokens/s | | | 25 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_bf16 | 43.68 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_bf16 | 41.59 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_bf16 | 352.17 tokens/s | | | 27 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_bf16 | 262.67 tokens/s | | | 27 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_bf32 | 22.68 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_bf32 | 21.84 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_bf32 | 172.78 tokens/s | | | 21 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_bf32 | 123.58 tokens/s | | | 15 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 2016/32 | Natural Language Processing | avx_fp32 | 23.20 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 1024/128 | Natural Language Processing | avx_fp32 | 22.34 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 2016/32 | Natural Language Processing | avx_fp32 | 132.50 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 1024/128 | Natural Language Processing | avx_fp32 | 90.51 tokens/s | | | 32 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_int8 | 77.03 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_int8 | 71.07 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_int8 | 446.21 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_int8 | 230.70 tokens/s | | | 32 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_bf16 | 43.13 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_bf16 | 41.29 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 2016/32 | Natural Language Processing | amx_bf16 | 379.96 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | GPT-J-6B Token size 1024/128 | Natural Language Processing | amx_bf16 | 206.26 tokens/s | | | 32 |
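
The multi-batch rows report aggregate throughput across the whole batch, so dividing by the count in the last column gives the per-stream generation rate. A small sketch over the GPT-J-6B amx_bf16 rows above (treating the last column as the number of concurrent streams is an assumption):

```python
# (token size, aggregate tokens/s, batch) from the GPT-J-6B amx_bf16 rows above.
rows = [
    ("2016/32", 379.96, 16),
    ("1024/128", 206.26, 32),
]

def per_stream(aggregate_tps, batch):
    """Per-stream tokens/s, assuming the batch runs as concurrent streams."""
    return round(aggregate_tps / batch, 2)

for token_size, tps, batch in rows:
    print(token_size, per_stream(tps, batch), "tokens/s per stream")
```
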
| Intel PyTorch 2.2 MLPerf v4.0 | GPT-J (MLPerf v4.0, offline, 99.0% acc) | CNN-DailyMail News Text Summarization (input 13,368) | int4 | 3.61 samp/s | | | 8 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | avx_fp32 | 10.67 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | avx_fp32 | 10.25 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | avx_fp32 | 56.51 tokens/s | | | 10 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | avx_fp32 | 40.28 tokens/s | | | 6 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_int8 | 38.37 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_int8 | 35.69 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_int8 | 224.35 tokens/s | | | 19 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_int8 | 138.8 tokens/s | | | 18 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf16 | 20.49 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf16 | 19.56 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf16 | 187.43 tokens/s | | | 30 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf16 | 124.03 tokens/s | | | 22 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf32 | 10.68 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf32 | 10.32 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf32 | 63.42 tokens/s | | | 10 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf32 | 42.16 tokens/s | | | 6 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | avx_fp32 | 10.60 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | avx_fp32 | 10.27 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | avx_fp32 | 61.31 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | avx_fp32 | 46.52 tokens/s | | | 32 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_int8 | 35.62 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_int8 | 32.67 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_int8 | 219.04 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_int8 | 100.83 tokens/s | | | 32 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf16 | 20.07 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf16 | 19.17 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 2016/32 | Natural Language Processing | amx_bf16 | 181.39 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-13B Token size 1024/128 | Natural Language Processing | amx_bf16 | 87.99 tokens/s | | | 32 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | avx_fp32 | 20.07 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | avx_fp32 | 19.11 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | avx_fp32 | 126.26 tokens/s | | | 19 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | avx_fp32 | 92.14 tokens/s | | | 15 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_int8 | 70.59 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_int8 | 64.78 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_int8 | 327.4 tokens/s | | | 18 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_int8 | 248.89 tokens/s | | | 21 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf16 | 38.9 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf16 | 36.67 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf16 | 310.58 tokens/s | | | 29 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf16 | 220.41 tokens/s | | | 27 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf32 | 20.08 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf32 | 19.12 tokens/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf32 | 144.72 tokens/s | | | 19 |
| Intel PyTorch 2.6.0 + IPEX Inf LLMs | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf32 | 106.95 tokens/s | | | 15 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | avx_fp32 | 20.09 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | avx_fp32 | 19.47 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | avx_fp32 | 122.89 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | avx_fp32 | 85.47 tokens/s | | | 32 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_int8 | 63.28 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_int8 | 58.50 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_int8 | 320.79 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_int8 | 167.44 tokens/s | | | 64 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf16 | 37.70 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf16 | 36.06 tokens/s | | | 1 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 2016/32 | Natural Language Processing | amx_bf16 | 276.86 tokens/s | | | 16 |
| OpenVINO 2024.4.0 Inf LLM | LLaMA2-7B Token size 1024/128 | Natural Language Processing | amx_bf16 | 151.64 tokens/s | | | 32 |
| Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B Token size 1024/128 | text-generation, Beam Search, Width=4 | int8 | | | 41.5 | 1 |
| Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B Token size 1024/128 | text-generation, Beam Search, Width=4 | int8 | 149.5 tokens/s | | 107 | 8 |
| Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B Token size 1024/128 | text-generation, Beam Search, Width=4 | bf16 | | | 59.5 | 1 |
| Intel PyTorch 2.1 DeepSpeed | LLaMA2-7B Token size 1024/128 | text-generation, Beam Search, Width=4 | bf16 | 142.2 tokens/s | | 112.5 | 8 |
| OpenVINO 2023.2 | LLaMA2-7B Token size 32/512 | GenAI_chat | int4 | 11.3 tokens/s | | 88.44 | 1 |
| OpenVINO 2023.2 | LLaMA2-7B Token size 32/512 | GenAI_chat | int8 | 13.5 tokens/s | | 73.74 | 1 |
| OpenVINO 2023.2 | LLaMA2-7B Token size 32/512 | GenAI_chat | fp32 | 11.3 tokens/s | | 88.39 | 1 |
| OpenVINO 2023.2 | LLaMA2-7B Token size 80/512 | GenAI_chat | int4 | 11.4 tokens/s | | 87.17 | 1 |
| OpenVINO 2023.2 | LLaMA2-7B Token size 80/512 | GenAI_chat | int8 | 13.6 tokens/s | | 73.09 | 1 |
| OpenVINO 2023.2 | LLaMA2-7B Token size 80/512 | GenAI_chat | fp32 | 11.2 tokens/s | | 89.00 | 1 |
| OpenVINO 2023.2 | LLaMA2-7B Token size 142/512 | GenAI_chat | int4 | 11.5 tokens/s | | 86.63 | 1 |
| OpenVINO 2023.2 | LLaMA2-7B Token size 142/512 | GenAI_chat | int8 | 13.3 tokens/s | | 75.15 | 1 |
| OpenVINO 2023.2 | LLaMA2-7B Token size 142/512 | GenAI_chat | fp32 | 11.1 tokens/s | | 89.73 | 1 |
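
In these single-stream GenAI_chat rows the throughput column and the millisecond column are roughly reciprocal: tokens/s ≈ 1000 / (ms per token). A quick sanity check over the values above (a sketch; reading the ms column as per-token latency is an assumption):

```python
# (tokens/s, reported ms) pairs copied from the LLaMA2-7B GenAI_chat rows above.
pairs = [
    (11.3, 88.44), (13.5, 73.74), (11.3, 88.39),   # token size 32/512
    (11.4, 87.17), (13.6, 73.09), (11.2, 89.00),   # token size 80/512
    (11.5, 86.63), (13.3, 75.15), (11.1, 89.73),   # token size 142/512
]

# If the ms column is per-token latency, 1000/ms should land near tokens/s.
for tps, ms in pairs:
    implied_tps = 1000 / ms
    assert abs(implied_tps - tps) < 0.1, (tps, ms)

print("all rows consistent to within 0.1 tokens/s")
```
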
| OpenVINO 2023.2 | Stable Diffusion 2.1, 20 Steps, 64 Prompts | GenAI_text_image | int8 | 0.24 img/s | | 4,160 | 1 |
| MLPerf Inference v4.0 | Stable Diffusion XL (offline) | Image Generation | bf16 | 0.19 samp/s | | | 8 |
| OpenVINO 2024.4.0 | Stable-Diffusion | Image Generation | fp32 | 0.05 samp/s | | | 1 |
| OpenVINO 2024.4.0 | Stable-Diffusion | Image Generation | amx_int8 | 0.12 samp/s | | | 1 |
| OpenVINO 2024.4.0 | Stable-Diffusion | Image Generation | amx_bf16 | 0.13 samp/s | | | 1 |
| MLPerf Inference v4.0 | ResNet50 v1.5 (offline) | Image Recognition | int8 | 25,289.6 samp/s | | | 256 |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | int8 | 12,862.56 img/s | 13.23 | | 1 |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | int8 | 19,386.47 img/s | 19.21 | | 64 |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf16 | 8,211.8 img/s | 8.13 | | 1 |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf16 | 10,187.87 img/s | 10.82 | | 64 |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | fp32 | 1,773.68 img/s | 1.74 | | 1 |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | fp32 | 1,703.77 img/s | 1.57 | | 64 |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf32 | 2,431.26 img/s | 2.4 | | 1 |
| Intel PyTorch 2.1 | ResNet50 v1.5 | Image Recognition | bf32 | 2,686.97 img/s | 2.67 | | 64 |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | int8 | 9,726.18 img/s | 9.67 | | 1 |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | int8 | 16,036.8 img/s | 17.01 | | 32 |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf16 | 6,782.09 img/s | 7.04 | | 1 |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf16 | 9,312.72 img/s | 9.4 | | 32 |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | fp32 | 1,560.99 img/s | 1.45 | | 1 |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | fp32 | 1,663.44 img/s | 1.57 | | 32 |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf32 | 2,013.88 img/s | 1.84 | | 1 |
| Intel TensorFlow 2.14 | ResNet50 v1.5 | Image Recognition | bf32 | 2,874.29 img/s | 2.73 | | 32 |
| OpenVINO 2023.2 | ResNet50 v1.5 | Image Recognition | int8 | 18,674.37 img/s | 26.68 | | 1 |
| OpenVINO 2023.2 | ResNet50 v1.5 | Image Recognition | bf16 | 11,537.06 img/s | 16.48 | | 1 |
| OpenVINO 2023.2 | ResNet50 v1.5 | Image Recognition | fp32 | 1,721.58 img/s | 2.46 | | 1 |
| MLPerf Inference v4.0 | BERT-Large (offline, 99.0% acc) | Natural Language Processing | int8 | 1,668.5 samp/s | | | 1,300 |
| OpenVINO 2024.4.0 | BERT-Large | Natural Language Processing | fp32 | 55.82 sent/s | | | 1 |
| OpenVINO 2024.4.0 | BERT-Large | Natural Language Processing | fp32 | 52.53 sent/s | | | 32 |
| OpenVINO 2024.4.0 | BERT-Large | Natural Language Processing | amx_int8 | 415.88 sent/s | | | 1 |
| OpenVINO 2024.4.0 | BERT-Large | Natural Language Processing | amx_int8 | 395.41 sent/s | | | 64 |
| OpenVINO 2024.4.0 | BERT-Large | Natural Language Processing | amx_bf16 | 246.91 sent/s | | | 1 |
| OpenVINO 2024.4.0 | BERT-Large | Natural Language Processing | amx_bf16 | 253.7 sent/s | | | 32 |
| Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | int8 | 411.14 sent/s | 0.42 | | 1 |
| Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | int8 | 455.33 sent/s | 0.45 | | 16 |
| Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | bf16 | 243.89 sent/s | 0.24 | | 1 |
| Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | bf16 | 278.00 sent/s | 0.25 | | 44 |
| Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | fp32 | 44.56 sent/s | 0.04 | | 1 |
| Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | fp32 | 50.49 sent/s | 0.05 | | 16 |
| Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | bf32 | 98.49 sent/s | 0.09 | | 1 |
| Intel PyTorch 2.1 | BERT-Large | Natural Language Processing | bf32 | 96.98 sent/s | 0.09 | | 16 |
| Intel PyTorch 2.6.0 + IPEX | BERT-Large | Natural Language Processing | avx_fp32 | 52.79 sent/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | BERT-Large | Natural Language Processing | avx_fp32 | 51.67 sent/s | | | 12 |
| Intel PyTorch 2.6.0 + IPEX | BERT-Large | Natural Language Processing | amx_int8 | 431.06 sent/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | BERT-Large | Natural Language Processing | amx_int8 | 539.05 sent/s | | | 44 |
| Intel PyTorch 2.6.0 + IPEX | BERT-Large | Natural Language Processing | amx_bf16 | 240.04 sent/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | BERT-Large | Natural Language Processing | amx_bf16 | 280.08 sent/s | | | 36 |
| Intel PyTorch 2.6.0 + IPEX | BERT-Large | Natural Language Processing | amx_bf32 | 96.74 sent/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | BERT-Large | Natural Language Processing | amx_bf32 | 97.76 sent/s | | | 12 |
| Intel TensorFlow 2.19.0 | BERT-Large | Natural Language Processing | fp32 | 47.92 sent/s | | | 1 |
| Intel TensorFlow 2.19.0 | BERT-Large | Natural Language Processing | fp32 | 44.56 sent/s | | | 12 |
| Intel TensorFlow 2.19.0 | BERT-Large | Natural Language Processing | amx_int8 | 266.89 sent/s | | | 1 |
| Intel TensorFlow 2.19.0 | BERT-Large | Natural Language Processing | amx_int8 | 200.88 sent/s | | | 10 |
| Intel TensorFlow 2.19.0 | BERT-Large | Natural Language Processing | amx_bf16 | 200.38 sent/s | | | 1 |
| Intel TensorFlow 2.19.0 | BERT-Large | Natural Language Processing | amx_bf16 | 219.81 sent/s | | | 196 |
| Intel TensorFlow 2.19.0 | BERT-Large | Natural Language Processing | amx_bf32 | 93.19 sent/s | | | 1 |
| Intel TensorFlow 2.19.0 | BERT-Large | Natural Language Processing | amx_bf32 | 86.61 sent/s | | | 12 |
| Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | int8 | 323.58 sent/s | 0.32 | | 1 |
| Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | int8 | 324.56 sent/s | 0.33 | | 12 |
| Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | bf16 | 224.04 sent/s | 0.22 | | 1 |
| Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | bf16 | 231.37 sent/s | 0.23 | | 28 |
| Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | fp32 | 55.34 sent/s | 0.05 | | 1 |
| Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | fp32 | 48.46 sent/s | 0.05 | | 12 |
| Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | bf32 | 101.93 sent/s | 0.1 | | 1 |
| Intel TensorFlow 2.14 | BERT-Large | Natural Language Processing | bf32 | 98.81 sent/s | 0.1 | | 12 |
| OpenVINO 2023.2 | BERT-Large | Natural Language Processing | int8 | 373.69 sent/s | 0.37 | | 1 |
| OpenVINO 2023.2 | BERT-Large | Natural Language Processing | int8 | 388.25 sent/s | 0.39 | | 32 |
| OpenVINO 2023.2 | BERT-Large | Natural Language Processing | bf16 | 244.25 sent/s | 0.24 | | 1 |
| OpenVINO 2023.2 | BERT-Large | Natural Language Processing | bf16 | 281.79 sent/s | 0.27 | | 40 |
| OpenVINO 2023.2 | BERT-Large | Natural Language Processing | fp32 | 57.17 sent/s | 0.06 | | 1 |
| OpenVINO 2023.2 | BERT-Large | Natural Language Processing | fp32 | 55.67 sent/s | 0.05 | | 16 |
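
With several frameworks reporting batch-1 int8 BERT-Large numbers, a small comparison makes the ranking explicit. A sketch over the batch-1 int8/amx_int8 throughputs collected from the rows above (the dictionary is just the table data, re-keyed by framework):

```python
# Batch-1 int8 BERT-Large throughput (sent/s) per framework, from the table above.
int8_batch1 = {
    "Intel PyTorch 2.1": 411.14,
    "Intel PyTorch 2.6.0 + IPEX": 431.06,
    "Intel TensorFlow 2.14": 323.58,
    "Intel TensorFlow 2.19.0": 266.89,
    "OpenVINO 2023.2": 373.69,
    "OpenVINO 2024.4.0": 415.88,
}

# Rank frameworks from fastest to slowest at batch 1.
ranking = sorted(int8_batch1.items(), key=lambda kv: kv[1], reverse=True)
for name, tps in ranking:
    print(f"{name:28s} {tps:8.2f} sent/s")
```
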
| Intel PyTorch 2.1 | DLRM Criteo Terabyte | Recommender | int8 | 23,444,587 rec/s | 23611.92 | | 128 |
| Intel PyTorch 2.1 | DLRM Criteo Terabyte | Recommender | bf16 | 13,223,343 rec/s | 12742.32 | | 128 |
| Intel PyTorch 2.1 | DLRM Criteo Terabyte | Recommender | fp32 | 2,742,037 rec/s | 2615.42 | | 128 |
| Intel PyTorch 2.1 | DLRM Criteo Terabyte | Recommender | bf32 | 6,760,005 rec/s | 6699.18 | | 128 |
| Intel PyTorch 2.6.0 + IPEX | DLRM-v2 | Recommender | avx_fp32 | 386,589 rec/s | | | 128 |
| Intel PyTorch 2.6.0 + IPEX | DLRM-v2 | Recommender | amx_int8 | 3,995,112 rec/s | | | 128 |
| Intel PyTorch 2.6.0 + IPEX | DLRM-v2 | Recommender | amx_bf16 | 2,566,728 rec/s | | | 128 |
| Intel PyTorch 2.6.0 + IPEX | DLRM-v2 | Recommender | amx_bf32 | 718,293 rec/s | | | 128 |
| MLPerf Inference v4.0 | DLRM-v2 (offline, 99.9% acc) | Recommender | int8 | 9,111.08 samp/s | | | 400 |
| Intel PyTorch 2.6.0 + IPEX | Stable-Diffusion | Image Generation | avx_fp32 | 0.05 img/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Stable-Diffusion | Image Generation | amx_int8 | 0.22 img/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Stable-Diffusion | Image Generation | amx_bf16 | 0.19 img/s | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Stable-Diffusion | Image Generation | amx_bf32 | 0.06 img/s | | | 1 |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | int8 | 6,380.26 sent/s | 6.8 | | 1 |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | int8 | 10,701.44 sent/s | 11.39 | | 104 |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf16 | 4,651.69 sent/s | 4.97 | | 1 |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf16 | 6,864.75 sent/s | 7.23 | | 88 |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | fp32 | 1,121.45 sent/s | 1.12 | | 1 |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | fp32 | 1,205.86 sent/s | 1.27 | | 32 |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf32 | 2,161.93 sent/s | 2.15 | | 1 |
| Intel PyTorch 2.1 | DistilBERT | Natural Language Processing | bf32 | 2,584.98 sent/s | 2.63 | | 56 |
| Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | avx_fp32 | 369.22 fps | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | avx_fp32 | 367.75 fps | | | 64 |
| Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_int8 | 2556.2 fps | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_int8 | 3632.56 fps | | | 118 |
| Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_bf16 | 1456.58 fps | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_bf16 | 1936.35 fps | | | 142 |
| Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_bf32 | 699.98 fps | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Vision-Transformer | Image Recognition | amx_bf32 | 767.5 fps | | | 256 |
| Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | fp32 | 336.59 fps | | | 1 |
| Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | fp32 | 356.97 fps | | | 38 |
| Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_int8 | 1481.00 fps | | | 1 |
| Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_int8 | 2066.89 fps | | | 64 |
| Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_bf16 | 1234.05 fps | | | 1 |
| Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_bf16 | 1687.83 fps | | | 96 |
| Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_bf32 | 790.19 fps | | | 1 |
| Intel TensorFlow 2.19.0 | Vision-Transformer | Image Recognition | amx_bf32 | 1031.97 fps | | | 38 |
| OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | fp32 | 368.75 fps | | | 1 |
| OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | fp32 | 372.06 fps | | | 32 |
| OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_int8 | 2193.38 fps | | | 1 |
| OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_int8 | 2218.53 fps | | | 64 |
| OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_bf16 | 1335.27 fps | | | 1 |
| OpenVINO 2024.4.0 | Vision-Transformer | Image Recognition | amx_bf16 | 1241.14 fps | | | 32 |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | int8 | 77.94 sent/s | 0.07 | | 1 |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | int8 | 334.65 sent/s | 0.31 | | 448 |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf16 | 52 sent/s | 0.05 | | 1 |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf16 | 367.07 sent/s | 0.35 | | 448 |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | fp32 | 1,099.6 sent/s | 26.53 | | 1 |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | fp32 | 137.37 sent/s | 0.12 | | 448 |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf32 | 24.86 sent/s | 0.02 | | 1 |
| Intel TensorFlow 2.14 | Transformer MLPerf | Language Translation | bf32 | 155.04 sent/s | 0.14 | | 448 |
| OpenVINO 2023.2 | 3D-UNet | Image Segmentation | int8 | 30.31 samp/s | 0.03 | | 1 |
| OpenVINO 2023.2 | 3D-UNet | Image Segmentation | int8 | 27.18 samp/s | 0.02 | | 6 |
| OpenVINO 2023.2 | 3D-UNet | Image Segmentation | bf16 | 15.68 samp/s | 0.01 | | 1 |
| OpenVINO 2023.2 | 3D-UNet | Image Segmentation | bf16 | 3.18 samp/s | 0 | | 7 |
| OpenVINO 2023.2 | 3D-UNet | Image Segmentation | fp32 | 3.49 samp/s | 0 | | 1 |
| OpenVINO 2023.2 | 3D-UNet | Image Segmentation | fp32 | 14.40 samp/s | 0.01 | | 3 |
| OpenVINO 2023.2 | SSD-ResNet34 COCO 2017 (1200x1200) | Object Detection | int8 | 590.23 img/s | 0.57 | | 1 |
| OpenVINO 2023.2 | SSD-ResNet34 COCO 2017 (1200x1200) | Object Detection | bf16 | 297.79 img/s | 0.28 | | 1 |
| OpenVINO 2023.2 | SSD-ResNet34 COCO 2017 (1200x1200) | Object Detection | fp32 | 36.92 img/s | 0.04 | | 1 |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | int8 | 1,679.87 fps | 1.73 | | 1 |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | int8 | 2,481.66 fps | 2.56 | | 58 |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | bf16 | 802.44 fps | 0.8 | | 1 |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | bf16 | 1,175.18 fps | 1.1 | | 72 |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | fp32 | 186.33 fps | 0.19 | | 1 |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | fp32 | 202.33 fps | 0.19 | | 40 |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | bf32 | 279.07 fps | 0.28 | | 1 |
| Intel PyTorch 2.1 | ResNeXt101 32x16d ImageNet | Image Classification | bf32 | 320.62 fps | 0.29 | | 58 |
| OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | fp32 | 1773.31 fps | | | 1 |
| OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | fp32 | 1639.04 fps | | | 16 |
| OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_int8 | 11951.1 fps | | | 1 |
| OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_int8 | 17018.39 fps | | | 64 |
| OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_bf16 | 8324.08 fps | | | 1 |
| OpenVINO 2024.4.0 | ResNet50-v1-5 | Image Classification | amx_bf16 | 10659.37 fps | | | 32 |
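
The ResNet50 rows are a convenient place to quantify what AMX buys at batch 1: amx_int8 and amx_bf16 versus the plain fp32 baseline. A sketch using the OpenVINO 2024.4.0 batch-1 numbers above:

```python
# OpenVINO 2024.4.0 ResNet50-v1-5 batch-1 throughput (fps), from the table above.
fp32 = 1773.31
amx = {"amx_int8": 11951.1, "amx_bf16": 8324.08}

# Speedup of each AMX precision over plain fp32 at batch 1.
for precision, fps in amx.items():
    print(precision, round(fps / fp32, 2), "x over fp32")
```
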
| Intel PyTorch 2.6.0 + IPEX | LCM | Reasoning and Understanding | avx_fp32 | 0.79 | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | LCM | Reasoning and Understanding | amx_int8 | 3.28 | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | LCM | Reasoning and Understanding | amx_bf16 | 2.73 | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | LCM | Reasoning and Understanding | amx_bf32 | 0.92 | | | 1 |
| OpenVINO 2024.4.0 | LCM | Reasoning and Understanding | fp32 | 0.69 | | | 1 |
| OpenVINO 2024.4.0 | LCM | Reasoning and Understanding | amx_int8 | 1.84 | | | 1 |
| OpenVINO 2024.4.0 | LCM | Reasoning and Understanding | amx_bf16 | 1.85 | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | avx_fp32 | 124.82 fps | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | avx_fp32 | 122.74 fps | | | 4 |
| Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_int8 | 648.05 fps | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_int8 | 662.29 fps | | | 16 |
| Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_bf16 | 553.29 fps | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_bf16 | 518.59 fps | | | 30 |
| Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_bf32 | 194.71 fps | | | 1 |
| Intel PyTorch 2.6.0 + IPEX | Yolo-v7 | Object Detection | amx_bf32 | 184.2 fps | | | 32 |
| OpenVINO 2023.2 | Yolo-v8n | Object Detection | int8 | 3,513.54 img/s | | | 1 |
| OpenVINO 2023.2 | Yolo-v8n | Object Detection | bf16 | 3,632.55 img/s | | | 1 |
| OpenVINO 2023.2 | Yolo-v8n | Object Detection | fp32 | 1,249.91 img/s | | | 1 |
| Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | fp32 | 781.50 img/s | | | 1 |
| Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | fp32 | 551.60 img/s | | | 36 |
| Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_bf16 | 1548.03 img/s | | | 1 |
| Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_bf16 | 1194.27 img/s | | | 36 |
| Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_bf32 | 880.03 img/s | | | 1 |
| Intel TensorFlow 2.19.0 | Yolo-v5 | Object Detection | amx_bf32 | 625.90 img/s | | | 36 |
| OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | fp32 | 728.05 img/s | | | 1 |
| OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | fp32 | 629.49 img/s | | | 8 |
| OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_int8 | 3229.96 img/s | | | 1 |
| OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_int8 | 2815.91 img/s | | | 8 |
| OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_bf16 | 2216.34 img/s | | | 1 |
| OpenVINO 2024.4.0 | Yolo-v5s | Object Detection | amx_bf16 | 2075.86 img/s | | | 16 |
| MLPerf Inference v4.0 | RetinaNet (offline) | Object Detection | int8 | 371.08 samp/s | | | 2 |
| MLPerf Inference v4.0 | RNN-T (offline) | Speech-to-text | int8+bf16 | 8,679.48 samp/s | | | 256 |
| Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | fp32 | 3187.69 | | | 1 |
| Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | fp32 | 6823.85 | | | 649 |
| Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_bf16 | 5523.51 | | | 1 |
| Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_bf16 | 19143.27 | | | 649 |
| Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_bf32 | 3297.23 | | | 1 |
| Intel TensorFlow 2.19.0 | R-GAT | Multi-Relational Graphs | amx_bf32 | 9764.66 | | | 649 |