| Model | # HPU | Precision | Throughput | Accuracy | Time To Train | Batch | Framework Version |
|---|---|---|---|---|---|---|---|
| Llama 2 13B | 16 | bf16 | 10.16 samples/sec | | | 256 | DeepSpeed 0.14.0 |
| Llama 2 70B | 64 | bf16 | 9.13 samples/sec | | | 1024 | DeepSpeed 0.14.0 |
| Llama 2 70B | 64 | FP8 | 13.17 samples/sec | | | 1024 | DeepSpeed 0.14.0 |
| MIXTRAL-8x7B-32K | 32 | bf16 | 0.7 samples/sec | 88.46 | 345 min | 128 | DeepSpeed 0.14.0 |
| Stable Diffusion | 64 | bf16 | 11122 img/sec | | | 32 | Lightning 2.3.3 |
| Stable Diffusion Fine Tuning** | 1 | bf16 | 73 img/sec | | | 7 | Lightning 2.3.3 |
| Stable Diffusion Fine Tuning Textual Inversion** | 1 | bf16 | 19.7 img/sec | | | 7 | Lightning 2.3.3 |
| ResNet50 LARS | 32 | bf16 | 18399 img/sec | 76.38 | 7.26 min | 256 | |
| ResNet50 LARS | 8 | bf16 | 48166.02 img/sec | 76.04 | 17.81 min | 256 | |
| ResNet50 LARS | 1 | bf16 | 6201.14 img/sec | | | 256 | |
| BERT Pre Training Phase 1 (torch.compile) | 32 | bf16 | 33179.52 sent/sec | | 238 min | 64 | |
| BERT Pre Training Phase 1 (torch.compile) | 8 | bf16 | 8593.03 sent/sec | 0 | | 64 | |
| BERT Pre Training Phase 1 (torch.compile) | 1 | bf16 | 1074.45 sent/sec | | | 64 | |
| BERT Pre Training Phase 2 (torch.compile) | 32 | bf16 | 9861.81 sent/sec | 0 | 87 min | 16 | |
| BERT Pre Training Phase 2 (torch.compile) | 8 | bf16 | 2568.65 sent/sec | 0 | | 16 | |
| BERT Pre Training Phase 2 (torch.compile) | 1 | bf16 | 320.41 sent/sec | | | 16 | |
| BERT SQUAD Fine Tuning | 8 | bf16 | 2013 sent/sec | 90.52 | 4.68 min | 24 | |
| ResNext101 | 8 | bf16 | 21851 img/sec | 77.81 | 102 min | 256 | |
| Transformer | 8 | bf16 | 1121879 token/sec | 27.9 | 236 min | 8,192 | |
| Unet2D (torch.compile) | 8 | bf16 | 19888 img/sec | 72.5 | 10.21 min | 64 | Lightning 2.3.3 |
| Unet3D PTL | 8 | bf16 | 252 img/sec | 74.17 | 17.96 min | 2 | Lightning 2.3.3 |
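
One way to read the multi-HPU rows is to compare measured throughput against ideal linear scaling from the single-HPU figure. The sketch below does this for the three BERT Pre Training Phase 1 rows (1, 8, and 32 HPUs). The helper name `scaling_efficiency` and the efficiency definition (measured throughput divided by HPU count times single-HPU throughput) are illustrative assumptions, not something the table itself defines.

```python
# Throughput values (sent/sec) copied from the BERT Pre Training Phase 1 rows.
bert_phase1 = {1: 1074.45, 8: 8593.03, 32: 33179.52}


def scaling_efficiency(throughput_by_hpu):
    """Hypothetical helper: measured throughput relative to linear scaling.

    Returns {n_hpu: measured / (n_hpu * single-HPU throughput)}.
    Assumes the dictionary contains an entry for 1 HPU as the baseline.
    """
    base = throughput_by_hpu[1]
    return {n: t / (n * base) for n, t in sorted(throughput_by_hpu.items())}


for n_hpu, eff in scaling_efficiency(bert_phase1).items():
    print(f"{n_hpu:>2} HPU: {eff:.1%} of linear scaling")
```

Under this definition, the 32-HPU BERT Phase 1 run reaches roughly 96.5% of linear scaling. The same calculation applies to any model that has a single-HPU row in the table.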