Intel® Deep Learning Boost (Intel® DL Boost)
The second generation of Intel® Xeon® Scalable processors introduced a collection of features for deep learning, packaged together as Intel® DL Boost. These features include Vector Neural Network Instructions (VNNI), which increase throughput for inference applications by supporting int8 convolutions, combining multiple machine instructions from previous generations into a single machine instruction.
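To make the instruction fusion concrete, here is an illustrative Python sketch (not the actual intrinsic) of the per-lane arithmetic that the VNNI instruction vpdpbusd performs: four unsigned-8-bit × signed-8-bit products summed into a signed 32-bit accumulator, which previously required a three-instruction sequence.

```python
def vpdpbusd_lane(acc, a_u8, b_s8):
    """Illustrative model of one 32-bit lane of VNNI's vpdpbusd:
    acc += sum of four u8 * s8 products, accumulated in int32.

    Before VNNI this took vpmaddubsw + vpmaddwd + vpaddd; the fused
    instruction also avoids the intermediate 16-bit saturation step.
    """
    assert len(a_u8) == len(b_s8) == 4
    return acc + sum(u * s for u, s in zip(a_u8, b_s8))

# One accumulation step of an int8 dot product:
acc = vpdpbusd_lane(0, [1, 2, 3, 4], [10, -20, 30, -40])
```

In real code this arithmetic is reached through compiler auto-vectorization, the `_mm512_dpbusd_epi32` intrinsic, or a framework backend such as oneDNN rather than written by hand.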
Frameworks and Tools
The following frameworks and tools include support for Intel DL Boost on second- and third-generation Intel Xeon Scalable processors.
TensorFlow*
Intel® Distribution of OpenVINO™ Toolkit
Model Quantization
Most deep learning models are built using 32-bit floating-point precision (FP32). Quantization is the process of representing the model with lower-precision data types, reducing its memory footprint with minimal accuracy loss. In this context, the main focus is representation in int8.
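As a minimal sketch of how FP32 values map to int8, here is the affine (scale + zero-point) scheme commonly used for post-training quantization; the function names are illustrative, not part of any specific framework's API.

```python
def quantize(values, qmin=-128, qmax=127):
    """Map a list of FP32 values to int8 using an affine mapping.

    scale and zero_point are chosen so that the observed range
    (extended to include 0.0) spans the int8 range [qmin, qmax].
    """
    lo = min(min(values), 0.0)   # the representable range must include zero
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate FP32 values from int8 codes."""
    return [(qi - zero_point) * scale for qi in q]
```

The round-trip error of each value is bounded by roughly half of `scale`, which is why calibrating `scale` on representative activation ranges is the key step in keeping accuracy loss minimal.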
Code Sample: New Deep Learning Instruction (bfloat16) Intrinsic Functions
Learn how to use the new Intel® Advanced Vector Extensions 512 with Intel® DL Boost in the third generation of Intel Xeon Scalable processors.
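For intuition on the bfloat16 format those intrinsics operate on: bfloat16 keeps the upper 16 bits of an FP32 value, so it preserves the full FP32 exponent range but only 8 bits of mantissa. The sketch below models the conversion by bit truncation (the hardware instructions round to nearest even, which this simplification omits).

```python
import struct

def fp32_to_bf16_bits(x):
    """Return the bfloat16 bit pattern of an FP32 value by keeping
    its top 16 bits (simplified: truncation instead of rounding)."""
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    return bits >> 16

def bf16_bits_to_fp32(b):
    """Widen a bfloat16 bit pattern back to FP32 by zero-filling
    the low 16 mantissa bits."""
    return struct.unpack('<f', struct.pack('<I', b << 16))[0]
```

Because the exponent field is unchanged, FP32 values convert to bfloat16 without overflow or underflow surprises; the cost is reduced precision, with a relative error of at most about 2⁻⁸ per value.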
Low-Precision int8 Inference Workflow
Get an explanation of the model quantization steps using the Intel® Distribution of OpenVINO™ toolkit.