Enhance Deep Learning Workloads on the Latest Intel® Xeon® Processors
Overview
The 4th generation Intel® Xeon® Scalable processors (formerly code named Sapphire Rapids) offer several built-in features for boosting performance and efficiency of deep learning applications.
This session focuses on one of them—Intel® Advanced Matrix Extensions (Intel® AMX)—and how to take advantage of its AI acceleration power to boost model training and inference using Intel optimizations for PyTorch* and TensorFlow*.
Topics covered include:
- An overview of the Intel optimizations, including performance and features on the latest Intel CPUs and how they compare with stock PyTorch and TensorFlow.
- How the optimizations reduce the memory footprint and improve performance by automatically mixing precision using the bfloat16 or float16 data types.
- Using the Intel® oneAPI Deep Neural Network Library (oneDNN) with Intel optimizations for PyTorch and TensorFlow to take advantage of other built-in acceleration features of 4th gen Intel Xeon processors, such as Intel® Advanced Vector Extensions 512 and Vector Neural Network Instructions (VNNI).
- Reducing model inference time with the quantization features in Intel® Optimization for PyTorch*.
- How speedups over stock PyTorch and TensorFlow can be achieved on new Amazon Web Services* instances built on Intel Xeon Scalable processors.
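As a minimal illustration of the bfloat16 mixed-precision idea above, the following stock-PyTorch sketch runs inference under CPU autocast; the Intel optimizations layer operator fusion and AMX-enabled oneDNN kernels on top of this same mechanism. The model and tensor shapes are illustrative, not from the session.

```python
import torch

# Toy model; layer sizes are illustrative.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
).eval()

x = torch.randn(8, 64)

# CPU autocast automatically runs eligible ops (e.g. Linear) in bfloat16,
# roughly halving activation memory; on 4th gen Xeon processors these
# bfloat16 matmuls can be dispatched to Intel AMX instructions via oneDNN.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)

print(y.dtype)  # torch.bfloat16
```

With Intel® Extension for PyTorch* installed, the same model would typically be passed through `ipex.optimize(model, dtype=torch.bfloat16)` before the autocast block to enable the additional fusions.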
Skill level: Novice
Featured Software
- The Intel optimizations are available as part of the AI Tools, or as stand-alone downloads: PyTorch Optimization | TensorFlow Optimization.
- Get oneDNN stand-alone or as part of the Intel® oneAPI Base Toolkit.
Code Samples
Download a variety of samples on GitHub*, including:
- Get Started with Intel® Extension for PyTorch*
- Optimize PyTorch Models Using Quantization
- PyTorch Training Optimizations with bfloat16 for Intel AMX
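The quantization approach covered in the samples can be sketched with stock PyTorch's dynamic quantization API; this is a hedged illustration under assumed shapes, not the exact code from the sample above.

```python
import torch

# Toy float32 model; layer sizes are illustrative.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
).eval()

# Dynamic quantization stores Linear weights as int8 and quantizes
# activations on the fly; on 4th gen Xeon processors the int8 matmuls
# can use VNNI/AMX instructions through oneDNN for faster inference.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

y = qmodel(torch.randn(8, 64))
print(y.shape)  # torch.Size([8, 10])
```

The dynamically quantized model still accepts and returns float32 tensors, so it can be dropped into an existing inference pipeline without changing the surrounding code.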