Intel® oneAPI Collective Communications Library
Scalable & Efficient Distributed Training for Deep Neural Networks
Implement Multi-Node Communication Patterns
The Intel® oneAPI Collective Communications Library (oneCCL) enables developers and researchers to train newer and deeper models faster by using optimized communication patterns to distribute model training across multiple nodes.
The library is designed for easy integration into deep learning frameworks, whether you are implementing them from scratch or customizing existing ones.
- Built on top of lower-level communication middleware, such as the Message Passing Interface (MPI) and libfabric, which transparently supports many interconnects, including Cornelis Networks*, InfiniBand*, and Ethernet.
- Optimized for high performance on Intel CPUs and GPUs.
- Allows trading compute for communication performance to improve the scalability of communication patterns.
- Enables efficient implementations of collectives that are heavily used for neural network training, including all-gather, all-reduce, and reduce-scatter.
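As a rough illustration of these collectives, the sketch below performs a float32 all-reduce with oneCCL's C++ API, following the pattern of the library's public samples. MPI is used here only to exchange the key-value store address during setup; the buffer size `count` is a placeholder and error handling is omitted.

```cpp
#include <mpi.h>
#include <vector>
#include "oneapi/ccl.hpp"

int main(int argc, char* argv[]) {
    ccl::init();

    // Use MPI only to bootstrap the oneCCL key-value store.
    MPI_Init(&argc, &argv);
    int size = 0, rank = 0;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    ccl::shared_ptr_class<ccl::kvs> kvs;
    ccl::kvs::address_type main_addr;
    if (rank == 0) {
        kvs = ccl::create_main_kvs();
        main_addr = kvs->get_address();
        MPI_Bcast((void*)main_addr.data(), main_addr.size(), MPI_BYTE, 0, MPI_COMM_WORLD);
    }
    else {
        MPI_Bcast((void*)main_addr.data(), main_addr.size(), MPI_BYTE, 0, MPI_COMM_WORLD);
        kvs = ccl::create_kvs(main_addr);
    }

    auto comm = ccl::create_communicator(size, rank, kvs);

    // Each rank contributes its rank id; after the all-reduce, every
    // element holds the sum of the rank ids across all ranks.
    const size_t count = 1024; // placeholder message size
    std::vector<float> send_buf(count, static_cast<float>(rank));
    std::vector<float> recv_buf(count, 0.0f);

    ccl::allreduce(send_buf.data(), recv_buf.data(), count,
                   ccl::reduction::sum, comm).wait();

    MPI_Finalize();
    return 0;
}
```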
Download the Stand-Alone Version
A stand-alone download of oneCCL is available. You can download binaries from Intel or choose your preferred repository.
Help oneCCL Evolve
oneCCL is part of the oneAPI industry standards initiative. We welcome you to participate.
Download as Part of the Toolkit
oneCCL is included as part of the Intel® oneAPI Base Toolkit, which is a core set of tools and libraries for developing high-performance, data-centric applications across diverse architectures.
Features
Common APIs to Support Deep Learning Frameworks
oneCCL exposes a collective API that supports:
- Commonly used collective operations found in deep learning and machine learning workloads
- Interoperability with SYCL* from the Khronos* Group
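A minimal sketch of the SYCL interoperability path is shown below: a SYCL queue is wrapped in a oneCCL stream so a collective can operate on device (USM) buffers. It assumes a communicator `comm` created as in the earlier bootstrap example; the function name and parameters are illustrative.

```cpp
#include <sycl/sycl.hpp>
#include "oneapi/ccl.hpp"

// Sketch: run an all-reduce on device memory through a SYCL queue.
// Assumes `comm` was created as in the bootstrap example and that
// `rank` and `count` are supplied by the caller.
void allreduce_on_device(ccl::communicator& comm, int rank, size_t count) {
    sycl::queue q{ sycl::gpu_selector_v };

    // Wrap the SYCL queue so oneCCL can order the collective with
    // other work submitted to the same queue.
    auto stream = ccl::create_stream(q);

    float* send_buf = sycl::malloc_device<float>(count, q);
    float* recv_buf = sycl::malloc_device<float>(count, q);
    q.fill(send_buf, static_cast<float>(rank), count).wait();

    ccl::allreduce(send_buf, recv_buf, count,
                   ccl::reduction::sum, comm, stream).wait();

    sycl::free(send_buf, q);
    sycl::free(recv_buf, q);
}
```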
Deep Learning Optimizations
The runtime implementation enables several optimizations, including:
- Asynchronous progress for compute-communication overlap
- Dedication of one or more cores to ensure optimal network use
- Message prioritization, persistence, and out-of-order execution
- Collectives in low-precision data types
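As a hedged sketch of the low-precision path, the snippet below all-reduces a buffer of bfloat16 values (stored as raw `uint16_t`) through the datatype-tagged overload, again assuming a communicator `comm` from the bootstrap example. The comment about worker-core dedication refers to oneCCL's documented `CCL_WORKER_COUNT` and `CCL_WORKER_AFFINITY` environment variables.

```cpp
#include <cstdint>
#include <vector>
#include "oneapi/ccl.hpp"

// Sketch: all-reduce a bfloat16 buffer (raw uint16_t storage) using the
// overload that takes an explicit ccl::datatype. Assumes `comm` was
// created as in the bootstrap example above.
//
// Dedicating progress cores is controlled outside the code, e.g. by
// setting CCL_WORKER_COUNT and CCL_WORKER_AFFINITY before launch.
void allreduce_bf16(ccl::communicator& comm, size_t count) {
    std::vector<uint16_t> send_buf(count, 0);
    std::vector<uint16_t> recv_buf(count, 0);

    ccl::allreduce(send_buf.data(), recv_buf.data(), count,
                   ccl::datatype::bfloat16, ccl::reduction::sum,
                   comm).wait();
}
```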
Documentation & Code Samples
Documentation
Code Samples
View All Code Samples (GitHub)
Training
Understanding oneCCL
oneAPI Collective Communications Library [5:07]
Distributed AI Acceleration
Accelerate Distributed AI with a oneCCL Framework [3:24]
Distributed Deep Learning Optimization
Optimize a Deep Learning Recommendation Model by Using PyTorch* with a oneCCL Back End
Efficient Model Training on Multiple CPUs
Specifications
Processors:
- Intel® Core™ processor family
- Intel® Xeon® processor family
- Intel® Xeon® Scalable processor family
GPU:
- Intel® Data Center GPU Max Series
Operating system:
- Linux*
Languages:
- SYCL
- C and C++
For more information, see the system requirements.
Compilers:
- GNU Compiler Collection (GCC)*
- Intel® oneAPI DPC++/C++ Compiler
Distributed environments:
- MPI
- OFI
Get Help
Your success is our success. Access these forum and GitHub resources when you need assistance.
Stay Up to Date on AI Workload Optimizations
Sign up to receive hand-curated technical articles, tutorials, developer tools, training opportunities, and more to help you accelerate and optimize your end-to-end AI and data science workflows. Take a chance and subscribe. You can change your mind at any time.