Intel® oneAPI Collective Communications Library (oneCCL) Developer Guide and Reference
Enabling OFI/verbs/dmabuf Support
oneCCL provides experimental support for data transfers between Intel GPU memory and the NIC using Linux dmabuf, which is exposed through the OFI API for the verbs provider.
Requirements
Linux kernel version >= 5.12
RDMA core version >= 34.0
level-zero-devel package
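The kernel requirement above can be checked from a shell before building anything. The following is a minimal sketch, not part of oneCCL: the helper name `version_ge` is hypothetical, and it assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does. It checks only the kernel version. How to query the installed RDMA core and level-zero-devel versions depends on the distribution's package manager, so that is left out.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 >= version $2.
# Assumes a sort(1) with -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip the distribution suffix, e.g. 5.15.0-91-generic -> 5.15.0
kernel_version="$(uname -r | cut -d- -f1)"

if version_ge "$kernel_version" "5.12"; then
    echo "kernel $kernel_version: meets the >= 5.12 requirement"
else
    echo "kernel $kernel_version: older than 5.12"
fi
```

The same helper could be reused for the RDMA core requirement once the installed version string has been obtained by whatever means the system provides.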
Usage
oneCCL, OFI, and OFI/verbs from the Intel® oneAPI Base Toolkit support device memory transfers. Refer to Run instructions for usage.
To build the software components from source instead, refer to Build instructions.
Build instructions
OFI
git clone --single-branch --branch v1.13.2 https://github.com/ofiwg/libfabric.git
cd libfabric
./autogen.sh
./configure --prefix=<ofi_install_dir> --enable-verbs=<rdma_core_install_dir> --with-ze=<level_zero_install_dir> --enable-ze-dlopen=yes
make -j install
oneCCL
cmake -DCMAKE_INSTALL_PREFIX=<ccl_install_dir> -DLIBFABRIC_DIR=<ofi_install_dir> -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DCOMPUTE_BACKEND=dpcpp -DENABLE_OFI_HMEM=1 ..
make -j install
Run instructions
Set the environment. See Get Started Guide.
Run allreduce test with ring algorithm and SYCL USM device buffers:
export CCL_ATL_TRANSPORT=ofi
export CCL_ATL_HMEM=1
export CCL_ALLREDUCE=ring
export FI_PROVIDER=verbs
mpiexec -n 2 <ccl_install_dir>/examples/sycl/sycl_allreduce_usm_test gpu device
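Because the run sets FI_PROVIDER=verbs, it can help to confirm beforehand that libfabric reports a verbs provider on the node. The sketch below is an assumption-laden convenience check, not an official oneCCL step: the function name `check_verbs_provider` is hypothetical, and it relies on the `fi_info` utility shipped with libfabric being in PATH. It confirms only that the provider is visible. It does not verify that dmabuf or device memory transfers work.

```shell
#!/bin/sh
# Hypothetical pre-flight check: does libfabric's fi_info see the verbs provider?
check_verbs_provider() {
    if ! command -v fi_info >/dev/null 2>&1; then
        echo "fi_info not found in PATH"
        return 2
    fi
    # -p restricts the query to the named provider
    if fi_info -p verbs >/dev/null 2>&1; then
        echo "verbs provider available"
        return 0
    fi
    echo "verbs provider not reported by fi_info"
    return 1
}

# Do not abort the calling shell if the check fails
check_verbs_provider || true
```

If the provider is not reported, check the libfabric build options and the RDMA core installation before debugging oneCCL itself.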