Visible to Intel only — GUID: GUID-DC762C0F-E753-4756-9624-511FE29ED661
Overview of Intel® oneAPI Math Kernel Library (oneMKL) Sparse BLAS for DPC++
The following pages describe the oneMKL Sparse BLAS computational routines for DPC++ in detail. These routines, along with other helper routines (see Sparse BLAS Routines for the full list), are declared in the header file oneapi/mkl/spblas.hpp.
Several conventions are used throughout this document:
All oneMKL DPC++ data types and non-domain-specific functions are inside the oneapi::mkl:: namespace.
All oneMKL DPC++ Sparse BLAS functions are inside the oneapi::mkl::sparse namespace.
For brevity, the sycl namespace is omitted from DPC++ object types such as buffers and queues. For example, a single-precision, 1D buffer A would be written buffer<float,1> &A instead of sycl::buffer<float,1> &A.
Computational routines are overloaded on precision. Unless otherwise specified, all oneMKL Sparse BLAS computational routines support the float, double, std::complex<float>, and std::complex<double> floating-point types; mixed-precision computations are not yet supported.
The oneMKL Sparse BLAS domain currently does not offer bitwise-reproducibility (BWR) guarantees for most of its APIs.
For sparse matrix row and column indices, oneMKL Sparse BLAS supports std::int32_t and std::int64_t integer types for all supported matrix formats. Matrix handle creation routines are overloaded on integer types.
Some APIs require user-provided temporary workspaces. In the case of the sycl::buffer APIs, the temporary workspaces are of type sycl::buffer<std::uint8_t, 1> *, whereas in the case of the USM APIs, they are of type void *.
For users of the USM APIs, all allocation types (device, shared, and host) are supported; however, performance may differ between them. For maximum performance of the Sparse BLAS APIs, we recommend using device memory allocations (sycl::malloc_device()) wherever possible, except where specified otherwise; note that the explicit data movement this entails is the user's responsibility.
Device Support
DPC++ supports several types of devices:
CPU device: Performs computations on a CPU using OpenCL™.
GPU device: Performs computations on a GPU using OpenCL™ or Level Zero.
Each routine details the device types that are currently supported.
In the current release of oneMKL DPC++ Sparse BLAS, all listed routines support use on CPU and GPU devices with the Compressed Sparse Row (CSR) matrix format unless otherwise noted. Limited support with the Coordinate (COO) matrix format is also available, specified in the documentation of each API.
| Routine | Description |
|---|---|
| **Level 2** | |
| gemv | General sparse matrix-dense vector product |
| gemvdot | General sparse matrix-dense vector product with fused dot product |
| symv | Symmetric sparse matrix-dense vector product |
| trmv | Triangular sparse matrix-dense vector product |
| trsv | Triangular solve of sparse matrix against a dense vector |
| **Level 3** | |
| gemm | General sparse matrix-dense matrix product with dense matrix output |
| trsm | Triangular solve of sparse matrix against a dense matrix |
| omatadd | General sparse matrix-sparse matrix addition with sparse matrix output |
| matmat | General sparse matrix-sparse matrix product with sparse matrix output |
| matmatd | General sparse matrix-sparse matrix product with dense matrix output |