gemm
Computes a matrix-matrix product with general matrices.
Description
The gemm routines compute a scalar-matrix-matrix product and add the result to a scalar-matrix product, with general matrices. The operation is defined as:
C ← alpha * op(A) * op(B) + beta * C
where:
op(X) is one of op(X) = X, op(X) = X^T, or op(X) = X^H
alpha and beta are scalars
A, B and C are matrices
op(A) is an m x k matrix
op(B) is a k x n matrix
C is an m x n matrix
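To make the definition above concrete, here is a minimal, unoptimized reference sketch in plain C++ for real (double) data in column-major storage. It is not the oneMKL implementation: the names Op, op_at, and reference_gemm are hypothetical, and the conjugate transpose is left out because it differs from the plain transpose only for complex types.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper names for illustration; not part of oneMKL.
enum class Op { NonTrans, Trans };

// Element (i, j) of op(X), where x is a column-major array with leading
// dimension ldx. For Op::Trans, op(X)(i, j) is X(j, i).
inline double op_at(const std::vector<double>& x, std::int64_t ldx, Op op,
                    std::int64_t i, std::int64_t j) {
    return op == Op::NonTrans ? x[i + j * ldx] : x[j + i * ldx];
}

// Reference computation of C <- alpha * op(A) * op(B) + beta * C,
// with op(A) m x k, op(B) k x n, and C m x n, all column major.
void reference_gemm(Op transa, Op transb,
                    std::int64_t m, std::int64_t n, std::int64_t k,
                    double alpha, const std::vector<double>& a, std::int64_t lda,
                    const std::vector<double>& b, std::int64_t ldb,
                    double beta, std::vector<double>& c, std::int64_t ldc) {
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(lda > 0 && ldb > 0 && ldc > 0 && ldc >= m);
    for (std::int64_t j = 0; j < n; ++j) {
        for (std::int64_t i = 0; i < m; ++i) {
            double acc = 0.0;  // (op(A) * op(B))(i, j)
            for (std::int64_t p = 0; p < k; ++p) {
                acc += op_at(a, lda, transa, i, p) * op_at(b, ldb, transb, p, j);
            }
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}
```

For example, with A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]] stored column by column, alpha = 2, beta = 3, and C initially all ones, the sketch leaves C = [[41, 47], [89, 103]].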
gemm supports the following precisions:
Ta | Tb | Tc | Ts |
---|---|---|---|
sycl::half | sycl::half | sycl::half | sycl::half |
sycl::half | sycl::half | float | float |
oneapi::mkl::bfloat16 | oneapi::mkl::bfloat16 | oneapi::mkl::bfloat16 | float |
oneapi::mkl::bfloat16 | oneapi::mkl::bfloat16 | float | float |
std::int8_t | std::int8_t | std::int32_t | float |
std::int8_t | std::int8_t | float | float |
float | float | float | float |
double | double | double | double |
std::complex<float> | std::complex<float> | std::complex<float> | std::complex<float> |
std::complex<double> | std::complex<double> | std::complex<double> | std::complex<double> |
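The type combinations in the table can be encoded as a compile-time check. The sketch below covers only the rows that use standard C++ types; the sycl::half and oneapi::mkl::bfloat16 rows are omitted so that the snippet compiles without SYCL or oneMKL headers. The trait name is_listed_gemm_combo is hypothetical and not part of oneMKL.

```cpp
#include <complex>
#include <cstdint>
#include <type_traits>

// Hypothetical trait (not a oneMKL API): true when <Ta, Tb, Tc, Ts> matches a
// row of the precision table that uses only standard C++ types.
template <class Ta, class Tb, class Tc, class Ts>
struct is_listed_gemm_combo : std::false_type {};

// Integer inputs with a wider or floating-point result; scalars are float.
template <>
struct is_listed_gemm_combo<std::int8_t, std::int8_t, std::int32_t, float>
    : std::true_type {};
template <>
struct is_listed_gemm_combo<std::int8_t, std::int8_t, float, float>
    : std::true_type {};

// Uniform-precision rows: all four types are the same.
template <>
struct is_listed_gemm_combo<float, float, float, float> : std::true_type {};
template <>
struct is_listed_gemm_combo<double, double, double, double> : std::true_type {};
template <>
struct is_listed_gemm_combo<std::complex<float>, std::complex<float>,
                            std::complex<float>, std::complex<float>>
    : std::true_type {};
template <>
struct is_listed_gemm_combo<std::complex<double>, std::complex<double>,
                            std::complex<double>, std::complex<double>>
    : std::true_type {};
```

A wrapper around gemm could use such a trait in a static_assert to reject, for example, float inputs combined with a double output, which the table does not list.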
gemm (Buffer Version)
Syntax
namespace oneapi::mkl::blas::column_major {
    void gemm(sycl::queue &queue,
              oneapi::mkl::transpose transa,
              oneapi::mkl::transpose transb,
              std::int64_t m,
              std::int64_t n,
              std::int64_t k,
              Ts alpha,
              sycl::buffer<Ta,1> &a,
              std::int64_t lda,
              sycl::buffer<Tb,1> &b,
              std::int64_t ldb,
              Ts beta,
              sycl::buffer<Tc,1> &c,
              std::int64_t ldc,
              compute_mode mode = compute_mode::unset)
}
namespace oneapi::mkl::blas::row_major {
    void gemm(sycl::queue &queue,
              oneapi::mkl::transpose transa,
              oneapi::mkl::transpose transb,
              std::int64_t m,
              std::int64_t n,
              std::int64_t k,
              Ts alpha,
              sycl::buffer<Ta,1> &a,
              std::int64_t lda,
              sycl::buffer<Tb,1> &b,
              std::int64_t ldb,
              Ts beta,
              sycl::buffer<Tc,1> &c,
              std::int64_t ldc,
              compute_mode mode = compute_mode::unset)
}
Input Parameters
- queue
-
The queue where the routine should be executed.
- transa
-
Specifies op(A), the transposition operation applied to matrix A. See Data Types for more details.
- transb
-
Specifies op(B), the transposition operation applied to matrix B. See Data Types for more details.
- m
-
Number of rows of matrix op(A) and matrix C. Must be at least zero.
- n
-
Number of columns of matrix op(B) and matrix C. Must be at least zero.
- k
-
Number of columns of matrix op(A) and rows of matrix op(B). Must be at least zero.
- alpha
-
Scaling factor for matrix-matrix product.
- a
-
Buffer holding input matrix A. See Matrix Storage for more details.
Layout | transa = transpose::nontrans | transa = transpose::trans or transa = transpose::conjtrans |
---|---|---|
Column major | A is an m x k matrix. Size of array a must be at least lda * k. | A is a k x m matrix. Size of array a must be at least lda * m. |
Row major | A is an m x k matrix. Size of array a must be at least lda * m. | A is a k x m matrix. Size of array a must be at least lda * k. |
- lda
-
Leading dimension of matrix A. Must be positive.
Layout | transa = transpose::nontrans | transa = transpose::trans or transa = transpose::conjtrans |
---|---|---|
Column major | Must be at least m | Must be at least k |
Row major | Must be at least k | Must be at least m |
- b
-
Buffer holding input matrix B. See Matrix Storage for more details.
Layout | transb = transpose::nontrans | transb = transpose::trans or transb = transpose::conjtrans |
---|---|---|
Column major | B is a k x n matrix. Size of array b must be at least ldb * n. | B is an n x k matrix. Size of array b must be at least ldb * k. |
Row major | B is a k x n matrix. Size of array b must be at least ldb * k. | B is an n x k matrix. Size of array b must be at least ldb * n. |
- ldb
-
Leading dimension of matrix B. Must be positive.
Layout | transb = transpose::nontrans | transb = transpose::trans or transb = transpose::conjtrans |
---|---|---|
Column major | Must be at least k | Must be at least n |
Row major | Must be at least n | Must be at least k |
- beta
-
Scaling factor for matrix C.
- c
-
Buffer holding input/output matrix C. See Matrix Storage for more details.
Layout | Requirement |
---|---|
Column major | C is an m x n matrix. Size of array c must be at least ldc * n. |
Row major | C is an m x n matrix. Size of array c must be at least ldc * m. |
- ldc
-
Leading dimension of matrix C. Must be positive.
Layout | Requirement |
---|---|
Column major | Must be at least m |
Row major | Must be at least n |
- mode
-
Optional. Compute mode settings. See Compute Modes for more details.
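The size and leading-dimension rules listed for a, b, and c all follow one pattern: the leading dimension must cover the contiguous dimension of the matrix as stored, and the array must hold the leading dimension times the other stored dimension. The sketch below expresses that pattern as a helper that could be used to validate arguments before calling gemm. The names Layout, Requirement, and operand_requirement are hypothetical and not part of oneMKL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not oneMKL types.
enum class Layout { ColMajor, RowMajor };

struct Requirement {
    std::int64_t min_ld;    // smallest valid leading dimension
    std::int64_t min_size;  // smallest valid array size for the given ld
};

// op(X) is op_rows x op_cols. A transposed operand is stored with the two
// dimensions swapped. Pass (m, k) for A, (k, n) for B, and (m, n) with
// transposed = false for C.
Requirement operand_requirement(Layout layout, bool transposed,
                                std::int64_t op_rows, std::int64_t op_cols,
                                std::int64_t ld) {
    assert(op_rows >= 0 && op_cols >= 0);
    const std::int64_t rows = transposed ? op_cols : op_rows;  // stored rows
    const std::int64_t cols = transposed ? op_rows : op_cols;  // stored columns
    const bool col_major = (layout == Layout::ColMajor);
    // The leading dimension must be positive even when a dimension is zero.
    const std::int64_t min_ld =
        std::max<std::int64_t>(1, col_major ? rows : cols);
    const std::int64_t min_size = ld * (col_major ? cols : rows);
    return Requirement{min_ld, min_size};
}
```

For m = 3, n = 4, k = 5 in column-major layout, a non-transposed A with lda = 3 needs at least 15 elements, while a transposed A stored with lda = 8 needs lda to be at least 5 and at least 24 elements.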
Output Parameters
- c
-
Output buffer overwritten by alpha * op(A) * op(B) + beta * C.
Examples
An example of how to use the buffer version of gemm can be found in the oneMKL installation directory, under:
share/doc/mkl/examples/sycl/blas/source/gemm.cpp
gemm (USM Version)
Syntax
namespace oneapi::mkl::blas::column_major {
    sycl::event gemm(sycl::queue &queue,
                     oneapi::mkl::transpose transa,
                     oneapi::mkl::transpose transb,
                     std::int64_t m,
                     std::int64_t n,
                     std::int64_t k,
                     oneapi::mkl::value_or_pointer<Ts> alpha,
                     const Ta *a,
                     std::int64_t lda,
                     const Tb *b,
                     std::int64_t ldb,
                     oneapi::mkl::value_or_pointer<Ts> beta,
                     Tc *c,
                     std::int64_t ldc,
                     compute_mode mode = compute_mode::unset,
                     const std::vector<sycl::event> &dependencies = {})
}
namespace oneapi::mkl::blas::row_major {
    sycl::event gemm(sycl::queue &queue,
                     oneapi::mkl::transpose transa,
                     oneapi::mkl::transpose transb,
                     std::int64_t m,
                     std::int64_t n,
                     std::int64_t k,
                     oneapi::mkl::value_or_pointer<Ts> alpha,
                     const Ta *a,
                     std::int64_t lda,
                     const Tb *b,
                     std::int64_t ldb,
                     oneapi::mkl::value_or_pointer<Ts> beta,
                     Tc *c,
                     std::int64_t ldc,
                     compute_mode mode = compute_mode::unset,
                     const std::vector<sycl::event> &dependencies = {})
}
Input Parameters
- queue
-
The queue where the routine should be executed.
- transa
-
Specifies op(A), the transposition operation applied to matrix A. See Data Types for more details.
- transb
-
Specifies op(B), the transposition operation applied to matrix B. See Data Types for more details.
- m
-
Number of rows of matrix op(A) and matrix C. Must be at least zero.
- n
-
Number of columns of matrix op(B) and matrix C. Must be at least zero.
- k
-
Number of columns of matrix op(A) and rows of matrix op(B). Must be at least zero.
- alpha
-
Scaling factor for matrix-matrix product. See Scalar Arguments for more information on the value_or_pointer data type.
- a
-
Pointer to input matrix A. See Matrix Storage for more details.
Layout | A not transposed | A transposed |
---|---|---|
Column major | A is an m x k matrix. Size of array a must be at least lda * k. | A is a k x m matrix. Size of array a must be at least lda * m. |
Row major | A is an m x k matrix. Size of array a must be at least lda * m. | A is a k x m matrix. Size of array a must be at least lda * k. |
- lda
-
Leading dimension of matrix A. Must be positive.
Layout | A not transposed | A transposed |
---|---|---|
Column major | Must be at least m | Must be at least k |
Row major | Must be at least k | Must be at least m |
- b
-
Pointer to input matrix B. See Matrix Storage for more details.
Layout | B not transposed | B transposed |
---|---|---|
Column major | B is a k x n matrix. Size of array b must be at least ldb * n. | B is an n x k matrix. Size of array b must be at least ldb * k. |
Row major | B is a k x n matrix. Size of array b must be at least ldb * k. | B is an n x k matrix. Size of array b must be at least ldb * n. |
- ldb
-
Leading dimension of matrix B. Must be positive.
Layout | B not transposed | B transposed |
---|---|---|
Column major | Must be at least k | Must be at least n |
Row major | Must be at least n | Must be at least k |
- beta
-
Scaling factor for matrix C. See Scalar Arguments for more information on the value_or_pointer data type.
- c
-
Pointer to input/output matrix C. See Matrix Storage for more details.
Layout | Requirement |
---|---|
Column major | C is an m x n matrix. Size of array c must be at least ldc * n. |
Row major | C is an m x n matrix. Size of array c must be at least ldc * m. |
- ldc
-
Leading dimension of matrix C. Must be positive.
Layout | Requirement |
---|---|
Column major | Must be at least m |
Row major | Must be at least n |
- mode
-
Optional. Compute mode settings. See Compute Modes for more details.
- dependencies
-
Optional. List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
mode and dependencies may be omitted independently; it is not necessary to specify mode in order to provide dependencies.
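The row-major requirements above are easier to see with raw pointers and a padded leading dimension. The following plain C++ sketch computes the non-transposed row-major case on host memory; it is a reference illustration under those assumptions, not the oneMKL implementation, and the function name reference_gemm_row_major is hypothetical. It involves no SYCL queue, events, or device memory.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical reference routine: row-major C <- alpha * A * B + beta * C
// with neither A nor B transposed. A is m x k, B is k x n, C is m x n.
void reference_gemm_row_major(std::int64_t m, std::int64_t n, std::int64_t k,
                              float alpha, const float* a, std::int64_t lda,
                              const float* b, std::int64_t ldb,
                              float beta, float* c, std::int64_t ldc) {
    assert(m >= 0 && n >= 0 && k >= 0);
    // Row-major, non-transposed minimums: lda >= k, ldb >= n, ldc >= n.
    assert(lda > 0 && lda >= k);
    assert(ldb > 0 && ldb >= n);
    assert(ldc > 0 && ldc >= n);
    for (std::int64_t i = 0; i < m; ++i) {
        for (std::int64_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::int64_t p = 0; p < k; ++p) {
                // Element (i, p) of A is a[i * lda + p] in row-major storage.
                acc += a[i * lda + p] * b[p * ldb + j];
            }
            // Only the first n entries of each row of c are touched; any
            // padding between n and ldc is left unchanged.
            c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
        }
    }
}
```

With a 2 x 3 matrix A stored with lda = 4, each row occupies four array elements and the fourth is padding that the routine never reads, which is why the row-major size requirement for a is lda * m rather than m * k.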
Output Parameters
- c
-
Pointer to output matrix overwritten by alpha * op(A) * op(B) + beta * C.
Return Values
Output event to wait on to ensure computation is complete.
Examples
An example of how to use the USM version of gemm can be found in the oneMKL installation directory, under:
share/doc/mkl/examples/sycl/blas/source/gemm_usm.cpp