Developer Reference for Intel® oneAPI Math Kernel Library for Fortran

ID 766686
Date 10/31/2024
Public

mkl_?omatadd_batch_strided

Computes a group of out-of-place scaled matrix additions using general matrices.

call mkl_somatadd_batch_strided(layout, transa, transb, rows, cols, alpha, a, lda, stride_a, beta, b, ldb, stride_b, c, ldc, stride_c, batch_size)

call mkl_domatadd_batch_strided(layout, transa, transb, rows, cols, alpha, a, lda, stride_a, beta, b, ldb, stride_b, c, ldc, stride_c, batch_size)

call mkl_comatadd_batch_strided(layout, transa, transb, rows, cols, alpha, a, lda, stride_a, beta, b, ldb, stride_b, c, ldc, stride_c, batch_size)

call mkl_zomatadd_batch_strided(layout, transa, transb, rows, cols, alpha, a, lda, stride_a, beta, b, ldb, stride_b, c, ldc, stride_c, batch_size)

Description

The mkl_?omatadd_batch_strided routines perform a series of scaled matrix additions. They are similar to the mkl_?omatadd routines, but operate on a group of matrices rather than on a single pair.

The matrices A, B, and C are stored at a constant stride from each other in memory, given by the parameters stride_a, stride_b, and stride_c. The operation is defined as:

for i = 0 … batch_size – 1
    A is a matrix at offset i * stride_a in the array a
    B is a matrix at offset i * stride_b in the array b
    C is a matrix at offset i * stride_c in the array c
    C = alpha * op(A) + beta * op(B)
end for

where:

  • op(X) is one of op(X) = X, op(X) = X', op(X) = conjg(X) or op(X) = conjg(X').
  • alpha and beta are scalars.
  • A, B, and C are matrices.
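
The loop above can be spelled out in plain Fortran. The following is a minimal reference sketch, not the library implementation: the module name omatadd_ref and the routines ref_domatadd_batch and op_elem are hypothetical, and the sketch assumes column-major storage, double precision real data, and op restricted to 'N' or 'T'. Unlike the library routine, it always reads a and b, even when alpha or beta is zero.

```fortran
module omatadd_ref
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Hypothetical reference routine (not part of oneMKL).
  ! Assumes column-major storage and op in {'N', 'T'}.
  subroutine ref_domatadd_batch(transa, transb, rows, cols, alpha, a, lda, &
                                stride_a, beta, b, ldb, stride_b, c, ldc,  &
                                stride_c, batch_size)
    character, intent(in) :: transa, transb
    integer, intent(in) :: rows, cols, lda, stride_a, ldb, stride_b
    integer, intent(in) :: ldc, stride_c, batch_size
    real(dp), intent(in) :: alpha, beta
    real(dp), intent(in) :: a(*), b(*)
    real(dp), intent(inout) :: c(*)
    integer :: i, r, k

    do i = 0, batch_size - 1          ! matrix index within the batch
      do k = 1, cols                  ! column of C
        do r = 1, rows                ! row of C
          c(i*stride_c + (k - 1)*ldc + r) =                          &
              alpha * op_elem(a, transa, i*stride_a, lda, r, k) +    &
              beta  * op_elem(b, transb, i*stride_b, ldb, r, k)
        end do
      end do
    end do
  end subroutine ref_domatadd_batch

  ! Element (r, k) of op(X) for the matrix that starts at 0-based offset off.
  pure function op_elem(x, trans, off, ld, r, k) result(v)
    real(dp), intent(in) :: x(*)
    character, intent(in) :: trans
    integer, intent(in) :: off, ld, r, k
    real(dp) :: v
    if (trans == 'T' .or. trans == 't') then
      v = x(off + (r - 1)*ld + k)     ! op(X)(r, k) = X(k, r)
    else
      v = x(off + (k - 1)*ld + r)     ! op(X)(r, k) = X(r, k)
    end if
  end function op_elem

end module omatadd_ref
```

A routine like this can serve as a slow cross-check for results returned by the library on small inputs.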

The input arrays a and b contain all the input matrices, and the single output array c contains all the output matrices. The locations of the individual matrices within the array are given by stride lengths, while the number of matrices is given by the batch_size parameter.
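
To make the stride addressing concrete, the sketch below computes where a given matrix and a given element sit inside one of the batch arrays. The module batch_stride_demo and its functions are hypothetical helpers introduced only for illustration, and they assume column-major storage and Fortran's default 1-based array indexing.

```fortran
module batch_stride_demo
  implicit none
contains

  ! 1-based index of the first element of matrix i (i = 0 .. batch_size-1).
  pure integer function matrix_start(i, stride)
    integer, intent(in) :: i, stride
    matrix_start = 1 + i*stride
  end function matrix_start

  ! 1-based flat index of element (r, k) of matrix i, assuming column-major
  ! storage with leading dimension ld.
  pure integer function elem_index(i, r, k, ld, stride)
    integer, intent(in) :: i, r, k, ld, stride
    elem_index = matrix_start(i, stride) + (k - 1)*ld + (r - 1)
  end function elem_index

end module batch_stride_demo
```

Under these assumptions, a rank-3 array declared as a3(ld, ncols, batch_size) stores element a3(r, k, i+1) at flat position elem_index(i, r, k, ld, ld*ncols), so such an array corresponds to a batch with stride ld*ncols.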

In general, the a, b, and c arrays must not overlap in memory, with the exception of the following in-place operations:

  • a and c can point to the same memory if transa is non-transpose and all the A matrices within a have the same parameters as all the respective C matrices within c.

  • b and c can point to the same memory if transb is non-transpose and all the B matrices within b have the same parameters as all the respective C matrices within c.
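
The two in-place conditions can be checked before a call. The helper below is a hypothetical sketch, not a library function. It assumes that "the same parameters" means equal leading dimension and equal stride, and it conservatively treats only 'N' or 'n' as non-transpose; both readings are assumptions rather than statements from this reference.

```fortran
module inplace_check
  implicit none
contains

  ! Hypothetical helper: .true. when an input array may share memory with c.
  ! Assumption: "same parameters" = same leading dimension and same stride.
  ! Assumption: only 'N'/'n' counts as non-transpose.
  pure logical function may_alias_c(trans, ld, stride, ldc, stride_c)
    character, intent(in) :: trans
    integer, intent(in) :: ld, stride, ldc, stride_c
    may_alias_c = (trans == 'N' .or. trans == 'n') &
                  .and. ld == ldc .and. stride == stride_c
  end function may_alias_c

end module inplace_check
```

For example, may_alias_c(transa, lda, stride_a, ldc, stride_c) would be evaluated before passing the same array as both a and c.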

Input Parameters

layout
CHARACTER* Specifies whether two-dimensional array storage is row-major or column-major.
transa
CHARACTER* Specifies op(A), the transposition operation applied to the matrices A. 'N' or 'n' indicates no operation, 'T' or 't' is transposition, 'R' or 'r' is complex conjugation without transposition, and 'C' or 'c' is conjugate transposition.
transb
CHARACTER* Specifies op(B), the transposition operation applied to the matrices B. Accepts the same values as transa.
rows
INTEGER Number of rows for the result matrix C. Must be at least zero.
cols
INTEGER Number of columns for the result matrix C. Must be at least zero.
alpha
REAL for mkl_somatadd_batch_strided, DOUBLE PRECISION for mkl_domatadd_batch_strided, COMPLEX for mkl_comatadd_batch_strided, DOUBLE COMPLEX for mkl_zomatadd_batch_strided. Scaling factor for the matrices A.
a
REAL for mkl_somatadd_batch_strided, DOUBLE PRECISION for mkl_domatadd_batch_strided, COMPLEX for mkl_comatadd_batch_strided, DOUBLE COMPLEX for mkl_zomatadd_batch_strided. Array holding the input matrices A. If alpha is zero, it is never accessed. Otherwise it must have size at least stride_a*batch_size.
lda
INTEGER Leading dimension of the A matrices. If matrices are stored using column major layout, lda must be at least rows if A is not transposed or cols if A is transposed. If matrices are stored using row major layout, lda must be at least cols if A is not transposed or at least rows if A is transposed. Must be positive.
stride_a
INTEGER Stride between the different A matrices. If matrices are stored using column major layout, stride_a must be at least lda*cols if A is not transposed or at least lda*rows if A is transposed. If matrices are stored using row major layout, stride_a must be at least lda*rows if A is not transposed or at least lda*cols if A is transposed.
beta
REAL for mkl_somatadd_batch_strided, DOUBLE PRECISION for mkl_domatadd_batch_strided, COMPLEX for mkl_comatadd_batch_strided, DOUBLE COMPLEX for mkl_zomatadd_batch_strided. Scaling factor for the matrices B.
b
REAL for mkl_somatadd_batch_strided, DOUBLE PRECISION for mkl_domatadd_batch_strided, COMPLEX for mkl_comatadd_batch_strided, DOUBLE COMPLEX for mkl_zomatadd_batch_strided. Array holding the input matrices B. If beta is zero, it is never accessed. Otherwise it must have size at least stride_b*batch_size.
ldb
INTEGER Leading dimension of the B matrices. If matrices are stored using column major layout, ldb must be at least rows if B is not transposed or cols if B is transposed. If matrices are stored using row major layout, ldb must be at least cols if B is not transposed or at least rows if B is transposed. Must be positive.
stride_b
INTEGER Stride between the different B matrices. If matrices are stored using column major layout, stride_b must be at least ldb*cols if B is not transposed or at least ldb*rows if B is transposed. If matrices are stored using row major layout, stride_b must be at least ldb*rows if B is not transposed or at least ldb*cols if B is transposed.
c
REAL for mkl_somatadd_batch_strided, DOUBLE PRECISION for mkl_domatadd_batch_strided, COMPLEX for mkl_comatadd_batch_strided, DOUBLE COMPLEX for mkl_zomatadd_batch_strided. Output array, overwritten by batch_size matrix addition operations of the form alpha*op(A) + beta*op(B). Must have size at least stride_c*batch_size.
ldc
INTEGER Leading dimension of the C matrices. If matrices are stored using column major layout, ldc must be at least rows. If matrices are stored using row major layout, ldc must be at least cols. Must be positive.
stride_c
INTEGER Stride between the different C matrices. If matrices are stored using column major layout, stride_c must be at least ldc*cols. If matrices are stored using row major layout, stride_c must be at least ldc*rows.
batch_size
INTEGER Specifies the number of input and output matrices to add.
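
The leading-dimension, stride, and array-size bounds in this list all follow from the storage footprint of one matrix. The sketch below encodes them as small functions. The module omatadd_bounds and its function names are hypothetical, and the layout is passed as a logical flag (col_major) so that the sketch does not depend on the character values accepted by layout. The input rules mirror those given for ldb and stride_b; the same footprint argument applies to A.

```fortran
module omatadd_bounds
  implicit none
contains

  ! Smallest valid leading dimension for an input matrix (never below 1).
  pure integer function min_input_ld(col_major, transposed, rows, cols)
    logical, intent(in) :: col_major, transposed
    integer, intent(in) :: rows, cols
    if (col_major .neqv. transposed) then
      min_input_ld = max(1, rows)   ! column-major 'N' or row-major 'T'
    else
      min_input_ld = max(1, cols)   ! column-major 'T' or row-major 'N'
    end if
  end function min_input_ld

  ! Smallest valid stride for an input matrix with leading dimension ld.
  pure integer function min_input_stride(col_major, transposed, rows, cols, ld)
    logical, intent(in) :: col_major, transposed
    integer, intent(in) :: rows, cols, ld
    if (col_major .eqv. transposed) then
      min_input_stride = ld*rows    ! column-major 'T' or row-major 'N'
    else
      min_input_stride = ld*cols    ! column-major 'N' or row-major 'T'
    end if
  end function min_input_stride

  ! Minimum number of elements the whole batch array must hold.
  pure integer function min_array_size(stride, batch_size)
    integer, intent(in) :: stride, batch_size
    min_array_size = stride*batch_size
  end function min_array_size

end module omatadd_bounds
```

The output matrices C are never transposed, so their bounds are obtained by calling the same functions with transposed set to .false..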

Output Parameters

c
Array holding the updated matrices C.