syrk_batch

oneMKL - Data Parallel C++ Developer Reference

Download PDF

ID 772045

Date 11/07/2023

Version

Public

A newer version of this document is available. Customers should click here to go to the newest version.

Visible to Intel only — GUID: GUID-675B52D4-CD50-4700-8131-7D56B4CA9E8A

View Details

syrk_batch

Computes a group of syrk operations.

Description

The syrk_batch routines are batched versions of syrk, performing multiple syrk operations in a single call. Each syrk operation performs a rank-k update with general matrices.

syrk_batch supports the following precisions:

T
`float`
`double`
`std::complex<float>`
`std::complex<double>`

syrk_batch (Buffer Version)

Buffer version of syrk_batch supports only strided API.

Strided API

Strided API operation is defined as:

for i = 0 … batch_size – 1
    A and C are matrices at offset i * stridea and i * stridec in a and c.
    C = alpha * op(A) * op(A)^T + beta * C
end for

where:

op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H
alpha and beta are scalars
A is general matrix and C is symmetric matrix
op(A) is n x k and C is n x n

For strided API, a and c buffers contain all the input matrices. The stride between matrices is given by the stride parameters. Total number of matrices in a and c buffers is given by batch_size parameter.

Syntax

namespace oneapi::mkl::blas::column_major {
   void syrk_batch(sycl::queue &queue,
                   oneapi::mkl::uplo upper_lower,
                   oneapi::mkl::transpose trans,
                   std::int64_t n,
                   std::int64_t k,
                   T alpha,
                   sycl::buffer<T,1> &a,
                   std::int64_t lda,
                   std::int64_t stridea,
                   T beta,
                   sycl::buffer<T,1> &c,
                   std::int64_t ldc,
                   std::int64_t stridec,
                   std::int64_t batch_size,
                   compute_mode mode = compute_mode::unset)
}

namespace oneapi::mkl::blas::row_major {
   void syrk_batch(sycl::queue &queue,
                   oneapi::mkl::uplo upper_lower,
                   oneapi::mkl::transpose trans,
                   std::int64_t n,
                   std::int64_t k,
                   T alpha,
                   sycl::buffer<T,1> &a,
                   std::int64_t lda,
                   std::int64_t stridea,
                   T beta,
                   sycl::buffer<T,1> &c,
                   std::int64_t ldc,
                   std::int64_t stridec,
                   std::int64_t batch_size,
                   compute_mode mode = compute_mode::unset)
}

Input Parameters

queue

The queue where the routine should be executed.

upper_lower

Specifies whether matrices C are upper or lower triangular. See Data Types for more details.

trans

Specifies op(A), transposition operation applied to matrices A. Conjugation is never performed even if trans = transpose::conjtrans. See Data Types for more details.

n

Number of rows and columns of matrices C. Must be at least zero.

k

Number of columns of matrices op(A). Must be at least zero.

alpha

Scaling factor for rank-k update.

a

Buffer holding input matrices A. Size of the buffer must be at least stridea * batch_size.

lda

Leading dimension of matrices A. Must be positive.

	`transa` = `transpose::nontrans`	`transa` = `transpose::trans` or `trans` = `transpose::conjtrans`
Column major	Must be at least `n`	Must be at least `k`
Row major	Must be at least `k`	Must be at least `n`

stridea

Stride between two consecutive A matrices.

	`transa` = `transpose::nontrans`	`transa` = `transpose::trans` or `trans` = `transpose::conjtrans`
Column major	Must be at least `lda` * `k`	Must be at least `lda` * `n`
Row major	Must be at least `lda` * `n`	Must be at least `lda` * `k`

beta

Scaling factor for matrices C.

c

Buffer holding input/output matrices C. Size of the buffer must be at least stridec * batch_size.

ldc

Leading dimension of matrices C. Must be positive and at least n.

stridec

Stride between two consecutive C matrices. Must be least ldc * n.

batch_size

Specifies the number of matrix multiply operations to perform.

mode

Optional. Compute mode settings. See Compute Modes for more details.

Output Parameters

c: Output buffer overwritten by batch_sizesyrk operations of the form alpha * op(A) * op(A)^T + beta * C.

syrk_batch (USM Version)

USM version of syrk_batch supports group API and strided API.

Group API

Group API operation is defined as:

idx = 0
for i = 0 … group_count – 1
    for j = 0 … group_size – 1
        A, and C are matrices in a[idx] and c[idx]
        C = alpha[i] * op(A) * op(A)^T + beta[i] * C
        idx := idx + 1
    end for
end for

where:

op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H
alpha and beta are scalars
A is general matrix and C is symmetric matrix
op(A) is n x k and C is n x n

For group API, a and c arrays contain the pointers for all the input matrices. The total number of matrices in a and c are given by:

Syntax

namespace oneapi::mkl::blas::column_major {
    sycl::event syrk_batch(sycl::queue &queue,
                           const oneapi::mkl::uplo *upper_lower,
                           const oneapi::mkl::transpose *trans,
                           const std::int64_t *n,
                           const std::int64_t *k,
                           const T *alpha,
                           const T **a,
                           const std::int64_t *lda,
                           const T *beta,
                           T **c,
                           const std::int64_t *ldc,
                           std::int64_t group_count,
                           const std::int64_t *group_size,
                           compute_mode mode = compute_mode::unset,
                           const std::vector<sycl::event> &dependencies = {})
}

namespace oneapi::mkl::blas::row_major {
    sycl::event syrk_batch(sycl::queue &queue,
                           const oneapi::mkl::uplo *upper_lower,
                           const oneapi::mkl::transpose *trans,
                           const std::int64_t *n,
                           const std::int64_t *k,
                           const T *alpha,
                           const T **a,
                           const std::int64_t *lda,
                           const T *beta,
                           T **c,
                           const std::int64_t *ldc,
                           std::int64_t group_count,
                           const std::int64_t *group_size,
                           compute_mode mode = compute_mode::unset,
                           const std::vector<sycl::event> &dependencies = {})
}

Input Parameters

queue

The queue where the routine should be executed.

upper_lower

Array of group_countoneapi::mkl::uplo values. upper_lower[i] specifies whether matrices C are upper or lower triangular in group i. See Data Types for more details.

trans

Array of group_countoneapi::mkl::transpose values. trans[i] specifies op(A), transposition operation applied to matrices A in group i. See Data Types for more details.

n

Array of group_count integers. n[i] specifies number of rows and columns of matrices C in group i. All entries must be at least zero.

k

Array of group_count integers. k[i] specifies number of columns of matrices op(A) in group i. All entries must be at least zero.

alpha

Array of group_count scalar elements. alpha[i] specifies scaling factor for every rank-k update in group i.

a

Array of total_batch_count pointers for input matrices A. See Matrix Storage for more details.

	`trans` = `transpose::nontrans`	`trans` = `transpose::trans` or `trans` = `transpose::conjtrans`
Column major	Size of array `A[i]` must be at least `lda[i]` * `k[i]`	Size of array `A[i]` must be at least `lda[i]` * `n[i]`
Row major	Size of array `A[i]` must be at least `lda[i]` * `n[i]`	Size of array `A[i]` must be at least `lda[i]` * `k[i]`

lda

Array of group_count integers. lda[i] specifies leading dimension of matrices A in group i. Must be positive.

	`trans` = `transpose::nontrans`	`trans` = `transpose::trans` or `trans` = `transpose::conjtrans`
Column major	Must be at least `n[i]`.	Must be at least `k[i]`.
Row major	Must be at least `k[i]`.	Must be at least `n[i]`.

beta

Array of group_count scalar elements. beta[i] specifies scaling factor for matrices C in group i.

c

Array of total_batch_count pointers for input/output matrices C. Size of array C[i] must be at least ldc[i] * n[i]. See Matrix Storage for more details.

ldc

Array of group_count integers. ldc[i] specifies leading dimension of matrices C in group i. Must be positive.

group_count

Number of groups. Must be at least zero.

group_size

Array of group_count integers. group_size[i] specifies the number of syrk operations in group i. Each element in group_size must be at least zero.

mode

Optional. Compute mode settings. See Compute Modes for more details.

dependencies

Optional. List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.

mode and dependencies may be omitted independently; it is not necessary to specify mode in order to provide dependencies.

Output Parameters

c: Array of pointers to output matrices C overwritten by total_batch_countsyrk operations of the form alpha * op(A) * op(A)^T + beta * C.

Return Values

Output event to wait on to ensure computation is complete.

Strided API

Strided API operation is defined as:

for i = 0 … batch_size – 1
    A and C are matrices at offset i * stridea and i * stridec in a and c.
    C = alpha * op(A) * op(A)^T + beta * C
end for

where:

op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H
alpha and beta are scalars
A is general matrix and C is symmetric matrix
op(A) is n x k and C is n x n

For strided API, a and c arrays contain all the input matrices. The stride between matrices is given by the stride parameters. Total number of matrices in a and c arrays is given by batch_size parameter.

Syntax

namespace oneapi::mkl::blas::column_major {
   sycl::event syrk_batch(sycl::queue &queue,
                          oneapi::mkl::uplo upper_lower,
                          oneapi::mkl::transpose trans,
                          std::int64_t n,
                          std::int64_t k,
                          oneapi::mkl::value_or_pointer<T> alpha,
                          const T *a,
                          std::int64_t lda,
                          std::int64_t stridea,
                          oneapi::mkl::value_or_pointer<T> beta,
                          T *c,
                          std::int64_t ldc,
                          std::int64_t stridec,
                          std::int64_t batch_size,
                          compute_mode mode = compute_mode::unset,
                          const std::vector<sycl::event> &dependencies = {})
}

namespace oneapi::mkl::blas::row_major {
   sycl::event syrk_batch(sycl::queue &queue,
                          oneapi::mkl::uplo upper_lower,
                          oneapi::mkl::transpose trans,
                          std::int64_t n,
                          std::int64_t k,
                          oneapi::mkl::value_or_pointer<T> alpha,
                          const T *a,
                          std::int64_t lda,
                          std::int64_t stridea,
                          oneapi::mkl::value_or_pointer<T> beta,
                          T *c,
                          std::int64_t ldc,
                          std::int64_t stridec,
                          std::int64_t batch_size,
                          compute_mode mode = compute_mode::unset,
                          const std::vector<sycl::event> &dependencies = {})
}

Input Parameters

queue

The queue where the routine should be executed.

upper_lower

Specifies whether matrices C are upper or lower triangular. See Data Types for more details.

trans

Specifies op(A), transposition operation applied to matrices A. Conjugation is never performed even if trans = transpose::conjtrans. See Data Types for more details.

n

Number of rows and columns of matrices C. Must be at least zero.

k

Number of columns of matrices op(A). Must be at least zero.

alpha

Scaling factor for rank-k update. See Scalar Arguments for more information on the value_or_pointer data type.

a

Pointer to input matrices A. Size of the array must be at least stridea * batch_size.

lda

Leading dimension of matrices A. Must be positive.

	`transa` = `transpose::nontrans`	`transa` = `transpose::trans` or `trans` = `transpose::conjtrans`
Column major	Must be at least `n`	Must be at least `k`
Row major	Must be at least `k`	Must be at least `n`

stridea

Stride between two consecutive A matrices.

	`transa` = `transpose::nontrans`	`transa` = `transpose::trans` or `trans` = `transpose::conjtrans`
Column major	Must be at least `lda` * `k`	Must be at least `lda` * `n`
Row major	Must be at least `lda` * `n`	Must be at least `lda` * `k`

beta

Scaling factor for matrices C. See Scalar Arguments for more information on the value_or_pointer data type.

c

Pointer to input/output matrices C. Size of the array must be at least stridec * batch_size.

ldc

Leading dimension of matrices C. Must be positive and at least n.

stridec

Stride between two consecutive C matrices. Must be least ldc * n.

batch_size

Specifies the number of matrix multiply operations to perform.

mode

Optional. Compute mode settings. See Compute Modes for more details.

dependencies

Optional. List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.

mode and dependencies may be omitted independently; it is not necessary to specify mode in order to provide dependencies.

Output Parameters

c: Pointer to output matrices C overwritten by batch_sizesyrk operations of the form alpha * op(A) * op(A)^T + beta * C.

Return Values

Output event to wait on to ensure computation is complete.

Select Your Language

Using Intel.com Search

Quick Links

Recent Searches

Advanced Search

Only search in

oneMKL - Data Parallel C++ Developer Reference

syrk_batch

Description