Visible to Intel only — GUID: GUID-721F39A3-AE07-438D-B342-1A54FD37EAED
Visible to Intel only — GUID: GUID-721F39A3-AE07-438D-B342-1A54FD37EAED
imatcopy_batch
Computes a group of in-place scaled matrix transpose or copy operations using general matrices.
Description
The imatcopy_batch routines perform a series of in-place scaled matrix copies or transpositions. They are similar to the imatcopy routines, but the imatcopy_batch routines perform their operations with groups of matrices. The groups contain matrices with the same parameters.
imatcopy_batch supports the following precisions:
T |
---|
float |
double |
std::complex<float> |
std::complex<double> |
imatcopy_batch (Buffer Version)
Buffer version of imatcopy_batch supports only strided API.
Strided API
The operation for the strided API is defined as:
for i = 0 … batch_size – 1
AB is a matrix at offset i * stride in ab
AB = alpha * op(AB)
end for
where:
op(X) is one of op(X) = X, op(X) = X', or op(X) = conjg(X')
alpha is a scalar
AB is a matrix to be transformed in place
For the strided API, the single buffer AB contains all the matrices to be transformed in place. The locations of the individual matrices within the buffer are given by stride lengths, while the number of matrices is given by the batch_size parameter.
Syntax
namespace oneapi::mkl::blas::column_major {
void imatcopy_batch(sycl::queue &queue,
transpose trans,
std::int64_t m,
std::int64_t n,
T alpha,
sycl::buffer<T, 1> &ab,
std::int64_t lda,
std::int64_t ldb,
std::int64_t stride,
std::int64_t batch_size);
}
namespace oneapi::mkl::blas::row_major {
void imatcopy_batch(sycl::queue &queue,
transpose trans,
std::int64_t m,
std::int64_t n,
T alpha,
sycl::buffer<T, 1> &ab,
std::int64_t lda,
std::int64_t ldb,
std::int64_t stride,
std::int64_t batch_size);
}
Input Parameters
- queue
-
The queue where the routine should be executed.
- trans
-
Specifies op(AB), the transposition operation applied to the matrices AB.
- m
-
Number of rows for each matrix AB on input. Must be at least 0.
- n
-
Number of columns for each matrix AB on input. Must be at least 0.
- alpha
-
Scaling factor for the matrix transpose or copy operation.
- ab
-
Buffer holding the matrices AB. Must have size at least stride*batch_size.
- lda
-
Leading dimension of the AB matrices on input. If matrices are stored using column major layout, lda must be at least m. If matrices are stored using row major layout, lda must be at least n. Must be positive.
- ldb
-
Leading dimension of the matrices AB on output. Must be positive.
trans = transpose::nontrans
trans = transpose::trans or trans = transpose::conjtrans
Column major
Must be at least m
Must be at least n
Row major
Must be at least n
Must be at least m
- stride
-
Stride between the different AB matrices. It must be at least max(ldb,lda)*max(ka, kb), where:
ka is m if column major layout is used or n if row major layout is used
kb is n if column major layout is used and AB is not transposed, or m otherwise
- batch_size
-
Specifies the number of matrices to transpose or copy. Must be at least zero.
Output Parameters
- ab
-
Output buffer, overwritten by batch_size matrix multiply operations of the form alpha*op(AB).
imatcopy_batch (USM Version)
USM version of imatcopy_batch supports group API and strided API.
Group API
The type Ti of integer pointers in the group API may be either std::int64_t or std::int32_t.
The operation for the group API is defined as:
idx = 0
for i = 0 … group_count – 1
m,n, alpha, lda, ldb and group_size at position i in their respective arrays
for j = 0 … group_size – 1
AB is a matrix at position idx in AB
AB = alpha * op(AB)
idx := idx + 1
end for
end for
where:
op(X) is one of op(X) = X, op(X) = X', or op(X) = conjg(X')
alpha is a scalar
AB is a matrix to be transformed in place
For the group API, the matrices are given by arrays of pointers. AB represents a matrix stored at the address pointed to by ab. The total number of entries in ab is given by:
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event imatcopy_batch(sycl::queue &queue,
const transpose *trans,
const Ti *m,
const Ti *n,
const T *alpha, T **ab,
const Ti *lda,
const Ti *ldb,
std::int64_t group_count,
const Ti *groupsize,
const std::vector<sycl::event> &dependencies = {});
}
namespace oneapi::mkl::blas::row_major {
sycl::event imatcopy_batch(sycl::queue &queue,
const transpose *trans,
const Ti *m,
const Ti *n,
const T *alpha, T **ab,
const Ti *lda,
const Ti *ldb,
std::int64_t group_count,
const Ti *groupsize,
const std::vector<sycl::event> &dependencies = {});
}
Input Parameters
- queue
-
The queue where the routine should be executed.
- trans
-
Array of size group_count. Each element i in the array specifies op(AB) the transposition operation applied to the matrices AB.
- m
-
Array of group_count integers. m[i] specifies the number of rows in AB[i] on input. Each entry must be at least zero.
- n
-
Array of group_count integers. n[i] specifies the number of columns in AB[i] on input. Each entry must be at least zero.
- alpha
-
Array of size group_count containing scaling factors for the matrix transpositions or copies.
- ab
-
Array of size total_batch_count, holding pointers to arrays used to store AB matrices.
- lda
-
Array of group_count integers. lda[i] specifies the leading dimension of the matrix input AB. If matrices are stored using column major layout, lda[i] must be at least m[i]. If matrices are stored using row major layout, lda[i] must be at least n[i]. Must be positive.
- ldb
-
Array of group_count integers. ldb[i] specifcies the leading dimension of the matrix AB on output. Each ldb[i] must be positive and satisfy:
trans[i] = transpose::nontrans
trans[i] = transpose::trans or trans[i] = transpose::conjtrans
Column major
Must be at least m[i]
Must be at least n[i]
Row major
Must be at least n[i]
Must be at least m[i]
- group_count
-
Number of groups. Must be at least 0.
- group_size
-
Array of size group_count. The element group_size[i] is the number of matrices in the group i. Each element in group_size must be at least 0.
- dependencies
-
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- ab
-
Output array of pointers to AB matrices, overwritten by total_batch_count matrix transpose or copy operations of the form alpha*op(AB).
Return Values
Output event to wait on to ensure computation is complete.
Strided API
The operation for the strided API is defined as:
for i = 0 … batch_size – 1
AB is a matrix at offset i * stride in ab
AB = alpha * op(AB)
end for
where:
op(X) is one of op(X) = X, op(X) = X', or op(X) = conjg(X')
alpha is a scalar
AB is a matrix to be transformed in place
For the strided API, the single array ab contains all the matrices AB to be transformed in place. The locations of the individual matrices within the array are given by stride lengths, while the number of matrices is given by the batch_size parameter.
Syntax
namespace oneapi::mkl::blas::column_major {
sycl::event imatcopy_batch(sycl::queue &queue,
transpose trans,
std::int64_t m,
std::int64_t n,
oneapi::mkl::value_or_pointer<T> alpha,
T *ab,
std::int64_t lda,
std::int64_t ldb,
std::int64_t stride,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {});
}
namespace oneapi::mkl::blas::column_major {
sycl::event imatcopy_batch(sycl::queue &queue,
transpose trans,
std::int64_t m,
std::int64_t n,
oneapi::mkl::value_or_pointer<T> alpha,
T *ab,
std::int64_t lda,
std::int64_t ldb,
std::int64_t stride,
std::int64_t batch_size,
const std::vector<sycl::event> &dependencies = {});
}
Input Parameters
- queue
-
The queue where the routine should be executed.
- trans
-
Specifies op(AB), the transposition operation applied to the matrices AB.
- m
-
Number of rows for each matrix AB on input. Must be at least 0.
- n
-
Number of columns for each matrix AB on input. Must be at least 0.
- alpha
-
Scaling factor for the matrix transpose or copy operation. See Scalar Arguments for more information on the value_or_pointer data type.
- ab
-
Array holding the matrices AB. Must have size at least stride*batch_size.
- lda
-
Leading dimension of the AB matrices on input. If matrices are stored using column major layout, lda must be at least m. If matrices are stored using row major layout, lda must be at least n. Must be positive.
- ldb
-
Leading dimension of the matrices AB on output. Must be positive.
trans = transpose::nontrans
trans = transpose::trans or trans = transpose::conjtrans
Column major
Must be at least m
Must be at least n
Row major
Must be at least n
Must be at least m
- stride
-
Stride between the different AB matrices. It must be at least max(ldb,lda)*max(ka, kb), where:
ka is m if column major layout is used or n if row major layout is used
kb is n if column major layout is used and AB is not transposed, or m otherwise
- batch_size
-
Specifies the number of matrices to transpose or copy. Must be at least zero.
- dependencies
-
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
Output Parameters
- ab
-
Output array, overwritten by batch_size matrix multiply operations of the form alpha*op(AB).
Return Values
Output event to wait on to ensure computation is complete.