Visible to Intel only — GUID: GUID-7B11079E-CB74-4A5F-AEEA-D6C9B7181C42
Visible to Intel only — GUID: GUID-7B11079E-CB74-4A5F-AEEA-D6C9B7181C42
Matrix Arguments
Matrix arguments of the Intel® oneAPI Math Kernel Library routines can be stored ineither one- or two-dimensional arrays, using the following storage schemes:
conventional full storage(in a two-dimensional array)
packed storage for Hermitian, symmetric, or triangular matrices (in a one-dimensional array)
band storage for band matrices (in a two-dimensional array)
rectangular full packed storage for symmetric, Hermitian, or triangular matrices as compact as the Packed storage while maintaining efficiency by using Level 3 BLAS/LAPACK kernels.
Full storage is the simplest scheme. A matrix A is stored in a two-dimensional array a, with the matrix element aij stored in the array element a(i,j). , where lda is the leading dimension of array a.
If a matrix is triangular (upper or lower, as specified by the argument uplo), only the elements of the relevant triangle are stored; the remaining elements of the array need not be set.
Routines that handle symmetric or Hermitian matrices allow for either the upper or lower triangle of the matrix to be stored in the corresponding elements of the array:
- if uplo ='U',
-
aij is stored as described for i ≤ j, other elements of a need not be set.
- if uplo ='L',
-
aij is stored as described for j ≤ i, other elements of a need not be set.
Packed storage allows you to store symmetric, Hermitian, or triangular matrices more compactly: the relevant triangle (again, as specified by the argument uplo) is packed by columns in a one-dimensional array ap:
if uplo ='U', aij is stored in ap(i + j(j - 1)/2) for i ≤ j
if uplo ='L', aij is stored in ap(i + (2*n - j)*(j - 1)/2) for j ≤ i.
In descriptions of LAPACK routines, arrays with packed matrices have names ending in p.
Band storage is as follows: an m-by-n band matrix with kl non-zero sub-diagonals and ku non-zero super-diagonals is stored compactly in a two-dimensional array ab with kl+ku + 1 rows and n columns. Columns of the matrix are stored in the corresponding columns of the array, and diagonals of the matrix are stored in rows of the array . Thus,
aij is stored in ab(ku+1+i-j,j) for max(1,j-ku) ≤ i ≤ min(n,j+kl).
Use the band storage scheme only when kl and ku are much less than the matrix size n. Although the routines work correctly for all values of kl and ku, using the band storage is inefficient if your matrices are not really banded.
The band storage scheme is illustrated by the following example, when
m = n = 6, kl = 2, ku = 1
Array elements marked * are not used by the routines:
When a general band matrix is supplied for LU factorization, space must be allowed to store kl additional super-diagonals generated by fill-in as a result of row interchanges. This means that the matrix is stored according to the above scheme, but with kl + ku super-diagonals. Thus,
aij is stored in ab(kl+ku+1+i-j,j) for max(1,j-ku) ≤ i ≤ min(n,j+kl).
The band storage scheme for LU factorization is illustrated by the following example, whenm = n = 6, kl = 2, ku = 1:
Array elements marked * are not used by the routines; elements marked + need not be set on entry, but are required by the LU factorization routines to store the results. The input array will be overwritten on exit by the details of the LU factorization as follows:
where uij are the elements of the upper triangular matrix U, and mij are the multipliers used during factorization.
Triangular band matrices are stored in the same format, with either kl= 0 if upper triangular, or ku = 0 if lower triangular. For symmetric or Hermitian band matrices with k sub-diagonals or super-diagonals, you need to store only the upper or lower triangle, as specified by the argument uplo:
if uplo ='U', aij is stored in ab(k+1+i-j,j) for max(1,j-k) ≤ i ≤ j
if uplo ='L', aij is stored in ab(1+i-j,j) for j ≤ i ≤ min(n,j+k).
In descriptions of LAPACK routines, arrays that hold matrices in band storage have names ending in b.
In Fortran, column-major ordering of storage is assumed. This means that elements of the same column occupy successive storage locations.
Three quantities are usually associated with a two-dimensional array argument: its leading dimension, which specifies the number of storage locations between elements in the same row, its number of rows, and its number of columns. For a matrix in full storage, the leading dimension of the array must be at least as large as the number of rows in the matrix.
A character transposition parameter is often passed to indicate whether the matrix argument is to be used in normal or transposed form or, for a complex matrix, if the conjugate transpose of the matrix is to be used.
The values of the transposition parameter for these three cases are the following:
- 'N' or 'n'
-
normal (no conjugation, no transposition)
- 'T' or 't'
-
transpose
- 'C' or 'c'
-
conjugate transpose.
Example. Two-Dimensional Complex Array
Suppose A (1:5, 1:4) is the complex two-dimensional array presented by matrix
Let transa be the transposition parameter, m be the number of rows, n be the number of columns, and lda be the leading dimension. Then if
transa = 'N', m = 4, n = 2, and lda = 5, the matrix argument would be
If transa = 'T', m = 4, n = 2, and lda =5, the matrix argument would be
If transa = 'C', m = 4, n = 2, and lda =5, the matrix argument would be
Note that care should be taken when using a leading dimension value which is different from the number of rows specified in the declaration of the two-dimensional array. For example, suppose the array A above is declared as COMPLEX A (5,4).
Then if transa = 'N', m = 3, n = 4, and lda = 4, the matrix argument will be
Rectangular Full Packed storage allows you to store symmetric, Hermitian, or triangular matrices as compact as the Packed storage while maintaining efficiency by using Level 3 BLAS/LAPACK kernels. To store an n-by-n triangle (and suppose for simplicity that n is even), you partition the triangle into three parts: two n/2-by-n/2 triangles and an n/2-by-n/2 square, then pack this as an n-by-n/2 rectangle (or n/2-by-n rectangle), by transposing (or transpose-conjugating) one of the triangles and packing it next to the other triangle. Since the two triangles are stored in full storage, you can use existing efficient routines on them.
There are eight cases of RFP storage representation: when n is even or odd, the packed matrix is transposed or not, the triangular matrix is lower or upper. See below for all the eight storage schemes illustrated:
n is odd, A is lower triangular
n is even, A is lower triangular
n is odd, A is upper triangular
n is even, A is upper triangular
Intel® oneAPI Math Kernel Library (oneMKL) provides a number of routines such as?hfrk, ?sfrk performing BLAS operations working directly on RFP matrices, as well as some conversion routines, for instance, ?tpttf goes from the standard packed format to RFP and ?trttf goes from the full format to RFP.
Please refer to the Netlib site for more information.
Note that in the descriptions of LAPACK routines, arrays with RFP matrices have names ending in fp.