Developer Reference for Intel® oneAPI Math Kernel Library for Fortran

ID 766686
Date 6/24/2024
Public

A newer version of this document is available. Customers should click here to go to the newest version.

Document Table of Contents

Interface Consideration

One-Based and Zero-Based Indexing

The Intel® oneAPI Math Kernel Library (oneMKL) Sparse BLAS Level 2 and Level 3 routines support one-based and zero-based indexing of data arrays.

Routines with typical interfaces support zero-based indexing for the following sparse data storage formats: CSR, CSC, BSR, and COO. Routines with simplified interfaces support zero based indexing for the following sparse data storage formats: CSR, BSR, and COO. See the complete list of Sparse BLAS Level 2 and Level 3 Routines.

The one-based indexing uses the convention of starting array indices at 1. The zero-based indexing uses the convention of starting array indices at 0. For example, indices of the 5-element array x can be presented in case of one-based indexing as follows:

Element index: 1 2 3 4 5

Element value: 1.0 5.0 7.0 8.0 9.0

and in case of zero-based indexing as follows:

Element index: 0 1 2 3 4

Element value: 1.0 5.0 7.0 8.0 9.0

The detailed descriptions of the one-based and zero-based variants of the sparse data storage formats are given in the "Sparse Matrix Storage Formats" in the Appendix "Linear Solvers Basics".

Most parameters of the routines are identical for both one-based and zero-based indexing, but some of them have certain differences. The following table lists all these differences.

Parameter One-based Indexing Zero-based Indexing
val

Array containing non-zero elements of the matrix A, its length is pntre(m) - pntrb(1).

Array containing non-zero elements of the matrix A, its length is pntre(m—1) - pntrb(0).

pntrb

Array of length m. This array contains row indices, such that pntrb(i) - pntrb(1)+1 is the first index of row i in the arrays val and indx

Array of length m. This array contains row indices, such that pntrb(i) - pntrb(0) is the first index of row i in the arrays val and indx.

pntre

Array of length m. This array contains row indices, such that pntre(I) - pntrb(1) is the last index of row i in the arrays val and indx.

Array of length m. This array contains row indices, such that pntre(i) - pntrb(0)-1 is the last index of row i in the arrays val and indx.

ia

Array of length m + 1, containing indices of elements in the array a, such that ia(i) is the index in the array a of the first non-zero element from the row i. The value of the last element ia(m + 1) is equal to the number of non-zeros plus one.

Array of length m+1, containing indices of elements in the array a, such that ia(i) is the index in the array a of the first non-zero element from the row i. The value of the last element ia(m) is equal to the number of non-zeros.

ldb

Specifies the leading dimension of b as declared in the calling (sub)program.

Specifies the second dimension of b as declared in the calling (sub)program.

ldc

Specifies the leading dimension of c as declared in the calling (sub)program.

Specifies the second dimension of c as declared in the calling (sub)program.

Differences Between Intel MKL and NIST* Interfaces

The Intel® oneAPI Math Kernel Library (oneMKL) Sparse BLAS Level 3 routines have the following conventional interfaces:

mkl_xyyymm(transa, m, n, k, alpha, matdescra, arg(A), b, ldb, beta, c, ldc), for matrix-matrix product;

mkl_xyyysm(transa, m, n, alpha, matdescra, arg(A), b, ldb, c, ldc), for triangular solvers with multiple right-hand sides.

Here x denotes data type, and yyy - sparse matrix data structure (storage format).

The analogous NIST* Sparse BLAS (NSB) library routines have the following interfaces:

xyyymm(transa, m, n, k, alpha, descra, arg(A), b, ldb, beta, c, ldc, work, lwork), for matrix-matrix product;

xyyysm(transa, m, n, unitd, dv, alpha, descra, arg(A), b, ldb, beta, c, ldc, work, lwork), for triangular solvers with multiple right-hand sides.

Some similar arguments are used in both libraries. The argument transa indicates what operation is performed and is slightly different in the NSB library (see Table "Parameter transa"). The arguments m and k are the number of rows and column in the matrix A, respectively, n is the number of columns in the matrix C. The arguments alpha and beta are scalar alpha and beta respectively (betais not used in the Intel® oneAPI Math Kernel Library (oneMKL) triangular solvers.) The argumentsb and c are rectangular arrays with the leading dimension ldb and ldc, respectively. arg(A) denotes the list of arguments that describe the sparse representation of A.

Parameter transa

 

MKL interface

NSB interface

Operation

data type

CHARACTER*1

INTEGER

 

value

N or n

0

op(A) = A

 

T or t

1

op(A) = AT

 

C or c

2

op(A) = AT or op(A) = AH

Parameter matdescra

The parameter matdescra describes the relevant characteristic of the matrix A. This manual describes matdescraas an array of six elements in line with the NIST* implementation. However, only the first four elements of the array are used in the current versions of the Intel® oneAPI Math Kernel Library (oneMKL) Sparse BLAS routines. Elementsmatdescra(5) and matdescra(6) are reserved for future use. Note that whether matdescrais described in your application as an array of length 6 or 4 is of no importance because the array is declared as a pointer in the Intel® oneAPI Math Kernel Library (oneMKL) routines. To learn more about declaration of thematdescraarray, see the Sparse BLAS examples located in the Intel® oneAPI Math Kernel Library (oneMKL) installation directory:examples/spblasf/ for Fortran . The table below lists elements of the parameter matdescra, their Fortran values, and their meanings. The parameter matdescra corresponds to the argument descra from NSB library.

Possible Values of the Parameter matdescra (descra)

 

MKL interface

NSB interface

Matrix characteristics

 

one-based indexing

zero-based indexing

   

data type

CHARACTER

Char

INTEGER

 

1st element

matdescra(1)

matdescra(0)

descra(1)

matrix structure

value

G

G

0

general

 

S

S

1

symmetric (A = AT)

 

H

H

2

Hermitian (A = (AH))

 

T

T

3

triangular

 

A

A

4

skew(anti)-symmetric (A = -AT)

 

D

D

5

diagonal

2nd element

matdescra(2)

matdescra(1)

descra(2)

upper/lower triangular indicator

value

L

L

1

lower

 

U

U

2

upper

3rd element

matdescra(3)

matdescra(2)

descra(3)

main diagonal type

value

N

N

0

non-unit

 

U

U

1

unit

4th element

matdescra(4)

matdescra(3)

descra(4)

type of indexing

value

F

 

1

one-based indexing
   

C

0

zero-based indexing

In some cases possible element values of the parameter matdescra depend on the values of other elements. The Table "Possible Combinations of Element Values of the Parameter matdescra" lists all possible combinations of element values for both multiplication routines and triangular solvers.

Possible Combinations of Element Values of the Parameter matdescra

Routines

matdescra(1)

matdescra(2)

matdescra(3)

matdescra(4)

Multiplication Routines G ignored ignored F (default) or C
  S or H L (default) N (default) F (default) or C
  S or H L (default) U F (default) or C
  S or H U N (default) F (default) or C
  S or H U U F (default) or C
  A L (default) ignored F (default) or C
  A U ignored F (default) or C
Multiplication Routines and Triangular Solvers T L U F (default) or C
  T L N F (default) or C
  T U U F (default) or C
  T U N F (default) or C
  D ignored N (default) F (default) or C
  D ignored U F (default) or C

For a matrix in the skyline format with the main diagonal declared to be a unit, diagonal elements must be stored in the sparse representation even if they are zero. In all other formats, diagonal elements can be stored (if needed) in the sparse representation if they are not zero.

Operations with Partial Matrices

One of the distinctive feature of the Intel® oneAPI Math Kernel Library (oneMKL) Sparse BLAS routines is a possibility to perform operations only on partial matrices composed of certain parts (triangles and the main diagonal) of the input sparse matrix. It can be done by setting properly first three elements of the parametermatdescra.

An arbitrary sparse matrix A can be decomposed as

A = L + D + U

where L is the strict lower triangle of A, U is the strict upper triangle of A, D is the main diagonal.

Table "Output Matrices for Multiplication Routines" shows correspondence between the output matrices and values of the parameter matdescra for the sparse matrix A for multiplication routines.

Output Matrices for Multiplication Routines

matdescra(1)

matdescra(2)

matdescra(3)

Output Matrix

G

ignored

ignored

alpha*op(A)*x + beta*y

      alpha*op(A)*B + beta*C

S or H

L

N

alpha*op(L+D+L')*x + beta*y

      alpha*op(L+D+L')*B + beta*C

S or H

L

U

alpha*op(L+I+L')*x + beta*y

     

alpha*op(L+I+L')*B + beta*C

S or H

U

N

alpha*op(U'+D+U)*x + beta*y

      alpha*op(U'+D+U)*B + beta*C

S or H

U

U

alpha*op(U'+I+U)*x + beta*y

      alpha*op(U'+I+U)*B + beta*C

T

L

U

alpha*op(L+I)*x + beta*y

      alpha*op(L+I)*B + beta*C

T

L

N

alpha*op(L+D)*x + beta*y

      alpha*op(L+D)*B + beta*C

T

U

U

alpha*op(U+I)*x + beta*y

      alpha*op(U+I)*B + beta*C

T

U

N

alpha*op(U+D)*x + beta*y

      alpha*op(U+D)*B + beta*C

A

L

ignored

alpha*op(L-L')*x + beta*y

      alpha*op(L-L')*B + beta*C

A

U

ignored

alpha*op(U-U')*x + beta*y

      alpha*op(U-U')*B + beta*C

D

ignored

N

alpha*D*x + beta*y

      alpha*D*B + beta*C

D

ignored

U

alpha*x + beta*y

      alpha*B + beta*C

Table "Output Matrices for Triangular Solvers" shows correspondence between the output matrices and values of the parameter matdescra for the sparse matrix A for triangular solvers.

Output Matrices for Triangular Solvers

matdescra(1)

matdescra(2)

matdescra(3)

Output Matrix

T

L

N

alpha*inv(op(L))*x

      alpha*inv(op(L))*B

T

L

U

alpha*inv(op(L))*x

      alpha*inv(op(L))*B

T

U

N

alpha*inv(op(U))*x

      alpha*inv(op(U))*B

T

U

U

alpha*inv(op(U))*x

      alpha*inv(op(U))*B

D

ignored

N

alpha*inv(D)*x

      alpha*inv(D)*B

D

ignored

U

alpha*x

      alpha*B