Developer Reference for Intel® oneAPI Math Kernel Library for Fortran

ID 766686
Date 10/31/2024
Public
Document Table of Contents

?latrs3

Solves a triangular system of equations with the scale factor set to prevent overflow. This is a BLAS-3 version of ?latrs for solving several right-hand sides simultaneously.

Syntax

call slatrs3( uplo, trans, diag, normin, n, nrhs, a, lda, x, ldx, scale, cnorm, work, lwork, info )
call dlatrs3( uplo, trans, diag, normin, n, nrhs, a, lda, x, ldx, scale, cnorm, work, lwork, info )
call clatrs3( uplo, trans, diag, normin, n, nrhs, a, lda, x, ldx, scale, cnorm, work, lwork, info )
call zlatrs3( uplo, trans, diag, normin, n, nrhs, a, lda, x, ldx, scale, cnorm, work, lwork, info )

Description

The routine solves one of the triangular systems

A*X = s*B, or AT*X = s*B, or AH*X = s*B (for complex flavors)

with scaling to prevent overflow. Here A is an upper or lower triangular matrix, AT denotes the transpose of A, AH denotes the conjugate transpose of A, X and B are n-by-nrhs vectors, and s is an nrhs-element vector of scaling factors. A scaling factor s(j) is usually less than or equal to 1, chosen such that X(:,j) is less than the overflow threshold. If the matrix A is singular (A(j,j) = 0 for some j), then a non-trivial solution to AX = 0 is returned. If the system is so badly scaled that the solution cannot be represented as (1/s(k))X(:,k), then x(:,k) = 0 and s(k) is returned.

Input Parameters

uplo

CHARACTER*1.

Specifies whether the matrix A is upper or lower triangular.

= 'U': Upper triangular

= 'L': Lower triangular

trans

CHARACTER*1.

Specifies the operation applied to A.

= 'N': solve A*X = s*B (no transpose)

= 'T': solve AT*X = s*B (transpose)

= 'C': solve AH*X = s*B (conjugate transpose)

diag

CHARACTER*1.

Specifies whether or not the matrix A is unit triangular.

= 'N': non–unit triangular

= 'U': unit triangular

normin

CHARACTER*1.

Specifies whether cnorm has been set or not.

= 'Y': cnorm contains the column norms on entry;

= 'N': cnorm is not set on entry. O

n

INTEGER. The order of the matrix A. n 0

nrhs

INTEGER. The number of columns of X. nrhs >= 0.

a

REAL for slatrs3

DOUBLE PRECISION for dlatrs3

COMPLEX for clatrs3

DOUBLE COMPLEX for zlatrs3.

Array, DIMENSION (lda, n). Contains the triangular matrix A.

If uplo = 'U', the leading n-by-n upper triangular part of the array a contains the upper triangular matrix, and the strictly lower triangular part of A is not referenced.

If uplo = 'L', the leading n-by-n lower triangular part of the array a contains the lower triangular matrix, and the strictly upper triangular part of A is not referenced.

If diag = 'U', the diagonal elements of A are also not referenced and are assumed to be 1.

lda

INTEGER. The leading dimension of the array a. lda max(1, n).

x

REAL for slatrs3

DOUBLE PRECISION for dlatrs3

COMPLEX for clatrs3

DOUBLE COMPLEX for zlatrs3.

Array, DIMENSION (ldx, nrhs).

On entry, the right-hand side B of the triangular system.

ldx

INTEGER. The leading dimension of the array x. ldx max(1, n).

cnorm

REAL for slatrs3/clatrs3.

DOUBLE PRECISION for dlatrs3/zlatrs3.

Array, DIMENSION (n).

If normin = 'Y', cnorm is an input argument and cnorm (j) contains the norm of the off-diagonal part of the j-th column of A.

If trans = 'N', cnorm (j) must be greater than or equal to the infinity-norm, and if trans = 'T' or 'C', cnorm(j) must be greater than or equal to the 1-norm.

work

REAL for slatrs3.

DOUBLE PRECISION for dlatrs3.

COMPLEX for clatrs3.

DOUBLE COMPLEX for zlatrs3.

Workspace array; dimension (lwork).

lwork

INTEGER. The size of the work array; lwork ≥ 1.

If lwork = -1, then a workspace query is assumed; the routine calculates only the optimal dimension of the work array and returns this value as the first entry of the work array. No error message related to lwork is issued by xerbla.

Output Parameters

x

On exit, x is overwritten by the solution matrix X.

scale

REAL for slatrs3/clatrs3

DOUBLE PRECISION for dlatrs3/zlatrs3.

Array, DIMENSION (nrhs). The scaling factor s for the triangular system as described above.

The scaling factor s(k) is for the triangular system

A *x(:,k) = s(k)*b(:,k) or AT* x(:,k) = s(k)*b(:,k).

If scale = 0, the matrix A is singular or badly scaled.

If A(j,j) = 0 is encountered, a non-trivial vector x(:,k) that is an exact or approximate solution to A*x(:,k) = 0 is returned. If the system is so badly scaled that the solution cannot be presented as x(:,k) * 1/s(k), then x(:,k) = 0 is returned.

cnorm

If normin = 'N', cnorm is an output argument and cnorm(j) returns the 1-norm of the off-diagonal part of the j-th column of A.

work

On exit, if info = 0, work(1) returns the optimal size of work.

info

INTEGER.

= 0: successful exit

< 0: if info = -k, the k-th argument had an illegal value

Application Notes

The algorithm follows the structure of a block triangular solve. The diagonal block is solved with a call to the robust the triangular solver ?latrs for every right-hand side rhs = 1, ..., nrhs

op(A( j, j )) *x( j, rhs ) = SCALOC* b( j, rhs ),

where op( A ) = A or op( A ) = AT or op( A ) = AH.

The linear block updates operate on block columns of X,

B( i, k ) - op(A( i, j )) * X( j, k )

and use GEMM. To avoid overflow in the linear block update, the worst-case growth is estimated. For every rhs, a scale factor s <= 1.0 is computed such that

|| s * B( i, rhs)||_oo

+ || op(A( i, j )) ||_oo *|| s * X( j, rhs ) ||_oo <= Overflow threshold

Once all columns of a block column have been rescaled (BLAS 1), the linear update is executed with GEMM without overflow.

To limit rescaling, local scale factors track the scaling of column segments. There is one local scale factor s( i, rhs ) per block row i = 1, ..., nba per right-hand side column rhs = 1, ..., nrhs. The global scale factor SCALE( rhs ) is chosen as the smallest local scale factor s( i, rhs )

i= 1, ..., nba.

A triangular solve op(A( j, j )) *x( j, rhs ) = SCALOC* b( j, rhs ) updates the local scale factor s( j, rhs ) := s( j, rhs )*SCALOC. The linear update of potentially inconsistently scaled vector segments

s( i,rhs )* b( I, rhs ) - op(A( i, j ))*( s( j, rhs )* x( j, rhs ) )

computes a consistent scaling SCAMIN = min( s(i, rhs ), s(j, rhs) ) and, if necessary, rescales the blocks prior to calling GEMM.