Visible to Intel only — GUID: GUID-75FFFBDE-9D2E-45E1-88E9-16B0CA29935C
Visible to Intel only — GUID: GUID-75FFFBDE-9D2E-45E1-88E9-16B0CA29935C
?latrs3
Solves a triangular system of equations with the scale factor set to prevent overflow. This is a BLAS-3 version of ?latrs for solving several right-hand sides simultaneously.
Syntax
call slatrs3( uplo, trans, diag, normin, n, nrhs, a, lda, x, ldx, scale, cnorm, work, lwork, info ) call dlatrs3( uplo, trans, diag, normin, n, nrhs, a, lda, x, ldx, scale, cnorm, work, lwork, info ) call clatrs3( uplo, trans, diag, normin, n, nrhs, a, lda, x, ldx, scale, cnorm, work, lwork, info ) call zlatrs3( uplo, trans, diag, normin, n, nrhs, a, lda, x, ldx, scale, cnorm, work, lwork, info )
Description
The routine solves one of the triangular systems
A*X = s*B, or AT*X = s*B, or AH*X = s*B (for complex flavors)
with scaling to prevent overflow. Here A is an upper or lower triangular matrix, AT denotes the transpose of A, AH denotes the conjugate transpose of A, X and B are n-by-nrhs vectors, and s is an nrhs-element vector of scaling factors. A scaling factor s(j) is usually less than or equal to 1, chosen such that X(:,j) is less than the overflow threshold. If the matrix A is singular (A(j,j) = 0 for some j), then a non-trivial solution to AX = 0 is returned. If the system is so badly scaled that the solution cannot be represented as (1/s(k))X(:,k), then x(:,k) = 0 and s(k) is returned.
Input Parameters
- uplo
-
CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': Upper triangular
= 'L': Lower triangular
- trans
-
CHARACTER*1.
Specifies the operation applied to A.
= 'N': solve A*X = s*B (no transpose)
= 'T': solve AT*X = s*B (transpose)
= 'C': solve AH*X = s*B (conjugate transpose)
- diag
-
CHARACTER*1.
Specifies whether or not the matrix A is unit triangular.
= 'N': non–unit triangular
= 'U': unit triangular
- normin
-
CHARACTER*1.
Specifies whether cnorm has been set or not.
= 'Y': cnorm contains the column norms on entry;
= 'N': cnorm is not set on entry. O
- n
-
INTEGER. The order of the matrix A. n≥ 0
- nrhs
-
INTEGER. The number of columns of X. nrhs >= 0.
- a
-
REAL for slatrs3
DOUBLE PRECISION for dlatrs3
COMPLEX for clatrs3
DOUBLE COMPLEX for zlatrs3.
Array, DIMENSION (lda, n). Contains the triangular matrix A.
If uplo = 'U', the leading n-by-n upper triangular part of the array a contains the upper triangular matrix, and the strictly lower triangular part of A is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of the array a contains the lower triangular matrix, and the strictly upper triangular part of A is not referenced.
If diag = 'U', the diagonal elements of A are also not referenced and are assumed to be 1.
- lda
-
INTEGER. The leading dimension of the array a. lda≥ max(1, n).
- x
-
REAL for slatrs3
DOUBLE PRECISION for dlatrs3
COMPLEX for clatrs3
DOUBLE COMPLEX for zlatrs3.
Array, DIMENSION (ldx, nrhs).
On entry, the right-hand side B of the triangular system.
- ldx
-
INTEGER. The leading dimension of the array x. ldx≥ max(1, n).
- cnorm
-
REAL for slatrs3/clatrs3.
DOUBLE PRECISION for dlatrs3/zlatrs3.
Array, DIMENSION (n).
If normin = 'Y', cnorm is an input argument and cnorm (j) contains the norm of the off-diagonal part of the j-th column of A.
If trans = 'N', cnorm (j) must be greater than or equal to the infinity-norm, and if trans = 'T' or 'C', cnorm(j) must be greater than or equal to the 1-norm.
- work
-
REAL for slatrs3.
DOUBLE PRECISION for dlatrs3.
COMPLEX for clatrs3.
DOUBLE COMPLEX for zlatrs3.
Workspace array; dimension (lwork).
- lwork
-
INTEGER. The size of the work array; lwork ≥ 1.
If lwork = -1, then a workspace query is assumed; the routine calculates only the optimal dimension of the work array and returns this value as the first entry of the work array. No error message related to lwork is issued by xerbla.
Output Parameters
- x
-
On exit, x is overwritten by the solution matrix X.
- scale
-
REAL for slatrs3/clatrs3
DOUBLE PRECISION for dlatrs3/zlatrs3.
Array, DIMENSION (nrhs). The scaling factor s for the triangular system as described above.
The scaling factor s(k) is for the triangular system
A *x(:,k) = s(k)*b(:,k) or AT* x(:,k) = s(k)*b(:,k).
If scale = 0, the matrix A is singular or badly scaled.
If A(j,j) = 0 is encountered, a non-trivial vector x(:,k) that is an exact or approximate solution to A*x(:,k) = 0 is returned. If the system is so badly scaled that the solution cannot be presented as x(:,k) * 1/s(k), then x(:,k) = 0 is returned.
- cnorm
-
If normin = 'N', cnorm is an output argument and cnorm(j) returns the 1-norm of the off-diagonal part of the j-th column of A.
- work
-
On exit, if info = 0, work(1) returns the optimal size of work.
- info
-
INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument had an illegal value
Application Notes
The algorithm follows the structure of a block triangular solve. The diagonal block is solved with a call to the robust the triangular solver ?latrs for every right-hand side rhs = 1, ..., nrhs
op(A( j, j )) *x( j, rhs ) = SCALOC* b( j, rhs ),
where op( A ) = A or op( A ) = AT or op( A ) = AH.
The linear block updates operate on block columns of X,
B( i, k ) - op(A( i, j )) * X( j, k )
and use GEMM. To avoid overflow in the linear block update, the worst-case growth is estimated. For every rhs, a scale factor s <= 1.0 is computed such that
|| s * B( i, rhs)||_oo
+ || op(A( i, j )) ||_oo *|| s * X( j, rhs ) ||_oo <= Overflow threshold
Once all columns of a block column have been rescaled (BLAS 1), the linear update is executed with GEMM without overflow.
To limit rescaling, local scale factors track the scaling of column segments. There is one local scale factor s( i, rhs ) per block row i = 1, ..., nba per right-hand side column rhs = 1, ..., nrhs. The global scale factor SCALE( rhs ) is chosen as the smallest local scale factor s( i, rhs )
i= 1, ..., nba.
A triangular solve op(A( j, j )) *x( j, rhs ) = SCALOC* b( j, rhs ) updates the local scale factor s( j, rhs ) := s( j, rhs )*SCALOC. The linear update of potentially inconsistently scaled vector segments
s( i,rhs )* b( I, rhs ) - op(A( i, j ))*( s( j, rhs )* x( j, rhs ) )
computes a consistent scaling SCAMIN = min( s(i, rhs ), s(j, rhs) ) and, if necessary, rescales the blocks prior to calling GEMM.