Developer Reference for Intel® oneAPI Math Kernel Library for Fortran

ID 766686
Date 11/07/2023
Public


p?gels

Solves overdetermined or underdetermined linear systems involving a matrix of full rank.

Syntax

call psgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork, info)

call pdgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork, info)

call pcgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork, info)

call pzgels(trans, m, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, work, lwork, info)

Include Files

Description

The p?gels routine solves overdetermined or underdetermined real/complex linear systems involving an m-by-n matrix sub(A) = A(ia:ia+m-1, ja:ja+n-1), or its transpose/conjugate-transpose, using a QR or LQ factorization of sub(A). It is assumed that sub(A) has full rank.

The following options are provided:

  1. If trans = 'N' and m ≥ n: find the least squares solution of an overdetermined system, that is, solve the least squares problem

    minimize ||sub(B) - sub(A)*X||

  2. If trans = 'N' and m < n: find the minimum norm solution of an underdetermined system sub(A)*X = sub(B).

  3. If trans = 'T' and m ≥ n: find the minimum norm solution of an underdetermined system sub(A)^T*X = sub(B).

  4. If trans = 'T' and m < n: find the least squares solution of an overdetermined system, that is, solve the least squares problem

    minimize ||sub(B) - sub(A)^T*X||

In all four cases, sub(B) denotes B(ib:ib+m-1, jb:jb+nrhs-1) when trans = 'N' and B(ib:ib+n-1, jb:jb+nrhs-1) otherwise. Several right-hand side vectors b and solution vectors x can be handled in a single call. When trans = 'N', the solution vectors are stored as the columns of the n-by-nrhs right-hand side matrix sub(B); otherwise, they are stored as the columns of the m-by-nrhs right-hand side matrix sub(B).
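The four options above depend only on trans and on the comparison of m with n. As a rough illustration, the following Python sketch maps those inputs to the kind of problem solved, the number of rows of sub(B) holding right-hand sides on entry, and the number of rows holding solutions on exit. The function name classify_pgels_problem is hypothetical and not part of oneMKL; the sketch only restates the rules listed above.

```python
def classify_pgels_problem(trans, m, n):
    """Return (problem, rhs_rows, solution_rows) for one p?gels call.

    Hypothetical helper: it restates the four documented cases and
    does not call any library routine.
    """
    trans = trans.upper()
    if trans not in ("N", "T"):
        raise ValueError("trans must be 'N' or 'T'")
    if m < 0 or n < 0:
        raise ValueError("m and n must be non-negative")

    # The system actually solved is overdetermined when it has at least
    # as many equations as unknowns: m >= n for sub(A), m < n for sub(A)^T.
    overdetermined = (m >= n) if trans == "N" else (m < n)
    problem = "least squares" if overdetermined else "minimum norm"

    # Right-hand sides occupy m rows for 'N' and n rows for 'T';
    # the solution occupies n rows for 'N' and m rows for 'T'.
    rhs_rows, solution_rows = (m, n) if trans == "N" else (n, m)
    return problem, rhs_rows, solution_rows


if __name__ == "__main__":
    for trans, m, n in [("N", 6, 4), ("N", 4, 6), ("T", 6, 4), ("T", 4, 6)]:
        print(trans, m, n, classify_pgels_problem(trans, m, n))
```

Because the same array holds both the right-hand sides and the solutions, sub(B) needs room for the larger of the two row counts.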

Input Parameters

trans

(global) CHARACTER. Must be 'N' or 'T'.

If trans = 'N', the linear system involves matrix sub(A);

If trans = 'T', the linear system involves the transposed matrix sub(A)^T (for real flavors only).

m

(global) INTEGER. The number of rows in the distributed matrix sub(A) (m ≥ 0).

n

(global) INTEGER. The number of columns in the distributed matrix sub(A) (n ≥ 0).

nrhs

(global) INTEGER. The number of right-hand sides; the number of columns in the distributed submatrices sub(B) and X (nrhs ≥ 0).

a

(local)

REAL for psgels

DOUBLE PRECISION for pdgels

COMPLEX for pcgels

DOUBLE COMPLEX for pzgels.

Pointer into the local memory to an array of size (lld_a, LOCc(ja+n-1)). On entry, contains the m-by-n matrix A.

ia, ja

(global) INTEGER. The row and column indices in the global matrix A indicating the first row and the first column of the submatrix A, respectively.

desca

(global and local) INTEGER array of size dlen_. The array descriptor for the distributed matrix A.

b

(local)

REAL for psgels

DOUBLE PRECISION for pdgels

COMPLEX for pcgels

DOUBLE COMPLEX for pzgels.

Pointer into the local memory to an array of local size (lld_b, LOCc(jb+nrhs-1)). On entry, this array contains the local pieces of the distributed matrix B of right-hand side vectors, stored columnwise; sub(B) is m-by-nrhs if trans='N', and n-by-nrhs otherwise.

ib, jb

(global) INTEGER. The row and column indices in the global matrix B indicating the first row and the first column of the submatrix B, respectively.

descb

(global and local) INTEGER array of size dlen_. The array descriptor for the distributed matrix B.

work

(local)

REAL for psgels

DOUBLE PRECISION for pdgels

COMPLEX for pcgels

DOUBLE COMPLEX for pzgels.

Workspace array with size lwork.

lwork

(local or global) INTEGER.

The size of the array work. lwork is local input and must be at least lwork ≥ ltau + max(lwf, lws), where, if m ≥ n, then

ltau = numroc(ja+min(m,n)-1, nb_a, MYCOL, csrc_a, NPCOL),

lwf = nb_a*(mpa0 + nqa0 + nb_a)

lws = max((nb_a*(nb_a-1))/2, (nrhsqb0 + mpb0)*nb_a) + nb_a*nb_a

else

ltau = numroc(ia+min(m,n)-1, mb_a, MYROW, rsrc_a, NPROW),

lwf = mb_a * (mpa0 + nqa0 + mb_a)

lws = max((mb_a*(mb_a-1))/2, (npb0 + max(nqa0 + numroc(numroc(n+iroffb, mb_a, 0, 0, NPROW), mb_a, 0, 0, lcmp), nrhsqb0))*mb_a) + mb_a*mb_a

end if,

where lcmp = lcm/NPROW with lcm = ilcm(NPROW, NPCOL),

iroffa = mod(ia-1, mb_a),

icoffa = mod(ja-1, nb_a),

iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW),

iacol = indxg2p(ja, nb_a, MYCOL, csrc_a, NPCOL),

mpa0 = numroc(m+iroffa, mb_a, MYROW, iarow, NPROW),

nqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol, NPCOL),

iroffb = mod(ib-1, mb_b),

icoffb = mod(jb-1, nb_b),

ibrow = indxg2p(ib, mb_b, MYROW, rsrc_b, NPROW),

ibcol = indxg2p(jb, nb_b, MYCOL, csrc_b, NPCOL),

mpb0 = numroc(m+iroffb, mb_b, MYROW, ibrow, NPROW),

npb0 = numroc(n+iroffb, mb_b, MYROW, ibrow, NPROW),

nrhsqb0 = numroc(nrhs+icoffb, nb_b, MYCOL, ibcol, NPCOL).

NOTE:

mod(x,y) is the integer remainder of x/y.

ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL, NPROW, and NPCOL can be determined by calling the subroutine blacs_gridinfo.

If lwork = -1, then lwork is global input and a workspace query is assumed; the routine only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by pxerbla.
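When a workspace query is not convenient, the lower bound on lwork can also be evaluated directly from the formulas above. The Python sketch below does this with plain integer arithmetic. It rests on several assumptions: numroc, indxg2p and ilcm are re-implemented here to mirror the behavior of the ScaLAPACK tool functions of the same names; npb0 and nrhsqb0 are taken to be the numroc counts for n rows and nrhs columns of sub(B); iacol is computed from the column quantities (MYCOL, csrc_a, NPCOL); and the first branch is assumed to apply when m ≥ n. The function name pgels_min_lwork is hypothetical. In production code, the lwork = -1 query remains the safer choice.

```python
from math import gcd


def numroc(n, nb, iproc, isrcproc, nprocs):
    """Rows or columns of an n-long, nb-blocked dimension owned by iproc."""
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb
    count = (nblocks // nprocs) * nb
    extrablks = nblocks % nprocs
    if mydist < extrablks:
        count += nb          # this process gets one extra full block
    elif mydist == extrablks:
        count += n % nb      # this process gets the trailing partial block
    return count


def indxg2p(indxglob, nb, iproc, isrcproc, nprocs):
    """Process coordinate owning 1-based global index indxglob (iproc unused)."""
    return (isrcproc + (indxglob - 1) // nb) % nprocs


def ilcm(m, n):
    """Least common multiple of two positive integers."""
    return m * n // gcd(m, n)


def pgels_min_lwork(m, n, nrhs, ia, ja, ib, jb,
                    mb_a, nb_a, rsrc_a, csrc_a,
                    mb_b, nb_b, rsrc_b, csrc_b,
                    myrow, mycol, nprow, npcol):
    """Evaluate ltau + max(lwf, lws) for the calling process (a sketch)."""
    iroffa, icoffa = (ia - 1) % mb_a, (ja - 1) % nb_a
    iarow = indxg2p(ia, mb_a, myrow, rsrc_a, nprow)
    iacol = indxg2p(ja, nb_a, mycol, csrc_a, npcol)
    mpa0 = numroc(m + iroffa, mb_a, myrow, iarow, nprow)
    nqa0 = numroc(n + icoffa, nb_a, mycol, iacol, npcol)

    iroffb, icoffb = (ib - 1) % mb_b, (jb - 1) % nb_b
    ibrow = indxg2p(ib, mb_b, myrow, rsrc_b, nprow)
    ibcol = indxg2p(jb, nb_b, mycol, csrc_b, npcol)
    mpb0 = numroc(m + iroffb, mb_b, myrow, ibrow, nprow)
    npb0 = numroc(n + iroffb, mb_b, myrow, ibrow, nprow)         # assumed definition
    nrhsqb0 = numroc(nrhs + icoffb, nb_b, mycol, ibcol, npcol)   # assumed definition

    if m >= n:  # QR branch (assumed to include m == n)
        ltau = numroc(ja + min(m, n) - 1, nb_a, mycol, csrc_a, npcol)
        lwf = nb_a * (mpa0 + nqa0 + nb_a)
        lws = max(nb_a * (nb_a - 1) // 2, (nrhsqb0 + mpb0) * nb_a) + nb_a * nb_a
    else:       # LQ branch
        lcmp = ilcm(nprow, npcol) // nprow
        ltau = numroc(ia + min(m, n) - 1, mb_a, myrow, rsrc_a, nprow)
        lwf = mb_a * (mpa0 + nqa0 + mb_a)
        inner = numroc(numroc(n + iroffb, mb_a, 0, 0, nprow), mb_a, 0, 0, lcmp)
        lws = max(mb_a * (mb_a - 1) // 2,
                  (npb0 + max(nqa0 + inner, nrhsqb0)) * mb_a) + mb_a * mb_a
    return ltau + max(lwf, lws)


if __name__ == "__main__":
    # Single-process grid, 2-by-2 blocks, 6-by-4 system with 2 right-hand sides.
    print(pgels_min_lwork(6, 4, 2, 1, 1, 1, 1,
                          2, 2, 0, 0, 2, 2, 0, 0,
                          0, 0, 1, 1))
```

The result is a per-process value: each process passes its own MYROW and MYCOL, so different processes in the grid may obtain different bounds.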

Output Parameters

a

On exit, if m ≥ n, sub(A) is overwritten by the details of its QR factorization as returned by p?geqrf; if m < n, sub(A) is overwritten by the details of its LQ factorization as returned by p?gelqf.

b

On exit, sub(B) is overwritten by the solution vectors, stored columnwise:

If trans = 'N' and m ≥ n, rows 1 to n of sub(B) contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of elements n+1 to m in that column.

If trans = 'N' and m < n, rows 1 to n of sub(B) contain the minimum norm solution vectors.

If trans = 'T' and m ≥ n, rows 1 to m of sub(B) contain the minimum norm solution vectors.

If trans = 'T' and m < n, rows 1 to m of sub(B) contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of elements m+1 to n in that column.
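In the least squares cases, the residual sum of squares can be recovered from the rows of sub(B) that follow the solution. The Python sketch below shows the arithmetic for one column that has already been gathered into a plain list; the helper name residual_sum_of_squares and the gathering step are assumptions, since on a process grid the column is normally distributed across several processes.

```python
def residual_sum_of_squares(column, solution_rows):
    """Sum of squared magnitudes of the entries after the solution rows.

    column        -- one full column of sub(B) on exit, as a list
    solution_rows -- n when trans = 'N' (m >= n), m when trans = 'T' (m < n)
    """
    # abs() makes the same expression valid for real and complex flavors.
    return sum(abs(x) ** 2 for x in column[solution_rows:])


if __name__ == "__main__":
    # A 4-row column whose first 2 rows hold the solution.
    print(residual_sum_of_squares([1.0, 2.0, 3.0, 4.0], 2))
```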

work(1)

On exit, work(1) contains the minimum value of lwork required for optimum performance.

info

(global) INTEGER.

= 0: the execution is successful.

< 0: if the i-th argument is an array and the j-th entry had an illegal value, then info = -(i*100+j); if the i-th argument is a scalar and had an illegal value, then info = -i.
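A negative info value can be decoded back into an argument position and, for array arguments, an entry index. The Python sketch below illustrates the encoding; the helper decode_info is hypothetical, and it assumes that argument positions are counted in the order of the call syntax shown at the top of this page and that any magnitude of 100 or more denotes an array entry.

```python
# Argument names in the order of the p?gels call syntax.
PGELS_ARGS = ("trans", "m", "n", "nrhs", "a", "ia", "ja", "desca",
              "b", "ib", "jb", "descb", "work", "lwork", "info")


def decode_info(info):
    """Return None on success, else (argument_name, entry_index_or_None)."""
    if info == 0:
        return None
    if info > 0:
        raise ValueError("positive info values are not described for p?gels")
    code = -info
    if code >= 100:
        # Array argument: info = -(i*100 + j).
        i, j = divmod(code, 100)
    else:
        # Scalar argument: info = -i.
        i, j = code, None
    if not 1 <= i <= len(PGELS_ARGS):
        raise ValueError("argument position out of range: %d" % i)
    return PGELS_ARGS[i - 1], j


if __name__ == "__main__":
    print(decode_info(-803))  # entry 3 of the 8th argument
    print(decode_info(-2))    # the 2nd argument, a scalar
```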

See Also