Developer Reference for Intel® oneAPI Math Kernel Library for Fortran

ID 766686
Date 10/31/2024
Public
Document Table of Contents

p?gesvd

Computes the singular value decomposition of a general matrix, optionally computing the left and/or right singular vectors.

Syntax

call psgesvd(jobu, jobvt, m, n, a, ia, ja, desca, s, u, iu, ju, descu, vt, ivt, jvt, descvt, work, lwork, info)

call pdgesvd(jobu, jobvt, m, n, a, ia, ja, desca, s, u, iu, ju, descu, vt, ivt, jvt, descvt, work, lwork, info)

call pcgesvd(jobu, jobvt, m, n, a, ia, ja, desca, s, u, iu, ju, descu, vt, ivt, jvt, descvt, work, lwork, rwork, info)

call pzgesvd(jobu, jobvt, m, n, a, ia, ja, desca, s, u, iu, ju, descu, vt, ivt, jvt, descvt, work, lwork, rwork, info)

Include Files

Description

The p?gesvd routine computes the singular value decomposition (SVD) of an m-by-n matrix A, optionally computing the left and/or right singular vectors. The SVD is written

A = U*Σ*VT,

where Σ is an m-by-n matrix that is zero except for its min(m, n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of Σ are the singular values of A and the columns of U and V are the corresponding right and left singular vectors, respectively. The singular values are returned in array s in decreasing order and only the first min(m,n) columns of U and rows of vt = VT are computed.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.

Notice revision #20201201

NOTE:

The distributed submatrix sub(A) must verify certain alignment properties. These expressions must be true:

  • mb_a = nb_a = nb
  • iroffa = icoffa
where:
  • iroffa = mod(ia-1, nb )
  • icoffa = mod(ja-1, nb )

Input Parameters

mp = number of local rows in A and U

nq = number of local columns in A and VT

size = min(m, n)

sizeq = number of local columns in U

sizep = number of local rows in VT

jobu

(global) CHARACTER*1. Specifies options for computing all or part of the matrix U.

If jobu = 'V', the first size columns of U (the left singular vectors) are returned in the array u;

If jobu ='N', no columns of U (no left singular vectors)are computed.

jobvt

(global) CHARACTER*1.

Specifies options for computing all or part of the matrix VT.

If jobvt = 'V', the first size rows of VT (the right singular vectors) are returned in the array vt;

If jobvt = 'N', no rows of VT(no right singular vectors) are computed.

m

(global) INTEGER. The number of rows of the matrix A(m 0).

n

(global) INTEGER. The number of columns in A(n 0).

a

(local). REAL for psgesvd

DOUBLE PRECISION for pdgesvd

COMPLEX for pcgesvd

COMPLEX*16 for pzgesvd

Block cyclic array, global size (m, n), local size (mp, nq).

ia, ja

(global) INTEGER. The row and column indices in the global matrix A indicating the first row and the first column of the submatrix A, respectively.

desca

(global and local) INTEGER array of size dlen_. The array descriptor for the distributed matrix A.

iu, ju

(global) INTEGER. The row and column indices in the global matrix U indicating the first row and the first column of the submatrix U, respectively.

descu

(global and local) INTEGER array of size dlen_. The array descriptor for the distributed matrix U.

ivt, jvt

(global) INTEGER. The row and column indices in the global matrix VT indicating the first row and the first column of the submatrix VT, respectively.

descvt

(global and local)INTEGER array of size dlen_. The array descriptor for the distributed matrix VT.

work

(local).REAL for psgesvd

DOUBLE PRECISION for pdgesvd

COMPLEX for pcgesvd

COMPLEX*16 for pzgesvd

Workspace array of size lwork

lwork

(local) INTEGER. The size of the array work;

lwork > 2 + 6*sizeb + max(watobd, wbdtosvd),

where sizeb = max(m, n), and watobd and wbdtosvd refer, respectively, to the workspace required to bidiagonalize the matrix A and to go from the bidiagonal matrix to the singular value decomposition USVT.

For watobd, the following holds:

watobd = max(max(wp?lange,wp?gebrd), max(wp?lared2d, wp?lared1d)),

where wp?lange, wp?lared1d, wp?lared2d, wp?gebrd are the workspaces required respectively for the subprograms p?lange, p?lared1d, p?lared2d, p?gebrd. Using the standard notation

mp = numroc(m, mb, MYROW, desca(ctxt_), NPROW),

nq = numroc(n, nb, MYCOL, desca(lld_), NPCOL),

the workspaces required for the above subprograms are

wp?lange = mp,

wp?lared1d = nq0,

wp?lared2d = mp0,

wp?gebrd = nb*(mp + nq + 1) + nq,

where nq0 and mp0 refer, respectively, to the values obtained at MYCOL = 0 and MYROW = 0. In general, the upper limit for the workspace is given by a workspace required on processor (0,0):

watobdnb*(mp0 + nq0 + 1) + nq0.

In case of a homogeneous process grid this upper limit can be used as an estimate of the minimum workspace for every processor.

For wbdtosvd, the following holds:

wbdtosvd = size*(wantu*nru + wantvt*ncvt) + max(w?bdsqr, max(wantu*wp?ormbrqln, wantvt*wp?ormbrprt)),

where

wantu(wantvt) = 1, if left/right singular vectors are wanted, and wantu(wantvt) = 0, otherwise. w?bdsqr, wp?ormbrqln, and wp?ormbrprt refer respectively to the workspace required for the subprograms ?bdsqr, p?ormbr(qln), and p?ormbr(prt), where qln and prt are the values of the arguments vect, side, and trans in the call to p?ormbr. nru is equal to the local number of rows of the matrix U when distributed 1-dimensional "column" of processes. Analogously, ncvt is equal to the local number of columns of the matrix VT when distributed across 1-dimensional "row" of processes. Calling the LAPACK procedure ?bdsqr requires

w?bdsqr = max(1, 2*size + (2*size - 4)* max(wantu, wantvt))

on every processor. Finally,

wp?ormbrqln = max((nb*(nb-1))/2, (sizeq+mp)*nb)+nb*nb,

wp?ormbrprt = max((mb*(mb-1))/2, (sizep+nq)*mb)+mb*mb,

If lwork = -1, then lwork is global input and a workspace query is assumed; the routine only calculates the minimum size for the work array. The required workspace is returned as the first element of work and no error message is issued by pxerbla.

rwork

REAL for pcgesvd

DOUBLE PRECISION for pzgesvd

Workspace array of size 1 + 4*sizeb. Not used for psgesvd and pdgesvd.

Output Parameters

a

On exit, the contents of a are destroyed.

s

(global). REAL for psgesvd and pcgesvd

DOUBLE PRECISION for pdgesvd and pzgesvd

Array of size size.

Contains the singular values of A sorted so that s(i) s(i+1).

u

(local). REAL for psgesvd

DOUBLE PRECISION for pdgesvd

COMPLEX for pcgesvd

COMPLEX*16 for pzgesvd

local size (mp, sizeq), global size (m, size)

If jobu = 'V', u contains the first min(m, n) columns of U.

If jobu = 'N' or 'O', u is not referenced.

vt

(local). REAL for psgesvd

DOUBLE PRECISION for pdgesvd

COMPLEX for pcgesvd

COMPLEX*16 for pzgesvd

local size (sizep, nq), global size (size, n)

If jobvt = 'V', vt contains the first size rows of VTif jobu = 'N', vt is not referenced.

work

On exit, if info = 0, then work(1) returns the required minimal size of lwork.

rwork

On exit, if info = 0, then rwork(1) returns the required size of rwork.

info

(global) INTEGER.

If info = 0, the execution is successful.

If info < 0, If info = -i, the ith parameter had an illegal value.

If info > 0 i, then if ?bdsqr did not converge,

If info = min(m,n) + 1, then p?gesvd has detected heterogeneity by finding that eigenvalues were not identical across the process grid. In this case, the accuracy of the results from p?gesvd cannot be guaranteed.

See Also