Developer Reference for Intel® oneAPI Math Kernel Library for Fortran

ID 766686
Date 7/13/2023
Public

p?gemr2d

Copies a submatrix from one general rectangular matrix to another.

Syntax

call psgemr2d(m, n, a, ia, ja, desca, b, ib, jb, descb, ictxt)

call pdgemr2d(m, n, a, ia, ja, desca, b, ib, jb, descb, ictxt)

call pcgemr2d(m, n, a, ia, ja, desca, b, ib, jb, descb, ictxt)

call pzgemr2d(m, n, a, ia, ja, desca, b, ib, jb, descb, ictxt)

call pigemr2d(m, n, a, ia, ja, desca, b, ib, jb, descb, ictxt)

Description

The p?gemr2d routine copies the indicated matrix or submatrix of A to the indicated matrix or submatrix of B. It provides a truly general copy from any block cyclically-distributed matrix or submatrix to any other block cyclically-distributed matrix or submatrix. Together with p?trmr2d, these are the only routines in the ScaLAPACK library that provide inter-context operations: they can take a matrix or submatrix A in context A (distributed over process grid A) and copy it to a matrix or submatrix B in context B (distributed over process grid B).

The two operand matrices or submatrices need not be related in any way, except that they must have the same global size and both must be legal block cyclically-distributed matrices or submatrices. This means that they can, for example, be distributed across different process grids, have different block sizes and different matrix starting points, or be contained in distributed matrices of different sizes.

Take care when context A is disjoint from context B. The general rules for which parameters need to be set are:

  • All calling processes must have the correct m and n.

  • Processes in context A must correctly define all parameters describing A.

  • Processes in context B must correctly define all parameters describing B.

  • Processes which are not members of context A must pass ctxt_a = -1 and need not set other parameters describing A.

  • Processes which are not members of context B must pass ctxt_b = -1 and need not set other parameters describing B.

Because of its generality, p?gemr2d can be used for many operations not usually associated with copy routines. For instance, it can be used to take a matrix on one process and distribute it across a process grid, or the reverse. If a supercomputer is grouped into a virtual parallel machine with a workstation, this routine can be used to move the matrix from the workstation to the supercomputer and back. In ScaLAPACK, it is called to copy matrices from a two-dimensional process grid to a one-dimensional process grid. It can also be used to redistribute matrices so that various component libraries can each use the distribution that gives them the best performance.

Note that this routine requires an array descriptor with dtype_ = 1.
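The rules above can be illustrated with a minimal sketch that scatters a matrix held on a single process (context A, a 1-by-1 grid) across a 2-by-2 grid (context B). The sizes, block sizes, and variable names are illustrative assumptions, not part of this reference. The sketch also assumes the common BLACS idiom in which blacs_gridinfo returns myrow = -1 on a process that is not a member of the queried grid; such processes keep ctxt_ = -1 in the corresponding descriptor and pass a dummy array.

```fortran
program gemr2d_sketch
  implicit none
  integer, parameter :: m = 8, n = 8     ! global matrix size (illustrative)
  integer, parameter :: mb = 2, nb = 2   ! block sizes on grid B (illustrative)
  integer :: iam, nprocs, ictxt, ctxt_a, ctxt_b
  integer :: nprow, npcol, myrow, mycol
  integer :: desca(9), descb(9), info, locr, locc, i, j
  logical :: in_a, in_b
  double precision, allocatable :: a(:,:), b(:,:)
  integer, external :: numroc

  call blacs_pinfo(iam, nprocs)
  if (nprocs < 4) then
    if (iam == 0) print *, 'This sketch needs at least 4 processes.'
    call blacs_exit(0)
    stop
  end if

  ! Context spanning every process: passed as the ictxt argument.
  call blacs_get(-1, 0, ictxt)
  call blacs_gridinit(ictxt, 'R', 1, nprocs)

  ! Context A: 1x1 grid (process 0). Context B: 2x2 grid (processes 0-3).
  call blacs_get(-1, 0, ctxt_a)
  call blacs_gridinit(ctxt_a, 'R', 1, 1)
  call blacs_get(-1, 0, ctxt_b)
  call blacs_gridinit(ctxt_b, 'R', 2, 2)

  ! Defaults for non-members: dtype_ = 1 and ctxt_ = -1.
  desca = 0; desca(1) = 1; desca(2) = -1
  descb = 0; descb(1) = 1; descb(2) = -1

  ! Members of context A fully describe A and own the source data.
  call blacs_gridinfo(ctxt_a, nprow, npcol, myrow, mycol)
  in_a = (myrow >= 0)
  if (in_a) then
    allocate(a(m, n))
    do j = 1, n
      do i = 1, m
        a(i, j) = dble(i + (j - 1) * m)
      end do
    end do
    call descinit(desca, m, n, m, n, 0, 0, ctxt_a, m, info)
  else
    allocate(a(1, 1))   ! dummy, never referenced
  end if

  ! Members of context B fully describe B and allocate their local piece.
  call blacs_gridinfo(ctxt_b, nprow, npcol, myrow, mycol)
  in_b = (myrow >= 0)
  if (in_b) then
    locr = max(1, numroc(m, mb, myrow, 0, nprow))
    locc = max(1, numroc(n, nb, mycol, 0, npcol))
    allocate(b(locr, locc))
    call descinit(descb, m, n, mb, nb, 0, 0, ctxt_b, locr, info)
  else
    allocate(b(1, 1))   ! dummy, never referenced
  end if

  ! Every process in ictxt makes the call, with the same m and n.
  call pdgemr2d(m, n, a, 1, 1, desca, b, 1, 1, descb, ictxt)

  if (in_b) call blacs_gridexit(ctxt_b)
  if (in_a) call blacs_gridexit(ctxt_a)
  call blacs_gridexit(ictxt)
  call blacs_exit(0)
end program gemr2d_sketch
```

Swapping the roles of the two descriptors and arrays in the call would gather the distributed matrix back onto the single process. The sketch must be linked against the ScaLAPACK and BLACS libraries and launched under MPI.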

Input Parameters

m

(global) INTEGER. The number of rows of matrix A to be copied (m ≥ 0).

n

(global) INTEGER. The number of columns of matrix A to be copied (n ≥ 0).

a

(local)

REAL for psgemr2d

DOUBLE PRECISION for pdgemr2d

COMPLEX for pcgemr2d

DOUBLE COMPLEX for pzgemr2d

INTEGER for pigemr2d.

Pointer into the local memory to an array of size lld_a by LOCc(ja+n-1) containing the source matrix A.

ia, ja

(global) INTEGER. The row and column indices in the array A indicating the first row and the first column, respectively, of the submatrix of A to copy. 1 ≤ ia ≤ total_rows_in_a - m + 1, 1 ≤ ja ≤ total_columns_in_a - n + 1.

desca

(global and local) INTEGER array of size dlen_. The array descriptor for the distributed matrix A.

Only dtype_a = 1 is supported, so dlen_ = 9.

If the calling process is not part of the context of A, ctxt_a must be equal to -1.

ib, jb

(global) INTEGER. The row and column indices in the array B indicating the first row and the first column, respectively, of the submatrix of B to which the matrix is copied. 1 ≤ ib ≤ total_rows_in_b - m + 1, 1 ≤ jb ≤ total_columns_in_b - n + 1.

descb

(global and local) INTEGER array of size dlen_. The array descriptor for the distributed matrix B.

Only dtype_b = 1 is supported, so dlen_ = 9.

If the calling process is not part of the context of B, ctxt_b must be equal to -1.

ictxt

(global) INTEGER.

The context encompassing at least the union of all processes in context A and context B. All processes in the context ictxt must call this routine, even if they do not own a piece of either matrix.
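The index constraints on ia, ja, ib, and jb can be checked before the call. The following is a minimal sketch of such a check; the module and function names (gemr2d_checks, window_fits) are hypothetical helpers, not part of the library. The total row and column counts are read from entries 3 and 4 of a type-1 descriptor (m_ and n_).

```fortran
module gemr2d_checks
  implicit none
contains
  ! Returns .true. when an m-by-n window whose first element is (i, j)
  ! fits inside a global matrix of total_rows by total_cols.
  pure logical function window_fits(m, n, i, j, total_rows, total_cols)
    integer, intent(in) :: m, n, i, j, total_rows, total_cols
    window_fits = m >= 0 .and. n >= 0 .and.                &
                  i >= 1 .and. i <= total_rows - m + 1 .and. &
                  j >= 1 .and. j <= total_cols - n + 1
  end function window_fits
end module gemr2d_checks

program check_demo
  use gemr2d_checks
  implicit none
  integer :: desca(9)

  ! Illustrative descriptor for a 10x10 global matrix with 2x2 blocks.
  desca = [1, 0, 10, 10, 2, 2, 0, 0, 10]

  ! A 4x4 window starting at (7, 7) ends exactly at row and column 10.
  print *, window_fits(4, 4, 7, 7, desca(3), desca(4))
  ! Starting at row 8 would require row 11, which does not exist.
  print *, window_fits(4, 4, 8, 7, desca(3), desca(4))
end program check_demo
```

The same function applies to the destination by passing ib, jb, and the m_ and n_ entries of descb.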

Output Parameters

b

REAL for psgemr2d

DOUBLE PRECISION for pdgemr2d

COMPLEX for pcgemr2d

DOUBLE COMPLEX for pzgemr2d

INTEGER for pigemr2d.

Pointer into the local memory to an array of size lld_b by LOCc(jb+n-1).

Overwritten by the submatrix from A.
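The local array sizes above are written in terms of LOCc(k), the number of columns of a k-column block-cyclic distribution that fall on the calling process column. In practice this count is normally obtained from the ScaLAPACK tool function numroc. The sketch below is a standalone illustration of the same counting rule so it can be run without the library; the module and function names (local_sizes, local_count) are hypothetical.

```fortran
module local_sizes
  implicit none
contains
  ! Number of rows or columns, out of n distributed in blocks of nb over
  ! nprocs processes starting at process isrc, that land on process iproc.
  pure integer function local_count(n, nb, iproc, isrc, nprocs)
    integer, intent(in) :: n, nb, iproc, isrc, nprocs
    integer :: mydist, nblocks, extra

    mydist  = mod(nprocs + iproc - isrc, nprocs)  ! distance from source process
    nblocks = n / nb                              ! number of full blocks
    local_count = (nblocks / nprocs) * nb         ! full rounds of blocks
    extra = mod(nblocks, nprocs)                  ! leftover full blocks
    if (mydist < extra) then
      local_count = local_count + nb              ! one more full block
    else if (mydist == extra) then
      local_count = local_count + mod(n, nb)      ! the trailing partial block
    end if
  end function local_count
end module local_sizes

program local_demo
  use local_sizes
  implicit none
  integer, parameter :: jb = 1, n = 9, nb = 2, npcol = 2
  integer :: mycol

  ! Minimum number of local columns of b on each process column.
  do mycol = 0, npcol - 1
    print *, 'process column', mycol, ':', &
             local_count(jb + n - 1, nb, mycol, 0, npcol)
  end do
end program local_demo
```

With these illustrative values, the nine global columns split into five on process column 0 and four on process column 1.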

See Also