Developer Reference for Intel® oneAPI Math Kernel Library for Fortran

ID 766686
Date 6/24/2024
Public

ScaLAPACK Array Descriptors

ScaLAPACK uses a two-dimensional block-cyclic data distribution as the layout for dense matrix computations. This distribution balances work well across the available processors and allows the use of Level 3 BLAS routines for efficient local computations. The information needed to map each global matrix (array) to its corresponding process and memory location is stored in an array called the array descriptor, which is associated with that global matrix. The size of the array descriptor is denoted as dlen_.
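
As an illustration of how a block-cyclic layout assigns data, the following Python sketch maps a global index to the process that owns it and to its position in that process's local storage. It is not part of the library: the function names owner_and_local_index and owner_2d are hypothetical, indices are 0-based for convenience (the Fortran interfaces use 1-based indices), and the source process defaults to 0 as an assumption.

```python
def owner_and_local_index(g, nb, nprocs, src=0):
    """Map a 0-based global index to (process coordinate, 0-based local index)
    for a one-dimensional block-cyclic distribution.

    g      : global index (0-based, an assumption of this sketch)
    nb     : blocking factor
    nprocs : number of processes along this dimension
    src    : process coordinate that holds the first block
    """
    block = g // nb                  # global block that contains the index
    proc = (src + block) % nprocs    # blocks are dealt round-robin starting at src
    local_block = block // nprocs    # earlier blocks already stored on that process
    return proc, local_block * nb + g % nb


def owner_2d(i, j, mb, nb, nprow, npcol, rsrc=0, csrc=0):
    """Apply the one-dimensional rule independently to rows and columns."""
    prow, li = owner_and_local_index(i, mb, nprow, rsrc)
    pcol, lj = owner_and_local_index(j, nb, npcol, csrc)
    return (prow, pcol), (li, lj)
```

Because rows and columns are distributed independently, the two-dimensional mapping is just the one-dimensional rule applied twice, which is why a single pair of blocking factors and source coordinates is enough to describe the layout.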

Let A be a two-dimensional block-cyclically distributed matrix with the array descriptor desca. The meaning of each array descriptor element depends on the type of the matrix A. The tables "Array descriptor for dense matrices" and "Array descriptor for narrow-band and tridiagonal matrices" describe the meaning of each element for the two types of matrices.

Array descriptor for dense matrices (dlen_=9)
| Element Name | Stored in | Description | Element Index Number |
|---|---|---|---|
| dtype_a | desca(dtype_) | Descriptor type (=1 for dense matrices). | 1 |
| ctxt_a | desca(ctxt_) | BLACS context handle for the process grid. | 2 |
| m_a | desca(m_) | Number of rows in the global matrix A. | 3 |
| n_a | desca(n_) | Number of columns in the global matrix A. | 4 |
| mb_a | desca(mb_) | Row blocking factor. | 5 |
| nb_a | desca(nb_) | Column blocking factor. | 6 |
| rsrc_a | desca(rsrc_) | Process row over which the first row of the global matrix A is distributed. | 7 |
| csrc_a | desca(csrc_) | Process column over which the first column of the global matrix A is distributed. | 8 |
| lld_a | desca(lld_) | Leading dimension of the local matrix A. | 9 |
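
To make the element order of the dense-matrix descriptor concrete, here is a minimal Python sketch that assembles a 9-element list in that order. It is only an illustration: in a Fortran program the ScaLAPACK tool routine descinit is normally used to fill the descriptor. The names Dense and make_dense_descriptor are hypothetical, the positions are 0-based (the element index number in the table is the position plus 1), and the sanity checks are assumptions of this sketch rather than a statement of what the library validates.

```python
from enum import IntEnum


class Dense(IntEnum):
    """0-based positions; the Fortran element index is the value plus 1."""
    DTYPE = 0
    CTXT = 1
    M = 2
    N = 3
    MB = 4
    NB = 5
    RSRC = 6
    CSRC = 7
    LLD = 8


def make_dense_descriptor(ctxt, m, n, mb, nb, rsrc, csrc, lld):
    """Return a 9-element list laid out like the dense-matrix descriptor."""
    # Basic sanity checks (assumed for this sketch).
    if m < 0 or n < 0:
        raise ValueError("m and n must be non-negative")
    if mb < 1 or nb < 1:
        raise ValueError("blocking factors must be at least 1")
    if lld < 1:
        raise ValueError("lld must be at least 1")

    desc = [0] * len(Dense)
    desc[Dense.DTYPE] = 1      # 1 marks a dense matrix
    desc[Dense.CTXT] = ctxt    # BLACS context handle
    desc[Dense.M] = m          # global row count
    desc[Dense.N] = n          # global column count
    desc[Dense.MB] = mb        # row blocking factor
    desc[Dense.NB] = nb        # column blocking factor
    desc[Dense.RSRC] = rsrc    # process row holding the first row
    desc[Dense.CSRC] = csrc    # process column holding the first column
    desc[Dense.LLD] = lld      # leading dimension of the local matrix
    return desc
```

For example, make_dense_descriptor(ctxt=0, m=10, n=8, mb=3, nb=2, rsrc=0, csrc=0, lld=6) yields a list whose position Dense.LLD holds 6, corresponding to element 9 in the table.
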
Array descriptor for narrow-band and tridiagonal matrices (dlen_=7)
| Element Name | Stored in | Description | Element Index Number |
|---|---|---|---|
| dtype_a | desca(dtype_) | Descriptor type: dtype_a=501 for a 1-by-P grid, dtype_a=502 for a P-by-1 grid. | 1 |
| ctxt_a | desca(ctxt_) | BLACS context handle indicating the BLACS process grid over which the global matrix A is distributed. The context itself is global, but the handle (the integer value) can vary. | 2 |
| n_a | desca(n_) | The size of the matrix dimension being distributed. | 3 |
| nb_a | desca(nb_) | The blocking factor used to distribute the distributed dimension of the matrix A. | 4 |
| src_a | desca(src_) | The process row or column over which the first row or column of the matrix A is distributed. | 5 |
| lld_a | desca(lld_) | The leading dimension of the local matrix storing the local blocks of the distributed matrix A. The minimum value of lld_a depends on dtype_a: for dtype_a=501, lld_a ≥ max(size of undistributed dimension, 1); for dtype_a=502, lld_a ≥ max(nb_a, 1). | 6 |
| Not applicable | | Reserved for future use. | 7 |
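
The lower bound on lld_a for narrow-band and tridiagonal descriptors can be expressed as a small helper. The following Python sketch is illustrative only: min_lld_band is a hypothetical name, and undistributed_size stands for the size of the undistributed dimension, which the caller must supply.

```python
def min_lld_band(dtype, nb, undistributed_size):
    """Smallest permitted lld_a for a narrow-band or tridiagonal descriptor.

    dtype              : 501 (1-by-P grid) or 502 (P-by-1 grid)
    nb                 : blocking factor nb_a
    undistributed_size : size of the undistributed dimension (used for 501)
    """
    if dtype == 501:
        return max(undistributed_size, 1)
    if dtype == 502:
        return max(nb, 1)
    raise ValueError("dtype must be 501 or 502 for this descriptor kind")
```

A caller could compare the leading dimension of its local array against this value before passing the descriptor to a band or tridiagonal routine.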

Similar notation is used for other matrices. For example, lld_b is the leading dimension of the local matrix storing the local blocks of the distributed matrix B, and dtype_z is the descriptor type of the global matrix Z.

The numbers of rows and columns of a global dense matrix that a particular process in a grid receives after the data is distributed are denoted by LOCr() and LOCc(), respectively. To compute these numbers, you can use the ScaLAPACK tool routine numroc.
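
The counting rule behind LOCr() and LOCc() can be sketched in Python as follows. This is a re-statement of the arithmetic for illustration, not the library routine itself; the argument order follows the numroc interface (n, nb, iproc, isrcproc, nprocs), and local_shape is a hypothetical helper added for this example.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows or columns of a block-cyclically distributed
    dimension of size n that land on process coordinate iproc."""
    # Distance of this process from the one that holds the first block.
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb                     # number of complete blocks
    count = (nblocks // nprocs) * nb      # full rounds every process receives
    extrablks = nblocks % nprocs          # complete blocks left after full rounds
    if mydist < extrablks:
        count += nb                       # one extra complete block
    elif mydist == extrablks:
        count += n % nb                   # the trailing partial block, if any
    return count


def local_shape(m, n, mb, nb, myrow, mycol, rsrc, csrc, nprow, npcol):
    """Return (LOCr, LOCc) for the process at grid position (myrow, mycol)."""
    return (numroc(m, mb, myrow, rsrc, nprow),
            numroc(n, nb, mycol, csrc, npcol))
```

For instance, with n = 10, nb = 3, and two processes starting at process 0, the blocks of sizes 3, 3, 3, 1 alternate between the processes, so numroc(10, 3, 0, 0, 2) is 6 and numroc(10, 3, 1, 0, 2) is 4.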

After the block-cyclic distribution of global data is done, you may choose to perform an operation on a submatrix sub(A) of the global matrix A. For dense matrices, sub(A) is defined by the following six values:

  • m: The number of rows of sub(A).
  • n: The number of columns of sub(A).
  • a: A pointer to the local array that holds this process's part of the entire global matrix A (not only of sub(A)).
  • ia: The global row index in A of the first row of sub(A).
  • ja: The global column index in A of the first column of sub(A).
  • desca: The array descriptor for the global matrix A.
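
As a small illustration of how these values fit together, the following Python sketch checks that a submatrix lies inside the global matrix and returns its global index ranges. It assumes the usual ScaLAPACK convention that sub(A) denotes A(ia:ia+m-1, ja:ja+n-1) with 1-based indices; the function name check_submatrix is hypothetical, and m_a and n_a stand for the global dimensions stored in desca.

```python
def check_submatrix(m, n, ia, ja, m_a, n_a):
    """Validate a submatrix specification against the global matrix size.

    Returns the inclusive 1-based global (row range, column range) of sub(A),
    assuming sub(A) = A(ia:ia+m-1, ja:ja+n-1).
    """
    if m < 0 or n < 0:
        raise ValueError("m and n must be non-negative")
    if ia < 1 or ja < 1:
        raise ValueError("ia and ja are 1-based and must be at least 1")
    if ia + m - 1 > m_a or ja + n - 1 > n_a:
        raise ValueError("sub(A) extends past the edge of the global matrix A")
    return (ia, ia + m - 1), (ja, ja + n - 1)
```

For a 10-by-8 global matrix, check_submatrix(3, 2, 2, 4, 10, 8) describes global rows 2 through 4 and global columns 4 through 5.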
