Developer Reference for Intel® oneAPI Math Kernel Library for C

ID 766684
Date 3/22/2024
Public


Convolution and Correlation Data Allocation

This section explains the relation between:

  • mathematical finite functions u, v, w introduced in Mathematical Notation and Definitions;

  • multi-dimensional input and output data vectors representing the functions u, v, w;

  • arrays x, y, z used to store the input and output data vectors in computer memory.

The convolution and correlation routine parameters that determine the allocation of input and output data are the following:

  • Data arrays x, y, z

  • Shape arrays xshape, yshape, zshape

  • Strides within arrays xstride, ystride, zstride

  • Parameters start, decimation

Finite Functions and Data Vectors

The finite functions u(p), v(q), and w(r) introduced above are represented as multi-dimensional vectors of input and output data:

inputu(i1,...,idims) for u(p1,...,pN)

inputv(j1,...,jdims) for v(q1,...,qN)

output(k1,...,kdims) for w(r1,...,rN).

Parameter dims represents the number of dimensions and is equal to N.

The parameters xshape, yshape, and zshape define the shapes of input/output vectors:

inputu(i1,...,idims) is defined if 1 ≤ in ≤ xshape(n) for every n=1,...,dims

inputv(j1,...,jdims) is defined if 1 ≤ jn ≤ yshape(n) for every n=1,...,dims

output(k1,...,kdims) is defined if 1 ≤ kn ≤ zshape(n) for every n=1,...,dims.

Relation between the input vectors and the functions u and v is defined by the following formulas:

inputu(i1,...,idims)= u(p1,...,pN), where pn = Pnmin + (in-1) for every n

inputv(j1,...,jdims)= v(q1,...,qN), where qn=Qnmin + (jn-1) for every n.

The relation between the output vector and the function w(r) is similar (but only in the case when parameters start and decimation are not defined):

output(k1,...,kdims)= w(r1,...,rN), where rn=Rnmin + (kn-1) for every n.

If the parameter start is defined, every start(n) must belong to the interval Rnmin ≤ start(n) ≤ Rnmax. If defined, start(n) replaces Rnmin in the formula:

output(k1,...,kdims)=w(r1,...,rN), where rn=start(n) + (kn-1)

If the parameter decimation is defined, it changes the relation according to the following formula:

output(k1,...,kdims)=w(r1,...,rN), where rn= Rnmin + (kn-1)*decimation(n)

If both parameters start and decimation are defined, the formula is as follows:

output(k1,...,kdims)=w(r1,...,rN), where rn=start(n) + (kn-1)*decimation(n)

The convolution and correlation software checks the values of zshape, start, and decimation during task commitment. If rn exceeds Rnmax for some kn, n=1,...,dims, an error is raised.
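The index relations above can be sketched in C as follows. This is a minimal illustration, not oneMKL code: the helper names output_arg and output_range_ok are hypothetical, decimation is assumed to be positive, and the caller is assumed to pass start(n) = Rnmin or decimation(n) = 1 when the corresponding parameter is not defined.

```c
#include <assert.h>

/* Hypothetical helper (not a oneMKL function): maps a 1-based output
   index k along one dimension to the argument r of w(r), using
   r = start + (k-1)*decimation. */
int output_arg(int k, int start, int decimation)
{
    assert(k >= 1);
    assert(decimation >= 1);   /* assumption: positive decimation */
    return start + (k - 1) * decimation;
}

/* Hypothetical check modeled on the one described for task commitment:
   returns 1 if start(n) lies in [rmin(n), rmax(n)] and the last output
   index kn = zshape(n) does not push rn past rmax(n), for every n;
   returns 0 otherwise. Arrays are 0-based C arrays of length dims. */
int output_range_ok(int dims, const int *zshape, const int *start,
                    const int *decimation, const int *rmin, const int *rmax)
{
    for (int n = 0; n < dims; ++n) {
        if (start[n] < rmin[n] || start[n] > rmax[n])
            return 0;
        /* rn grows with kn, so kn = zshape(n) is the worst case. */
        if (output_arg(zshape[n], start[n], decimation[n]) > rmax[n])
            return 0;
    }
    return 1;
}
```

For example, with zshape = 4, start = 2, and decimation = 3, the output values correspond to r = 2, 5, 8, 11, so the check passes only if Rmax is at least 11.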

Allocation of Data Vectors

The parameter arrays x and y contain the input data vectors in memory, while the array z is intended for storing the output data vector. To access the memory, the convolution and correlation software uses only pointers to these arrays and ignores the array shapes.

For parameters x, y, and z, you can provide one-dimensional arrays, provided that the actual length of each array is sufficient to store its data vector.

The allocation of the input and output data inside the arrays x, y, and z is described below assuming that the arrays are one-dimensional. Given multi-dimensional indices i, j, k ∈ Z^N, one-dimensional indices e, f, g ∈ Z are defined such that:

inputu(i1,...,idims) is allocated at x(e)

inputv(j1,...,jdims) is allocated at y(f)

output(k1,...,kdims) is allocated at z(g).

The indices e, f, and g are defined as follows:

e = 1 + Σ xstride(n)·dx(n) (the sum is for all n=1,...,dims)

f = 1 + Σ ystride(n)·dy(n) (the sum is for all n=1,...,dims)

g = 1 + Σ zstride(n)·dz(n) (the sum is for all n=1,...,dims)

The distances dx(n), dy(n), and dz(n) depend on the sign of the stride:

dx(n) = in-1 if xstride(n)>0, or dx(n) = in-xshape(n) if xstride(n)<0

dy(n) = jn-1 if ystride(n)>0, or dy(n) = jn-yshape(n) if ystride(n)<0

dz(n) = kn-1 if zstride(n)>0, or dz(n) = kn-zshape(n) if zstride(n)<0

The definitions of indices e, f, and g assume that the indices of arrays x, y, and z start from 1:

x(e) is defined for e=1,...,length(x)

y(f) is defined for f=1,...,length(y)

z(g) is defined for g=1,...,length(z)
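The definitions of e, f, and g share one form, so a single routine can illustrate all three. The sketch below is an illustration only: linear_index is a hypothetical helper, not a oneMKL function, and it treats a zero stride as invalid because the text defines the distances only for positive and negative strides.

```c
#include <assert.h>

/* Hypothetical helper (not a oneMKL function): returns the 1-based
   position 1 + sum over n of stride(n)*d(n) for a 1-based multi-index.
   idx, shape, and stride are 0-based C arrays of length dims. */
long linear_index(int dims, const int *idx, const int *shape,
                  const int *stride)
{
    long pos = 1;
    for (int n = 0; n < dims; ++n) {
        assert(idx[n] >= 1 && idx[n] <= shape[n]);
        assert(stride[n] != 0);   /* assumption: zero stride is not allowed */
        /* d(n) = idx-1 for a positive stride, idx-shape for a negative one. */
        long d = (stride[n] > 0) ? (long)idx[n] - 1
                                 : (long)idx[n] - shape[n];
        /* For a negative stride, d <= 0, so the product is never negative. */
        pos += (long)stride[n] * d;
    }
    return pos;
}
```

Because the returned position is 1-based, a C caller would subtract 1 before indexing, for example z[linear_index(dims, k, zshape, zstride) - 1].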

Below is a detailed explanation of how elements of the multi-dimensional output vector are stored in the array z for the one-dimensional and two-dimensional cases.

One-dimensional case. If dims=1, then zshape is the number of the output values to be stored in the array z. The actual length of array z may be greater than zshape elements.

If zstride>1, output values are stored with that stride: output(1) is stored to z(1), output(2) is stored to z(1+zstride), and so on. Hence, the actual length of z must be at least 1+zstride*(zshape-1) elements.

If zstride<0, it still defines the stride between elements of array z. However, the order of the used elements is the opposite. For the k-th output value, output(k) is stored in z(1+|zstride|*(zshape-k)), where |zstride| is the absolute value of zstride. The actual length of the array z must be at least 1+|zstride|*(zshape - 1) elements.
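The one-dimensional rules can be restated as two small functions. This is a sketch under the definitions given in this section; min_length_1d and position_1d are hypothetical names, not oneMKL routines, and positions are 1-based as in the text.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: smallest length of z that holds zshape output
   values for dims = 1, that is, 1 + |zstride|*(zshape-1). */
long min_length_1d(int zshape, int zstride)
{
    assert(zshape >= 1 && zstride != 0);
    return 1 + labs((long)zstride) * (zshape - 1);
}

/* Hypothetical helper: 1-based position in z at which output(k) is stored. */
long position_1d(int k, int zshape, int zstride)
{
    assert(k >= 1 && k <= zshape && zstride != 0);
    if (zstride > 0)
        return 1 + (long)zstride * (k - 1);
    /* Negative stride: same spacing, reverse order. */
    return 1 + labs((long)zstride) * (zshape - k);
}
```

With zshape = 4 and zstride = 3, the values land at z(1), z(4), z(7), z(10). With zstride = -3 the same positions are used in the opposite order, and in both cases z needs at least 10 elements.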

Two-dimensional case. If dims=2, the output data is a two-dimensional matrix. The value zstride(1) defines the stride inside matrix columns, that is, the stride between the output(k1, k2) and output(k1+1, k2) for every pair of indices k1, k2. On the other hand, zstride(2) defines the stride between columns, that is, the stride between output(k1,k2) and output(k1,k2+1).

If zstride(2) is greater than zshape(1), this causes sparse allocation of columns. If the value of zstride(2) is smaller than zshape(1), this may result in the transposition of the output matrix. For example, if zshape = (2,3), you can define zstride = (3,1) to store the output values as a transposed matrix of shape 3x2.
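The zshape = (2,3), zstride = (3,1) example can be traced with a short sketch. The function fill_transposed is hypothetical and only demonstrates the index arithmetic; it stores the marker value 10*k1 + k2 in place of output(k1,k2) so that the resulting layout is easy to read.

```c
#include <assert.h>

/* Hypothetical demonstration: writes a marker for each output(k1,k2)
   into z using zshape = (2,3) and zstride = (3,1). z is a 0-based C
   array, so the 1-based position g is stored at z[g-1]. */
void fill_transposed(int z[6])
{
    const int zshape[2] = {2, 3};
    const int zstride[2] = {3, 1};
    for (int k2 = 1; k2 <= zshape[1]; ++k2) {
        for (int k1 = 1; k1 <= zshape[0]; ++k1) {
            /* Both strides are positive, so d(n) = kn - 1. */
            int g = 1 + zstride[0] * (k1 - 1) + zstride[1] * (k2 - 1);
            assert(g >= 1 && g <= 6);
            z[g - 1] = 10 * k1 + k2;
        }
    }
}
```

After the call, z holds 11, 12, 13, 21, 22, 23: each row of the 2x3 output is contiguous, which is the column-by-column layout of the 3x2 transposed matrix.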

Whether or not zstride implies this kind of transformation, you must ensure that different elements output(k1,...,kdims) are stored in different locations z(g).
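One way to verify this requirement before calling the library is a brute-force check over all index combinations. The sketch below is an assumption-laden illustration, not a oneMKL facility: strides_are_distinct is a hypothetical name, zero strides and dims < 1 are rejected, and the approach is practical only for small shapes because it visits every element.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical check: returns 1 if every multi-index maps to its own
   position, 0 if two elements collide, and -1 on invalid input or
   allocation failure. shape and stride are 0-based arrays of length dims. */
int strides_are_distinct(int dims, const int *shape, const int *stride)
{
    if (dims < 1)
        return -1;
    /* Largest reachable 1-based position: 1 + sum |stride|*(shape-1). */
    long maxpos = 1;
    for (int n = 0; n < dims; ++n) {
        if (shape[n] < 1 || stride[n] == 0)
            return -1;
        maxpos += labs((long)stride[n]) * (shape[n] - 1);
    }
    unsigned char *used = calloc((size_t)maxpos, 1);
    int *t = calloc((size_t)dims, sizeof *t);
    if (used == NULL || t == NULL) {
        free(used);
        free(t);
        return -1;
    }
    /* t[n] runs over 0..shape[n]-1. The sign of a stride only reverses
       the order along that dimension, so |stride| suffices here. */
    int ok = 1;
    for (;;) {
        long pos = 1;
        for (int n = 0; n < dims; ++n)
            pos += labs((long)stride[n]) * t[n];
        if (used[pos - 1]) {
            ok = 0;
            break;
        }
        used[pos - 1] = 1;
        /* Advance the multi-index like an odometer. */
        int n = 0;
        while (n < dims && ++t[n] == shape[n]) {
            t[n] = 0;
            ++n;
        }
        if (n == dims)
            break;
    }
    free(used);
    free(t);
    return ok;
}
```

For zshape = (2,3), the strides (1,2) and (3,1) both pass this check, while (1,1) fails because output(2,1) and output(1,2) would share z(2).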