Developer Reference for Intel® oneAPI Math Kernel Library for Fortran

ID 766686
Date 10/31/2024
Public

Convolution and Correlation Usage Examples

This section demonstrates how you can use the Intel® oneAPI Math Kernel Library (oneMKL) routines to perform some common convolution and correlation operations in both single-threaded and multithreaded calculations. The following two sample functions, scond1 and sconf1, simulate the convolution and correlation functions SCOND and SCONF found in the IBM ESSL* library. The functions assume single-threaded calculations and can be used with C or C++ compilers.

Function scond1 for Single-Threaded Calculations

#include "mkl_vsl.h"

int scond1(
    float h[], int inch,
    float x[], int incx,
    float y[], int incy,
    int nh, int nx, int iy0, int ny)
{
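    int status;
    VSLConvTaskPtr task;
    /* create a direct-method task; stop early if creation fails */
    status = vslsConvNewTask1D(&task, VSL_CONV_MODE_DIRECT, nh, nx, ny);
    if (status != VSL_STATUS_OK)
        return status;
    /* select the first output index to compute */
    vslConvSetStart(task, &iy0);
    status = vslsConvExec1D(task, h, inch, x, incx, y, incy);
    vslConvDeleteTask(&task);
    return status;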
}
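
To check results produced by a function such as scond1, it can help to compare them with a plain C reference. The following is a minimal sketch, not part of oneMKL or ESSL: the name ref_conv1d is hypothetical, strides are assumed to be positive, and the output is assumed to be the slice y[k] = sum over j of h[j]*x[iy0+k-j] of the full linear convolution, with terms outside the input range treated as zero.

```c
#include <assert.h>

/* Hypothetical reference helper (not a oneMKL or ESSL routine).
   Computes ny outputs of the linear convolution of h (nh elements)
   and x (nx elements), starting at output index iy0. */
void ref_conv1d(const float h[], int inch, int nh,
                const float x[], int incx, int nx,
                float y[], int incy, int iy0, int ny)
{
    /* assumption of this sketch: positive strides only */
    assert(inch > 0 && incx > 0 && incy > 0);

    for (int k = 0; k < ny; k++) {
        int n = iy0 + k;              /* index in the full result */
        float acc = 0.0f;
        for (int j = 0; j < nh; j++) {
            int i = n - j;            /* matching index in x */
            if (i >= 0 && i < nx)     /* out-of-range terms are zero */
                acc += h[j * inch] * x[i * incx];
        }
        y[k * incy] = acc;
    }
}
```

For h = {1, 2, 3} and x = {1, 1, 1, 1}, the full convolution is {1, 3, 6, 6, 5, 3}, so a call with iy0 = 1 and ny = 3 fills y with {3, 6, 6}.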
 
  

Function sconf1 for Single-Threaded Calculations

#include "mkl_vsl.h"
int sconf1(
    int init,
    float h[], int inc1h,
    float x[], int inc1x, int inc2x,
    float y[], int inc1y, int inc2y,
    int nh, int nx, int m, int iy0, int ny,
    void* aux1, int naux1, void* aux2, int naux2)
{
    int status = VSL_STATUS_OK;
    /* assume that aux1!=0 and naux1 is big enough */
    VSLConvTaskPtr* task = (VSLConvTaskPtr*)aux1;
    if (init != 0) {
        /* initialization: create the task and keep it in aux1 */
        status = vslsConvNewTaskX1D(task, VSL_CONV_MODE_FFT,
            nh, nx, ny, h, inc1h);
    } else {
        /* calculations: */
        int i;
        vslConvSetStart(*task, &iy0);
        for (i = 0; i < m; i++) {
            float* xi = &x[inc2x * i];
            float* yi = &y[inc2y * i];
            /* task is implicitly committed at i==0 */
            status = vslsConvExecX1D(*task, xi, inc1x, yi, inc1y);
            if (status != VSL_STATUS_OK)
                break; /* report the first failure */
        }
        /* release the task only after the calculations, so that it
           survives between the initialization and calculation calls */
        vslConvDeleteTask(task);
    }
    return status;
}

Using Multiple Threads

For functions such as sconf1 described in the previous example, parallel calculations may be preferable to looping over the data sequences. If m>1, you can use multiple threads to execute the task against different data sequences. In such cases, use the task copy routines to create m copies of the task object before the calculation stage, and then run these copies in different threads. Make all necessary parameter adjustments to the task (using the Task Editors) before copying it.

The sample code in this case may look as follows:

if (init == 0) {
    int i, status, ss[M];
    VSLConvTaskPtr tasks[M];
    /* assume that M is big enough */
    . . .
    vslConvSetStart(*task, &iy0);
    . . .
    for (i=0; i<m; i++)
        /* implicit commitment at i==0 */
        vslConvCopyTask(&tasks[i],*task);
    . . .

Then, m threads may be started to execute different copies of the task:

. . .
        float* xi = &x[inc2x * i];
        float* yi = &y[inc2y * i];
        ss[i]=vslsConvExecX1D(tasks[i], xi,inc1x, yi,inc1y);
    . . .

Finally, after all threads have finished the calculations, collect the overall status from the per-thread status values stored in ss. The following code reports the first error found, if any:

    . . .
    status = 0; /* 0 means "OK" */
    for (i=0; i<m; i++) {
        if (ss[i] != 0) {
            status = ss[i]; /* first error found */
            break;
        }
    }
    return status;
} /* end if init==0 */

Execution routines modify the task internal state (fields of the task structure). Such modifications may conflict with each other if different threads work with the same task object simultaneously. That is why different threads must use different copies of the task.
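
The one-copy-per-thread rule can be illustrated without oneMKL. The following minimal sketch uses a hypothetical MockTask structure and the hypothetical functions mock_exec and run_batch, all invented for this illustration and not part of the oneMKL API. MockTask stands in for a task object whose execution writes to its own internal state, and MAX_SEQ plays the role of M. The loop is marked with an OpenMP pragma; if OpenMP is not enabled, the compiler ignores the pragma and the loop simply runs serially.

```c
#include <stddef.h>

#define MAX_SEQ 16   /* hypothetical upper bound on the number of sequences */

/* Hypothetical stand-in for a task object: execution mutates `scratch`,
   so sharing one instance between threads would be a data race. */
typedef struct {
    float gain;     /* parameter that is fixed before copying */
    float scratch;  /* internal state modified during execution */
} MockTask;

/* Hypothetical execution routine; returns 0 on success. */
static int mock_exec(MockTask *t, const float *x, float *y, int n)
{
    if (x == NULL || y == NULL)
        return -1;                       /* nonzero means error */
    for (int k = 0; k < n; k++) {
        t->scratch = t->gain * x[k];     /* writes to the task state */
        y[k] = t->scratch;
    }
    return 0;
}

/* Processes m sequences of n elements, each with a private task copy. */
int run_batch(const MockTask *proto, const float *x, int inc2x,
              float *y, int inc2y, int n, int m)
{
    MockTask tasks[MAX_SEQ];
    int ss[MAX_SEQ];
    if (m > MAX_SEQ)
        return -1;

    for (int i = 0; i < m; i++)
        tasks[i] = *proto;               /* copy after all adjustments */

    #pragma omp parallel for
    for (int i = 0; i < m; i++)          /* each iteration owns tasks[i] */
        ss[i] = mock_exec(&tasks[i], &x[inc2x * i], &y[inc2y * i], n);

    for (int i = 0; i < m; i++)
        if (ss[i] != 0)
            return ss[i];                /* first error found */
    return 0;
}
```

Because every iteration touches only its own tasks[i] and ss[i], no synchronization is needed inside the loop; the statuses are inspected only after the parallel region has ended.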
