User Defined Induction in OpenMP* with Intel® C++ Compiler

ID 672941
Updated 10/4/2018
Version Latest
Public

Contents

  1. Induction Overview
  2. User Defined Induction (UDI)
  3. Parallelization Example
  4. Vectorization Example

Induction Overview

Intel® C++ Compiler 19.0 Update 1 supports general induction, a feature proposed for OpenMP* 5.0 that extends the existing linear clause. The linear clause lets an OpenMP program declare variables that vary linearly with the loop index, but it has significant limitations: the variables must be of integral or pointer type, the step must be of integral type, and the induction operation is limited to addition. The proposed induction clause provides a mechanism to express general induction that allows more data types and induction operations, including user-defined types and operations.

The induction clause has the following syntax and can be used with the OpenMP loop, distribute, and simd constructs:

induction(induction-id:list:step)

induction-id: a built-in operator (+, -, *, /) or a user-defined identifier

list: the induction variables, which can be of integral, floating-point, or non-POD types

step: the step expression, which must be supported by the induction operator

Here are some examples of the induction clause:

  •  induction( + : x, y : 1 ) is equivalent to linear( x, y ) if x and y are integral
  •  induction( * : x : s ) describes the nonlinear induction x_i = x_{i-1} * s
  •  induction( foo : x : s ) uses a user-defined induction operator “foo”; x and s can be of different non-POD types

Note: The induction clause can also be used in the short form induction(list[:step]), similar to the existing linear clause. The omitted induction-id defaults to the language's built-in + operator. Variables in list must be of a type supported by the built-in +, and if step is omitted, it is assumed to be 1. This makes it easy to port existing code that uses the linear clause.

The following example uses induction to evaluate a polynomial, i.e., to compute Σ_{i=0}^{N} c_i x^i:
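To make the two built-in cases above concrete, the sketch below compares stepping an induction variable one iteration at a time with computing its value directly from the iteration number. Being able to compute that value directly is what allows any iteration or SIMD lane to start independently. This is plain C++ with no OpenMP pragmas, and the function names are hypothetical, chosen only for illustration.

```cpp
#include <cstdint>

// Value of x at the start of iteration i for induction(+ : x : step):
// x_i = x_0 + i * step, the same closed form the linear clause implies.
std::int64_t additive_at(std::int64_t x0, std::int64_t step, int i) {
    return x0 + static_cast<std::int64_t>(i) * step;
}

// Value of x at the start of iteration i for induction(* : x : s):
// x_i = x_0 * s^i, computed here with exponentiation by squaring.
std::int64_t multiplicative_at(std::int64_t x0, std::int64_t s, int i) {
    std::int64_t result = x0;
    std::int64_t base = s;
    for (int e = i; e > 0; e >>= 1) {
        if (e & 1) result *= base;
        base *= base;
    }
    return result;
}

// Sequential reference: apply the loop body's update n times.
std::int64_t step_sequentially(std::int64_t x0, std::int64_t s, int n, bool multiply) {
    std::int64_t x = x0;
    for (int i = 0; i < n; ++i) x = multiply ? x * s : x + s;
    return x;
}
```

For small integer inputs the closed forms and the sequential loop agree exactly; for example, additive_at(7, 5, 4) and four sequential additions of 5 starting from 7 both give 27.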

#define N 10

int main()
{
    float c[N + 1] = {0};  // coefficients c_0 .. c_N (fill in real values here)
    float x = 1.23F;       // value of x at which to evaluate the polynomial
    float xi = 1.0F;       // x^i; initial value x^0 == 1
    float value = 0.0F;    // accumulator for the result

#pragma omp simd reduction(+:value) induction(* : xi : x)
    for (int i = 0; i <= N; i++) {
        value += c[i] * xi;
        xi *= x;
    }
    return 0;
}
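One way to picture what induction(* : xi : x) permits is to split the iteration space into chunks by hand and let each chunk derive its own starting value x^lb from its lower bound lb. The sketch below does this in portable C++ without any OpenMP pragmas. The names eval_sequential and eval_chunked and the chunk parameter are assumptions made for illustration, not part of the proposal or of any compiler's implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Sequential reference: the loop from the example, without pragmas.
double eval_sequential(const std::vector<double>& c, double x) {
    double value = 0.0;
    double xi = 1.0;  // xi holds x^i, starting at x^0
    for (std::size_t i = 0; i < c.size(); ++i) {
        value += c[i] * xi;
        xi *= x;
    }
    return value;
}

// Chunked evaluation: every chunk recomputes its starting xi = x^lb,
// so no chunk depends on the one before it. chunk must be greater than 0.
double eval_chunked(const std::vector<double>& c, double x, std::size_t chunk) {
    double value = 0.0;
    for (std::size_t lb = 0; lb < c.size(); lb += chunk) {
        const std::size_t ub = std::min(lb + chunk, c.size());
        double xi = std::pow(x, static_cast<double>(lb));  // closed form at chunk start
        double partial = 0.0;
        for (std::size_t i = lb; i < ub; ++i) {
            partial += c[i] * xi;
            xi *= x;
        }
        value += partial;  // plays the role of reduction(+:value)
    }
    return value;
}
```

Because floating-point arithmetic is not associative, the chunked and sequential results can differ in the last bits for general inputs, so comparisons between them are best made with a tolerance.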

User Defined Induction (UDI)

To express induction beyond the built-in operators and data types, a declare induction directive is proposed that is syntactically similar to the declare reduction directive:

#pragma omp declare induction ( induction-id : induction-type : step-type : inductor ) [collector( collector )]

  • induction-id : identifier for the operation, to be used in an induction clause
  • induction-type : type specifier for the induction variables
  • step-type : type specifier for the step expression
  • inductor: specifies the inductive operation: x = x + s
    • Uses keywords omp_out to represent x and omp_step for s
    • C++ Example: omp_out = omp_out + omp_step, where + is overloaded
    • C Example: add(&omp_out, omp_step)
  • collector: specifies the closed form x_i = x_0 + ( s * i )
    • Uses keywords omp_step to represent s and omp_index for i
    • C++ Example: omp_step = omp_step * omp_index, where * may be overloaded
    • C Example: cs(&omp_step, omp_index)

Parallelization Example

The C parallelization example below uses UDI to express an induction on the variable a, which has the struct type A. The inductor is the function add. Because a collector is provided, the initial value of a for each thread is the closed form computed by calling add(&a, 5*lb), where lb is the lower bound of that thread's index range.

typedef struct{ float x; int y; } A;

void add(A *a, int st) { a->x += st; a->y += st; }

#pragma omp declare induction( op : A : int : add(&omp_out, omp_step)) \
collector( omp_step = omp_index * omp_step )

A a = {12.3, 456};

#pragma omp parallel for induction( op : a : 5 )

for(int i=0; i<N; i++) { work(a); add(&a, 5); }
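The role of the collector can be illustrated without OpenMP by writing the inductor and collector as ordinary functions. In the sketch below, collect, value_at, and value_after are hypothetical helper names introduced only for this illustration. value_at mirrors the call add(&a, 5*lb) described above, and value_after is the sequential reference it should match.

```cpp
struct A { float x; int y; };

// Inductor from the example: advance a by one step.
void add(A* a, int st) { a->x += st; a->y += st; }

// Collector from the example, written as a function:
// omp_step = omp_index * omp_step.
int collect(int step, int index) { return index * step; }

// Hypothetical helper: value of the induction variable at the start of
// iteration lb, using one collector call followed by one inductor call.
A value_at(A a0, int step, int lb) {
    add(&a0, collect(step, lb));
    return a0;
}

// Sequential reference: apply the inductor lb times.
A value_after(A a0, int step, int lb) {
    for (int i = 0; i < lb; ++i) add(&a0, step);
    return a0;
}
```

For an initial value such as {12.5f, 456}, whose float member is exactly representable, value_at(a0, 5, 3) and value_after(a0, 5, 3) both yield {27.5f, 471}. With a value like 12.3, the float member may differ slightly between the two paths because of rounding.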

Vectorization Example

The C++ vectorization example below uses UDI to express an induction involving a non-POD variable and step. The + operator in the inductor and the * operator in the collector are overloaded to support classes A and S.

class A; // class of the induction variable

class S; // class of the step expression

#pragma omp declare induction ( op2 : A : S : omp_out = omp_out + omp_step ) \
collector ( omp_step = omp_index * omp_step )

...

A a; S s;  // initialized by constructors

...

#pragma omp simd induction( op2 : a : s )

for(int i=0; i<N; i++) { work(a); a=a+s; }
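
The example leaves classes A and S undefined. A minimal sketch of what they might look like is shown below, assuming simple two-member classes with user-provided constructors (so they are not POD types). The member names and the lane_value helper are hypothetical. The overloaded + plays the role of the inductor and the overloaded * the role of the collector.

```cpp
// Hypothetical step class.
struct S {
    int dx, dy;
    S(int dx_, int dy_) : dx(dx_), dy(dy_) {}
};

// Hypothetical induction-variable class.
struct A {
    int x, y;
    A(int x_, int y_) : x(x_), y(y_) {}
};

// Inductor: omp_out = omp_out + omp_step
A operator+(const A& a, const S& s) { return A(a.x + s.dx, a.y + s.dy); }

// Collector: omp_step = omp_index * omp_step
S operator*(int index, const S& s) { return S(index * s.dx, index * s.dy); }

// Value of the induction variable for iteration (or SIMD lane) i,
// computed directly from the initial value instead of by i updates.
A lane_value(const A& a0, const S& s, int i) { return a0 + i * s; }
```

With a0 = A(1, 2) and s = S(3, 4), lane_value(a0, s, 5) gives the same result as applying a = a + s five times, namely x == 16 and y == 22.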

This proposed feature is currently supported only in Intel® C++ Compiler 19.0 Update 1 and is expected to be standardized in OpenMP 5.0.