OpenMP* Pragmas

Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249

Date 12/16/2022

This is a summary of the OpenMP* pragmas supported in the Intel® C++ Compiler Classic. For detailed information about the OpenMP API, see the OpenMP Application Program Interface Version 5.1 specification, which is available from the OpenMP web site.

PARALLEL Pragma

Use this pragma to form a team of threads and execute those threads in parallel.

Pragma	Description
omp parallel	Specifies that a structured block should be run in parallel by a team of threads.
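
As a minimal sketch of the parallel construct, the hypothetical function below (team_size is an illustrative name, not part of any API) counts how many threads execute the region. It assumes OpenMP support is enabled at build time, typically with -qopenmp on Linux* or /Qopenmp on Windows*; without it the pragma is ignored and the function returns 1.

```cpp
// Counts how many threads execute the parallel region.
// Built without OpenMP support, the pragma is ignored and the result is 1.
int team_size() {
    int count = 0;
    // Each thread receives a private copy of count, initialized to 0 for "+".
    // The private copies are summed into the original when the region ends.
    #pragma omp parallel reduction(+ : count)
    {
        count += 1;  // executed once by every thread in the team
    }
    return count;
}
```

The returned value depends on the runtime's thread settings, so callers should only rely on it being at least 1.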

TASKING Pragmas

Use these pragmas for deferring execution.

Pragma	Description
omp task	Specifies a code block whose execution may be deferred.
omp taskloop	Specifies that the iterations of one or more associated for loops should be executed using OpenMP tasks. The iterations are distributed across tasks that are created by the construct and scheduled to be executed in parallel by the current team.
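
The deferred execution described above can be sketched with a recursive Fibonacci computation. The function names fib_task and fib_parallel are hypothetical, and the example is meant only to show the shape of task creation, not an efficient way to compute the sequence.

```cpp
// Recursive Fibonacci. Each recursive call is packaged as a task that the
// runtime may execute immediately or defer to another thread in the team.
long fib_task(int n) {
    if (n < 2) return n;
    long a = 0, b = 0;
    // a and b are local to this task, so they must be marked shared for the
    // child tasks to write their results back.
    #pragma omp task shared(a)
    a = fib_task(n - 1);
    #pragma omp task shared(b)
    b = fib_task(n - 2);
    // Wait for both child tasks before reading a and b.
    #pragma omp taskwait
    return a + b;
}

long fib_parallel(int n) {
    long result = 0;
    #pragma omp parallel
    {
        // One thread creates the root of the task tree; the other threads
        // in the team pick up the generated tasks.
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

Real code would usually stop creating tasks below some problem size, because very small tasks cost more to schedule than to run.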

WORKSHARING Pragmas

Use these pragmas to share work among a team of threads.

Pragma	Description
omp for	Specifies a work-sharing loop. Iterations of the loop are executed in parallel by the threads in the team.
omp sections	Defines a set of structured blocks that will be distributed among the threads in the team.
omp single	Specifies that a block of code is to be executed by only one thread in the team.
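
The following sketch shows two of the work-sharing pragmas inside a parallel region. The names squares, Summary, and summarize are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// omp for: loop iterations are divided among the threads of the team.
std::vector<int> squares(int n) {
    std::vector<int> out(static_cast<std::size_t>(n > 0 ? n : 0));
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            out[i] = i * i;  // each iteration writes a distinct element
        }
    }
    return out;
}

struct Summary {
    long sum;
    int max;
};

// omp sections: each section is an independent block handed to one thread.
Summary summarize(const std::vector<int>& v) {
    Summary s{0, 0};
    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            s.sum = std::accumulate(v.begin(), v.end(), 0L);

            #pragma omp section
            s.max = v.empty() ? 0 : *std::max_element(v.begin(), v.end());
        }
    }
    return s;
}
```

The two sections write to different members of s, so no additional synchronization is needed between them.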

SYNCHRONIZATION Pragmas

Use these pragmas to synchronize between threads.

Pragma	Description
omp atomic	Specifies a computation that must be executed atomically.
omp barrier	Specifies a point in the code where each thread must wait until all threads in the team arrive.
omp critical	Specifies a code block that is restricted to access by only one thread at a time.
omp flush	Identifies a point at which a thread's temporary view of memory becomes consistent with the memory.
omp master	Specifies a code block that must be executed only once by the primary thread of the team.
omp ordered	Specifies a block of code that the threads in a team must execute in the natural order of the loop iterations or, as a stand-alone directive, specifies cross-iteration dependences in a doacross loop nest. The following clauses are available as Intel-specific extensions of the OpenMP* specification:
	monotonic: Specifies a block of code in which the value of the new list item on each iteration of the associated SIMD loop(s) corresponds to the value of the original list item before entering the associated loop, plus the number of the iterations for which the conditional update happens prior to the current iteration, times linear-step. The value corresponding to the sequentially last iteration of the associated loop(s) is assigned to the original list item. Use with the simd clause.
	overlap: Specifies a block of code that must be executed in scalar mode for overlapping inx values and in parallel for different inx values within the SIMD loop. Use with the simd clause.
omp taskgroup	Causes the program to wait until the completion of all enclosed and descendant tasks.
omp taskwait	Specifies a wait on the completion of child tasks generated since the beginning of the current task.
omp taskyield	Specifies that the current task can be suspended at this point in favor of execution of a different task.
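
As a small illustration of the atomic and critical pragmas, the sketch below protects two kinds of shared updates. The function names count_even and max_value are hypothetical. A reduction clause would normally be the better tool for both computations; the explicit synchronization is used here only to show the pragmas.

```cpp
#include <limits>
#include <vector>

// omp atomic: protects a single scalar update.
int count_even(const std::vector<int>& v) {
    int count = 0;
    const int n = static_cast<int>(v.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        if (v[i] % 2 == 0) {
            #pragma omp atomic
            ++count;
        }
    }
    return count;
}

// omp critical: protects a block that only one thread may enter at a time.
// Returns the smallest int value when v is empty.
int max_value(const std::vector<int>& v) {
    int best = std::numeric_limits<int>::min();
    const int n = static_cast<int>(v.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        #pragma omp critical
        {
            if (v[i] > best) best = v[i];
        }
    }
    return best;
}
```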

Data Environment Pragmas

Use these pragmas to affect the data environment.

Pragma	Description
omp scan	Specifies a scan computation that updates each list item in each iteration of the loop.
omp threadprivate	Specifies a list of globally-visible variables that will be allocated private to each thread.
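
The sketch below illustrates omp threadprivate with a file-scope accumulator. The names scratch and sum_to are hypothetical. Each thread works on its own copy of scratch, and the per-thread results are combined with a reduction.

```cpp
// File-scope variable; with threadprivate, every thread gets its own copy.
static int scratch = 0;
#pragma omp threadprivate(scratch)

// Returns 1 + 2 + ... + n, accumulated per thread and then combined.
int sum_to(int n) {
    int total = 0;
    #pragma omp parallel reduction(+ : total)
    {
        scratch = 0;  // each thread resets its own copy
        #pragma omp for
        for (int i = 1; i <= n; ++i) {
            scratch += i;  // no race: scratch is private to this thread
        }
        // The loop above ends with an implicit barrier, after which each
        // thread contributes its partial sum to the reduction.
        total += scratch;
    }
    return total;
}
```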

Offload Target Control Pragmas

Use these pragmas to control execution on one or more offload targets.

Pragma	Description
omp distribute	Specifies that the iterations of one or more loops should be distributed among the initial threads of all thread teams in a league.
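omp target enter data	Specifies that variables are mapped to a device data environment.
omp target exit data	Specifies that variables are unmapped from a device data environment.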
omp teams	Creates a league of thread teams inside a target region to execute a structured block in the initial thread of each team.
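
A minimal sketch of how the data-mapping pragmas can bracket an offloaded loop follows. The function name scale_on_device is hypothetical, and the example assumes that execution falls back to the host when no offload device is available, so the result is the same either way.

```cpp
// Multiplies each element of data[0..n) by factor, keeping the array mapped
// on the device between the enter and exit directives.
void scale_on_device(double* data, int n, double factor) {
    // Copy the array to the device data environment.
    #pragma omp target enter data map(to : data[0 : n])

    // The loop is distributed across teams and the threads of each team.
    // The array is already present on the device, so it is not copied again.
    #pragma omp target teams distribute parallel for
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }

    // Copy the results back to the host and release the device copy.
    #pragma omp target exit data map(from : data[0 : n])
}
```

Separating the mapping from the computation is mainly useful when several offloaded regions reuse the same data, because the array then crosses the host-device boundary only once in each direction.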

Vectorization Pragmas

Use these pragmas to control execution on vector hardware.

Pragma	Description
omp simd	Transforms the loop into a loop that will be executed concurrently using SIMD instructions. The following clauses are available as Intel-specific extensions of the OpenMP* specification:
	assert: Specifies that the compiler generates an error message if the loop is not vectorized for any reason.
	early_exit: Allows vectorization of multiple-exit loops. When this clause is specified:
	- Each operation before the last lexical early exit of the loop may be executed as if the early exit were not triggered within the SIMD chunk.
	- After the last lexical early exit of the loop, all operations are executed as if the last iteration of the loop was found.
	- Each list item specified in the linear clause is computed based on the last iteration number upon exiting the loop.
	- The last values for linear clauses and conditional lastprivate clauses are preserved with respect to scalar execution.
	- The last value for reduction clauses is computed as if the last iteration in the last SIMD chunk was executed upon exiting the loop.
	- The shared memory state may not be preserved with regard to scalar execution.
	- Exceptions are not allowed.
omp declare simd	Creates a version of a function that can process multiple arguments using Single Instruction Multiple Data (SIMD) instructions from a single invocation from a SIMD loop.
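
The two vectorization pragmas are often used together, as in the sketch below. The names squared_diff and sum_squared_diff are hypothetical. The Intel-specific assert and early_exit clauses are deliberately left out so that the example stays within the standard OpenMP syntax.

```cpp
// omp declare simd: asks the compiler to also generate a vector version of
// this function that can be called from a SIMD loop.
#pragma omp declare simd
inline float squared_diff(float a, float b) {
    const float d = a - b;
    return d * d;
}

// omp simd: the loop is executed using SIMD instructions, with sum handled
// as a reduction across the vector lanes.
float sum_squared_diff(const float* a, const float* b, int n) {
    float sum = 0.0f;
    #pragma omp simd reduction(+ : sum)
    for (int i = 0; i < n; ++i) {
        sum += squared_diff(a[i], b[i]);
    }
    return sum;
}
```

Because a SIMD reduction may add the terms in a different order than scalar code, floating-point results can differ slightly from a sequential loop for general inputs.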

Cancellation Constructs

Pragma	Description
omp cancel	Requests cancellation of the innermost enclosing region of the type specified, and causes the encountering task to proceed to the end of the cancelled construct.
omp cancellation point	Defines a point at which implicit or explicit tasks check to see if cancellation has been requested for the innermost enclosing region of the type specified. This construct does not implement a synchronization between threads or tasks.
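
The sketch below shows one way the two cancellation pragmas can cooperate in a search loop. The function name contains is hypothetical. Cancellation generally takes effect only when it has been enabled in the runtime (for example through the OMP_CANCELLATION environment variable); the function returns the same answer whether or not the request is honored.

```cpp
#include <vector>

// Returns true if target occurs in v. Once a match is found, the remaining
// loop iterations are asked to stop early.
bool contains(const std::vector<int>& v, int target) {
    bool found = false;
    const int n = static_cast<int>(v.size());
    #pragma omp parallel shared(found)
    {
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            if (v[i] == target) {
                #pragma omp atomic write
                found = true;
                // Request cancellation of the innermost enclosing for region.
                #pragma omp cancel for
            }
            // Other threads check here whether cancellation was requested.
            #pragma omp cancellation point for
        }
    }
    return found;
}
```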

User-Defined Reduction Pragma

Use this pragma to define reduction identifiers that can be used as reduction operators in a reduction clause.

Pragma	Description
omp declare reduction	Declares User-Defined Reduction (UDR) functions (reduction identifiers) that can be used as reduction operators in a reduction clause.
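
As an illustration of a user-defined reduction, the sketch below computes the minimum and maximum of a vector in one pass. All names (Range, empty_range, widen, range_widen, value_range) are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

struct Range {
    int lo;
    int hi;
};

// Identity element: widening any range with it leaves that range unchanged.
inline Range empty_range() { return Range{INT_MAX, INT_MIN}; }

// Combiner: the smallest range that covers both inputs.
inline Range widen(Range a, Range b) {
    return Range{std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
}

// Declares the reduction identifier range_widen for type Range.
// omp_out and omp_in are the partial results being combined; omp_priv is
// the private copy each thread starts from.
#pragma omp declare reduction(range_widen : Range : omp_out = widen(omp_out, omp_in)) \
    initializer(omp_priv = empty_range())

// Returns the [lo, hi] range of the values in v, or empty_range() if v is empty.
Range value_range(const std::vector<int>& v) {
    Range r = empty_range();
    const int n = static_cast<int>(v.size());
    #pragma omp parallel for reduction(range_widen : r)
    for (int i = 0; i < n; ++i) {
        r = widen(r, Range{v[i], v[i]});
    }
    return r;
}
```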

Combined and Composite Pragmas

Use these pragmas as shortcuts for multiple pragmas in sequence. A combined construct is a shortcut for specifying one construct immediately nested inside another construct. The semantics of a combined construct are identical to those of explicitly specifying the first construct containing one instance of the second construct and no other statements.

A composite construct is composed of two constructs but does not have identical semantics to specifying one of the constructs immediately nested inside the other. A composite construct either adds semantics not included in the constructs from which it is composed or the nesting of the one construct inside the other is not conforming.

Pragma	Description
omp distribute parallel for¹	Specifies a loop that can be executed in parallel by multiple threads that are members of multiple teams.
omp distribute parallel for simd¹	Specifies a loop that will be executed in parallel by multiple threads that are members of multiple teams. It will be executed concurrently using SIMD instructions.
omp distribute simd¹	Specifies a loop that will be distributed across the primary threads of the teams region. It will be executed concurrently using SIMD instructions.
omp for simd¹	Specifies that the iterations of the loop will be distributed across threads in the team. Iterations executed by each thread can also be executed concurrently using SIMD instructions.
omp parallel for	Provides an abbreviated way to specify a parallel region containing only a for construct.
omp parallel for simd	Specifies a parallel construct that contains one for simd construct and no other statement.
omp parallel sections	Specifies a parallel construct that contains only a sections construct.
	Creates a device data environment and executes the parallel region on that device.
	Provides an abbreviated way to specify a target construct that contains an omp target parallel for construct and no other statement between them.
	Specifies a target construct that contains an omp target parallel for simd construct and no other statement between them.
	Specifies a target construct that contains an omp simd construct and no other statement between them.
omp target teams	Creates a device data environment and executes the construct on the same device. It also creates a league of thread teams with the primary thread in each team executing the structured block.
omp target teams distribute	Creates a device data environment and then executes the construct on that device. It also specifies that loop iterations will be distributed among the primary threads of all thread teams in a league created by a teams construct.
omp target teams distribute parallel for	Creates a device data environment and then executes the construct on that device. It also specifies a loop that can be executed in parallel by multiple threads that are members of multiple teams created by a teams construct.
omp target teams distribute parallel for simd	Creates a device data environment and then executes the construct on that device. It also specifies a loop that can be executed in parallel by multiple threads that are members of multiple teams created by a teams construct. The loop is distributed across the teams and executed concurrently using SIMD instructions.
omp target teams distribute simd	Creates a device data environment and then executes the construct on that device. It also specifies that loop iterations will be distributed among the primary threads of all thread teams in a league created by a teams construct. It will be executed concurrently using SIMD instructions.
omp taskloop simd¹	Specifies a loop that can be executed concurrently using SIMD instructions and whose iterations will also be executed in parallel using OpenMP* tasks.
omp teams distribute	Creates a league of thread teams and specifies that loop iterations will be distributed among the primary threads of all thread teams in the league.
omp teams distribute parallel for	Creates a league of thread teams and specifies that the associated loop can be executed in parallel by multiple threads that are members of multiple teams.
omp teams distribute parallel for simd	Creates a league of thread teams and specifies that the associated loop can be executed concurrently using SIMD instructions in parallel by multiple threads that are members of multiple teams.
omp teams distribute simd	Creates a league of thread teams and specifies that the associated loop will be distributed across the primary threads of the teams and executed concurrently using SIMD instructions.

Footnotes:

¹ This directive specifies a composite construct.
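
The equivalence between a combined construct and its explicitly nested form can be sketched with omp parallel for. The function names sum_combined and sum_nested are hypothetical; both are expected to return the same value.

```cpp
#include <vector>

// Combined form: one pragma creates the team and shares the loop.
long sum_combined(const std::vector<int>& v) {
    long total = 0;
    const int n = static_cast<int>(v.size());
    #pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}

// Nested form: a parallel construct containing exactly one for construct
// and no other statements.
long sum_nested(const std::vector<int>& v) {
    long total = 0;
    const int n = static_cast<int>(v.size());
    #pragma omp parallel
    {
        #pragma omp for reduction(+ : total)
        for (int i = 0; i < n; ++i) {
            total += v[i];
        }
    }
    return total;
}
```

Composite constructs such as omp for simd have no equivalent nested spelling of this kind, which is what the footnote marker in the table indicates.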
