Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference

ID 767253
Date 7/13/2023
Public

OpenMP* Memory Spaces and Allocators

OpenMP* provides memory, known as memory spaces, for the storage and retrieval of variables. Different memory spaces have different traits. How a variable is used and accessed determines which memory space is appropriate for its allocation.

Each memory space has a unique allocator that is used to allocate and deallocate memory in that space. The allocators allocate variables in contiguous space that does not overlap any other allocation in the memory space. Multiple memory spaces with different traits may map to a single memory resource.

The behavior of the allocator is affected by the allocator traits that you specify. The allocator traits, their possible values, and their default values are shown in the following table:

| Allocator Trait | Values That Can Be Specified | Default Value |
|---|---|---|
| access | all, cgroup, pteam, thread | all |
| alignment | A positive integer value that is a power of 2, specifying a number of bytes | 1 byte |
| fallback | abort_fb, allocator_fb, default_mem_fb, null_fb | default_mem_fb |
| fb_data | An allocator handle | None |
| partition | blocked, environment, interleaved, nearest | environment |
| pinned | true, false | false |
| pool_size | A positive integer value | Implementation defined |
| sync_hint | contended, uncontended, private, serialized | contended |

The access trait specifies the accessibility of the allocated memory. The following are values you can specify for access:

  • all

    This value indicates that the allocated memory must be accessible by all threads in the device where the memory allocation occurs.

    This is the default setting.

  • cgroup

    This value indicates that the allocated memory must be accessible by all threads of the same contention group as the thread that requested the allocation. Accessing the allocated memory from a thread that is not part of the same contention group results in undefined behavior.

  • pteam

    This value indicates that the allocated memory is accessible by all threads that bind to the same parallel region as the thread that requested the allocation. Access to the memory by a thread that does not bind to the same parallel region as the thread that allocated the memory results in undefined behavior.

  • thread

    This value indicates that the allocated memory is accessible only by the thread that allocated it. Attempts to access the memory from another thread result in undefined behavior.

The alignment trait specifies how allocated variables will be aligned. Variables will be byte-aligned to at least the value specified for this trait. The default setting is 1 byte. Alignment can also be affected by directives and OpenMP runtime allocator routines that specify alignment requirements.

The fallback trait indicates how an allocator behaves if it is unable to satisfy an allocation request. The following are values you can specify for fallback:

  • abort_fb

    This value indicates that the program terminates if the allocation request fails.

  • allocator_fb

    If this value is specified and the allocation request fails, the allocation will be tried by the allocator specified by the fb_data trait.

  • default_mem_fb

    This value indicates that a failed allocation request will be retried in the omp_default_mem_space memory space. The retry behaves as if it used an allocator whose traits all have their default values, except for the fallback trait, which is set to null_fb. This is the default setting.

  • null_fb

    This value indicates the allocator returns a zero value when an allocation request fails.

The fb_data trait lets you specify a fallback allocator to be used if the requested allocator fails to satisfy the allocation request. The fallback trait of the failing allocator must be set to allocator_fb in order for the allocator specified by the fb_data trait to be used.

The partition trait describes the partitioning of allocated memory over the storage resources represented by the memory space of the allocator. The following are values you can specify for partition:

  • blocked

    This value indicates the allocated memory is partitioned into blocks of memory of approximately equal size with one block per storage resource.

  • environment

    This value indicates the allocated memory placement is determined by the runtime execution environment. This is the default setting.

  • interleaved

    This value indicates the allocated memory is distributed in a round-robin fashion across the storage resources.

  • nearest

    This value indicates that the allocated memory will be placed in the storage resource nearest to the thread that requested the allocation.

If the pinned trait has the value true, the allocator ensures each allocation made by the allocator will remain in the storage resource at the same location where it was allocated until it is deallocated. The default setting is false.

The value of pool_size is the total number of bytes of storage available to an allocator when there have been no allocations. The following affect pool_size:

  • If the access trait has the value all, the value of pool_size is the limit for all allocations for all threads having access to the allocator.

  • If the access trait of the allocator has the value cgroup, the value of pool_size is the limit for allocations made from the threads within the same contention group.

  • If the access trait has the value pteam, the value of pool_size is the limit for allocations made within the same parallel team.

  • If the access trait has the value thread, the value of pool_size is the limit for allocations made from each thread using the allocator.

  • An allocation request for more space than the value of pool_size results in the allocator not fulfilling the allocation request.

The sync_hint trait describes the way that multiple threads can access an allocator. The following are values you can specify for sync_hint:

  • contended or uncontended

    The value contended indicates that many threads are anticipated to make simultaneous allocation requests, while the value uncontended indicates that few threads are anticipated to make simultaneous allocation requests. The default setting is contended.

  • private

    This value indicates that all allocation requests will come from the same thread. Specifying private when this is not the case, so that two or more threads make allocation requests through the same allocator, results in undefined behavior.

  • serialized

    This value indicates that only one thread will request an allocation at a given time. The behavior is undefined if two threads simultaneously request an allocation through an allocator whose sync_hint value is serialized.

There are five predefined memory spaces in OpenMP:

  • The system default memory is referred to as omp_default_mem_space.

  • Large capacity memory is referred to as omp_large_cap_mem_space.

  • High bandwidth memory is referred to as omp_high_bw_mem_space.

  • Low latency memory is referred to as omp_low_lat_mem_space.

  • Memory designed for optimal storage of constant values is referred to as omp_const_mem_space.

    It can be initialized with compile-time constant expressions or by using a firstprivate clause.

    Writing to variables in omp_const_mem_space results in undefined behavior.

There are three additional predefined memory spaces that are extensions to the OpenMP standard:

  • omp_target_host_mem_space is host memory that is accessible by the device.

  • omp_target_shared_mem_space is memory that can migrate between the host and the device.

  • omp_target_device_mem_space is memory that is accessible to the device.

The following table shows the predefined memory allocators, the memory space they are associated with, and the non-default memory trait values they possess.

| Allocator Name | Associated Memory Space | Non-Default Trait Values |
|---|---|---|
| omp_default_mem_alloc | omp_default_mem_space | fallback=null_fb |
| omp_large_cap_mem_alloc | omp_large_cap_mem_space | none |
| omp_low_lat_mem_alloc | omp_low_lat_mem_space | none |
| omp_high_bw_mem_alloc | omp_high_bw_mem_space | none |
| omp_const_mem_alloc | omp_const_mem_space | none |
| omp_cgroup_mem_alloc | implementation/system defined | access=cgroup |
| omp_pteam_mem_alloc | implementation/system defined | access=pteam |
| omp_thread_mem_alloc | implementation/system defined | access=thread |
| omp_target_host_mem_alloc | omp_target_host_mem_space | none |
| omp_target_shared_mem_alloc | omp_target_shared_mem_space | none |
| omp_target_device_mem_alloc | omp_target_device_mem_space | none |