Intel® MPI Library Developer Reference for Windows* OS

ID 768734
Date 6/24/2024
Public

Main Thread Pinning

Use this feature to pin a particular MPI thread to a corresponding set of CPUs within a node and avoid undesired thread migration. This feature is available on operating systems that provide the necessary kernel interfaces.

Processor Identification

The following schemes are used to identify logical processors in a system:

  • System-defined logical enumeration
  • Topological enumeration based on three-level hierarchical identification through triplets (package/socket, core, thread)

The number of a logical CPU is defined as the position of that CPU's bit in the kernel affinity bit-mask. Use the cpuinfo utility, provided with your Intel MPI Library installation, to find out the logical CPU numbers.
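
The relationship between an affinity bit-mask and logical CPU numbers can be illustrated with a short sketch. The helper names below (cpus_from_mask, mask_from_cpus) are hypothetical and are not part of the Intel MPI Library; the code only assumes that bit N of the mask stands for logical CPU N, as stated above.

```python
def cpus_from_mask(mask):
    """Return the logical CPU numbers whose bits are set in an affinity mask."""
    if mask < 0:
        raise ValueError("affinity mask must be non-negative")
    # Bit position N set in the mask means logical CPU N is included.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def mask_from_cpus(cpus):
    """Build an affinity mask from an iterable of logical CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("logical CPU numbers must be non-negative")
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    print(cpus_from_mask(0b00100101))   # bits 0, 2 and 5 are set
    print(hex(mask_from_cpus([0, 4])))  # mask covering logical CPUs 0 and 4
```

For example, a mask with bits 0 and 4 set covers logical CPUs 0 and 4.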

The three-level hierarchical identification uses triplets that describe the location of each logical processor and its position in the ordering. The elements of a triplet are ordered hierarchically: package, core, and thread.

The following example shows one possible processor numbering for a system with two sockets, four cores (two cores per socket), and eight logical processors (two logical processors per core).

NOTE:
Logical and topological enumerations are not the same.

Logical Enumeration       0  4  1  5  2  6  3  7
Hierarchical Levels
  Socket                  0  0  0  0  1  1  1  1
  Core                    0  0  1  1  0  0  1  1
  Thread                  0  1  0  1  0  1  0  1
Topological Enumeration   0  1  2  3  4  5  6  7

Use the cpuinfo utility to identify the correspondence between the logical and topological enumerations. See Processor Information Utility for more details.
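
To make the correspondence concrete, the sketch below encodes the example numbering shown above (two sockets, two cores per socket, two threads per core) and converts between triplets, topological numbers, and logical numbers. The function names are hypothetical, and the LOGICAL list is copied from the example only; on a real system, take the values from the cpuinfo output instead.

```python
# Example layout from the table above: 2 sockets x 2 cores x 2 threads.
SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE = 2, 2, 2

# LOGICAL[t] is the logical number of the processor with topological number t.
# These values are copied from the example table, not queried from a system.
LOGICAL = [0, 4, 1, 5, 2, 6, 3, 7]


def topological_index(socket, core, thread):
    """Topological number of a (socket, core, thread) triplet."""
    return (socket * CORES_PER_SOCKET + core) * THREADS_PER_CORE + thread


def logical_from_triplet(socket, core, thread):
    """Logical number of the processor identified by a triplet."""
    return LOGICAL[topological_index(socket, core, thread)]


def triplet_from_logical(logical):
    """(socket, core, thread) triplet of a processor given its logical number."""
    topo = LOGICAL.index(logical)
    socket_core, thread = divmod(topo, THREADS_PER_CORE)
    socket, core = divmod(socket_core, CORES_PER_SOCKET)
    return socket, core, thread


if __name__ == "__main__":
    for topo in range(len(LOGICAL)):
        triplet = triplet_from_logical(LOGICAL[topo])
        print(topo, triplet, LOGICAL[topo])
```

In this example layout, the two threads of socket 0, core 0 have logical numbers 0 and 4, which shows why the two enumerations must not be used interchangeably.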

Default Settings

If you do not specify values for any main thread pinning environment variables, the default settings below are used. For details about these settings, see Environment Variables and Interoperability with OpenMP API.

  • I_MPI_PIN=on
  • I_MPI_PIN_RESPECT_CPUSET=on
  • I_MPI_PIN_RESPECT_HCA=on
  • I_MPI_PIN_CELL=unit
  • I_MPI_PIN_DOMAIN=auto:compact
  • I_MPI_PIN_ORDER=bunch

NOTE:
If hyperthreading is on, the number of processes on the node is greater than the number of cores, and no process pinning environment variable is set, the "spread" order is used automatically instead of the default "compact" order for better performance.
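
The default values and the rule in the NOTE above can be restated as a small sketch. This is an illustration only: pinning_env and expected_order are hypothetical helpers, not library functions, and expected_order simply repeats the condition in the NOTE rather than reproducing the library's internal logic.

```python
import os

# Default values as listed in this section. The library applies them itself
# when no main thread pinning variable is set.
DOCUMENTED_DEFAULTS = {
    "I_MPI_PIN": "on",
    "I_MPI_PIN_RESPECT_CPUSET": "on",
    "I_MPI_PIN_RESPECT_HCA": "on",
    "I_MPI_PIN_CELL": "unit",
    "I_MPI_PIN_DOMAIN": "auto:compact",
    "I_MPI_PIN_ORDER": "bunch",
}


def pinning_env(overrides=None, base=None):
    """Return an environment dict with the pinning variables made explicit.

    Values already present in base (default: the current environment) and
    values in overrides take precedence over the documented defaults.
    """
    env = dict(DOCUMENTED_DEFAULTS)
    env.update(os.environ if base is None else base)
    env.update(overrides or {})
    return env


def expected_order(hyperthreading, processes, cores, pinning_var_set):
    """Restate the NOTE: 'spread' replaces 'compact' only if all three hold."""
    if hyperthreading and processes > cores and not pinning_var_set:
        return "spread"
    return "compact"


if __name__ == "__main__":
    env = pinning_env(base={})
    for name in sorted(DOCUMENTED_DEFAULTS):
        print(f"{name}={env[name]}")
    print(expected_order(True, 8, 4, False))
```

The dictionary returned by pinning_env can be passed as the env argument of subprocess.run when starting a launcher command. Going by the wording of the NOTE, setting any pinning variable explicitly, even to its default value, may prevent the automatic switch to the "spread" order, so leave the variables unset if you rely on that behavior.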