Application Performance Snapshot User Guide for Linux* OS

ID 772048
Date 10/31/2024
Public

Controlling Amount of Collected Data

Application Performance Snapshot (APS) provides several methods to control the amount of collected data. This enables you to reduce profiling overhead and focus on relevant application sections.

Collection Control API

By default, APS collects statistics for the whole application run. In some cases, it is important to enable or disable the collection for a specific application phase. For example, you may want to focus on the most time-consuming section or disable collection for the initialization or finalization phases. APS provides APIs to control data collection from source code.

For MPI applications, use the MPI_Pcontrol() API. Call MPI_Pcontrol(0) to pause data collection and MPI_Pcontrol(1) to resume it. For more information, refer to Region Control with MPI_Pcontrol.

For non-MPI applications, the Instrumentation and Tracing Technology API (ITT API) is also available. Before using the ITT API, you need to configure your system. For instructions, refer to the Configure your Build System page of the VTune Profiler User Guide. After the system is configured, call __itt_pause() and __itt_resume() to pause and resume data collection, respectively.

By default, profiling is enabled when the application is launched. To launch the application without profiling, use the -start-paused option. Profiling will begin automatically with the first call of MPI_Pcontrol(1) or __itt_resume(). This can be useful to skip the initialization phase.
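As a minimal sketch of a paused launch, the commands below assume a hypothetical MPI binary ./myapp that calls MPI_Pcontrol(1) after its initialization phase. They also assume that -start-paused is passed to aps ahead of the application, in the same position as the other aps options shown in this guide. The rank count of 2 is illustrative.

```shell
# Hypothetical application; it is assumed to call MPI_Pcontrol(1)
# (or __itt_resume()) once its initialization phase is done.
APP=./myapp

# Assumed option placement: -start-paused goes to aps, before the application.
LAUNCH_CMD="mpirun -n 2 aps -start-paused $APP"

# Launch only when both tools are on PATH; otherwise print the command
# that would be run, so the sketch is safe to try anywhere.
if command -v mpirun >/dev/null 2>&1 && command -v aps >/dev/null 2>&1; then
    $LAUNCH_CMD
else
    echo "would run: $LAUNCH_CMD"
fi
```

Since profiling begins only at the first resume call, this launch mode presumably collects nothing for an application that never calls MPI_Pcontrol(1) or __itt_resume().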

MPI Imbalance Collection

By default, APS collects and reports MPI imbalance (idle time). The APS_IMBALANCE_TYPE environment variable gives additional control over how the imbalance is calculated. Its default value depends on the setting of the APS_STAT_LEVEL environment variable. To change how the imbalance is calculated, set the APS_IMBALANCE_TYPE environment variable explicitly. For example:

export APS_IMBALANCE_TYPE=2

| Value | Description |
|---|---|
| APS_IMBALANCE_TYPE=0 | Default value if APS_STAT_LEVEL=1. Turns off the imbalance calculation. Disabling the imbalance calculation reduces the overhead of APS, but does not provide information about MPI imbalance, which is an important statistic as part of application performance analysis. For the Intel® MPI Library, imbalance (idle time) is reported at this level. |
| APS_IMBALANCE_TYPE=1 | Default value. Only Intel® MPI can collect the imbalance requested by this setting. If you use any other MPI implementation, the behavior is similar to APS_IMBALANCE_TYPE=0. |
| APS_IMBALANCE_TYPE=2 | Imbalance is calculated by calling MPI_Barrier before any collective operation and measuring the time of the call. This can provide data about application imbalance. For example, when some ranks do their computation work faster than others, they need to wait for other ranks to start the MPI collective operations. The wait time can be calculated using the MPI_Barrier call. |
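To make the wait-time idea concrete, the following sketch models three ranks that reach a collective operation at different times. The finish times are made-up numbers, and the calculation is a simplified illustration of what a barrier placed before the collective would measure, not the APS implementation.

```shell
# Hypothetical times (in seconds) at which ranks 0, 1, and 2 finish
# computing and reach the collective operation.
finish_times="3 5 8"

# A barrier releases every rank only when the slowest rank arrives.
slowest=0
for t in $finish_times; do
    if [ "$t" -gt "$slowest" ]; then
        slowest=$t
    fi
done

# Each rank's wait inside the barrier is its gap to the slowest rank.
waits=""
total_wait=0
for t in $finish_times; do
    wait_time=$((slowest - t))
    waits="$waits $wait_time"
    total_wait=$((total_wait + wait_time))
done

echo "wait per rank (s):$waits"      # prints: wait per rank (s): 5 3 0
echo "total wait (s): $total_wait"   # prints: total wait (s): 8
```

In this model the fastest rank idles for 5 seconds and the slowest rank does not wait at all, which is the kind of imbalance the barrier timing is meant to expose.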

Filter Data by Type

APS allows you to filter statistics collection by type: MPI statistics, OpenMP* statistics, or hardware counters statistics. By default, data of all types is collected.

To specify data collection types, use the -c (--collection-mode) option. As an argument, pass a comma-separated list of any of the values mpi, omp, and hwc to enable statistics collection of the specified types. Use the all argument to enable statistics collection of all types (default).

For example, to disable hardware counters statistics in an MPI application:


mpirun -n 2 aps -c mpi,omp ./myapp

Set MPI Level of Detail

For MPI applications, APS offers a multi-level approach to collecting statistics. There are five levels of detail that vary by the amount of data collected. By default, level 1 is enabled. To change the level, use the APS_STAT_LEVEL environment variable. For example:


export APS_STAT_LEVEL=2

This table summarizes available levels of detail.

| Level | Information is collected about |
|---|---|
| 1 (default) | MPI functions and their times |
| 2 | MPI functions and amount of transmitted data |
| 3 | MPI functions, communicators, and message sizes |
| 4 | MPI functions, communicators, communication directions, and aggregated traffic for each direction |
| 5 | MPI functions, communicators, message sizes, and communication directions |

Level 5 may provide too much information if an application uses a lot of communicators. In this case, consider reducing the statistics level. Also, some diagrams may be unavailable for statistics levels 1–4 because the data they require is not collected at those levels.

NOTE:

The APS_STAT_LEVEL value impacts the default value of the APS_IMBALANCE_TYPE environment variable. For more information, see MPI Imbalance Collection.
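Because the default imbalance type follows the statistics level, a job script can set both variables explicitly so that changing one does not silently change the other. The values below are illustrative choices, not recommendations.

```shell
# Collect only MPI functions and their times (level 1).
export APS_STAT_LEVEL=1

# Pin the imbalance type instead of relying on the level-dependent
# default (0 when APS_STAT_LEVEL=1, per MPI Imbalance Collection).
export APS_IMBALANCE_TYPE=2

# Show the effective settings before launching the application under aps.
echo "APS_STAT_LEVEL=$APS_STAT_LEVEL APS_IMBALANCE_TYPE=$APS_IMBALANCE_TYPE"
```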

Collect Internal IDs of Communicators

With APS and Intel MPI versions 2019 Update 4 or newer, you can use APS to collect internal IDs of communicators. The internal IDs do not change between runs as long as you maintain the same number of nodes and processes per node. To enable this function:

  • Set the APS_COLLECT_COMM_IDS environment variable to 1.
    
    export APS_COLLECT_COMM_IDS=1
  • Set APS_STAT_LEVEL to 3 or higher. These are the only levels where APS collects information about communicators.
    
    export APS_STAT_LEVEL=3
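The two settings above depend on each other, so a launch script may want to check them together. The following sketch is a hypothetical pre-flight check, not part of APS; it warns when communicator ID collection is requested at a statistics level that does not record communicators.

```shell
export APS_COLLECT_COMM_IDS=1
export APS_STAT_LEVEL=3

# Level 1 is the documented default when APS_STAT_LEVEL is unset.
level="${APS_STAT_LEVEL:-1}"

# Communicators are only recorded at level 3 or higher.
if [ "${APS_COLLECT_COMM_IDS:-0}" = "1" ] && [ "$level" -lt 3 ]; then
    comm_ids_ok=no
    echo "warning: APS_COLLECT_COMM_IDS=1 needs APS_STAT_LEVEL of 3 or higher" >&2
else
    comm_ids_ok=yes
fi

echo "communicator ID settings consistent: $comm_ids_ok"
```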