Intel® Trace Analyzer and Collector User and Reference Guide

ID 767272
Date 10/31/2024
Public

Tracing MPI Load Imbalance

Tracing all MPI events typically produces a large trace file, even for relatively small applications. To reduce the trace file size while still getting an impression of the application bottlenecks, you can trace only the MPI functions that cause application load imbalance. That is, an MPI function is traced only if it was idle at some point during the application run, causing the imbalance. This functionality is implemented in the libVTim library.

To identify the regions in the source code that caused the imbalance, you can enable source code location tracing (see Recording Source Location Information).

To generate an imbalance trace file, link your application with the libVTim library by using the -trace-imbalance option of mpirun or one of the methods described here. For example:

$ mpirun -n 2 -trace-imbalance ./myApp

Open the generated .stf file to view the results. Intel® Trace Analyzer displays only the regions of MPI idle time, so the time values shown for MPI functions are equal to their idle time.
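
The run-and-check sequence above can be sketched as a small shell script. This is a minimal sketch, not a prescribed workflow: APP and NP are hypothetical placeholders, the helper functions trace_cmd and trace_file are illustrative names, and the script assumes that the trace file is named after the executable with an .stf suffix and written to the current working directory, which may differ if the trace file name or location has been configured.

```shell
#!/bin/sh
# Sketch: run an application with imbalance tracing and look for the trace file.
# APP and NP are hypothetical placeholders; adjust them for your job.
APP="${APP:-./myApp}"
NP="${NP:-2}"

# Print the mpirun command line used in this section (for logging only).
trace_cmd() {
    printf 'mpirun -n %s -trace-imbalance %s\n' "$1" "$2"
}

# Assumption: the trace file is named <executable>.stf.
trace_file() {
    printf '%s.stf\n' "$(basename "$1")"
}

if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
    echo "Running: $(trace_cmd "$NP" "$APP")"
    mpirun -n "$NP" -trace-imbalance "$APP"
    STF="$(trace_file "$APP")"
    if [ -f "$STF" ]; then
        echo "Trace written to $STF"
    else
        # The location is an assumption; report it rather than fail silently.
        echo "No $STF found in $(pwd); check the collector output for the actual location" >&2
    fi
else
    echo "Skipping run: mpirun or $APP is not available" >&2
fi
```

If the trace file is found, it can then be opened in Intel® Trace Analyzer, for example with the traceanalyzer command if it is on your PATH.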

Known Limitations

  • This feature is currently available on Linux* OS only.
  • Point-to-point communication patterns displayed by Intel Trace Collector may be unreliable, because the libVTim library skips tracing of certain functions.
  • The library traces only those MPI functions that can potentially generate load imbalance. Therefore, non-blocking operations are not traced.
  • The library does not trace user defined events (see Tracing User Defined Events), OpenMP* regions (see Recording OpenMP* Regions Information), or system calls (see Tracing System Calls).
  • Intel Trace Collector cannot run idealization for trace files generated by libVTim.