Intel® MPI Library Developer Guide for Linux* OS

ID 768728
Date 10/31/2024
Public

Troubleshooting

This section describes typical MPI failures, the output messages they produce, and the behavior to expect when a failure occurs.

If you encounter errors or failures when using the Intel® MPI Library, take the following general troubleshooting steps first:

  1. Check the Intel® MPI Library System Requirements section and the Known Issues section in the Intel® MPI Library Release Notes.
  2. Check that the hosts are accessible. Use mpirun to run a simple non-MPI application, such as the hostname utility, on the problem hosts:
    $ mpirun -ppn 1 -n 2 -hosts node01,node02 hostname  
    node01 
    node02

    This may help reveal an environmental problem (such as an improperly configured MPI remote access mechanism) or a connectivity problem (such as unreachable hosts).

  3. Run the MPI application with debug information enabled: set the environment variables I_MPI_DEBUG=6 and/or I_MPI_HYDRA_DEBUG=on. Increase the integer debug level for more detailed output. This helps narrow the problem down to a specific component.
  4. If possible, download and install the latest version of the Intel® MPI Library from the official product page and check whether the problem persists.
  5. If the problem persists, submit a ticket via the Support page or ask the experts on the community forum.
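
Steps 2 and 3 above can be wrapped in a small set of shell functions so that the same checks are repeated consistently. The following is a minimal sketch, not an official utility: the names MPIRUN, count_hosts, check_hosts, and run_with_debug are hypothetical and chosen only for this example, and the host names and application path are placeholders. It uses only the mpirun options and environment variables shown in the steps above.

```shell
#!/bin/sh
# Sketch of troubleshooting steps 2 and 3 as reusable shell functions.
# MPIRUN, count_hosts, check_hosts, and run_with_debug are illustrative
# names chosen for this example, not part of the Intel MPI Library.

# Launcher to use; override to point at a specific installation.
MPIRUN="${MPIRUN:-mpirun}"

# Print the number of entries in a comma-separated host list.
count_hosts() {
    printf '%s\n' "$1" | tr ',' '\n' | grep -c .
}

# Step 2: run the hostname utility once on every listed host.
check_hosts() {
    "$MPIRUN" -ppn 1 -n "$(count_hosts "$1")" -hosts "$1" hostname
}

# Step 3: rerun an application with debug output enabled.
# $1 = host list, $2 = application, $3 = optional I_MPI_DEBUG level (default 6).
# The body runs in a subshell, so the exported variables do not leak
# into the calling shell.
run_with_debug() (
    export I_MPI_DEBUG="${3:-6}"
    export I_MPI_HYDRA_DEBUG=on
    "$MPIRUN" -ppn 1 -n "$(count_hosts "$1")" -hosts "$1" "$2"
)
```

After sourcing these functions, a call such as `check_hosts node01,node02 && run_with_debug node01,node02 ./my_mpi_app` would run the debug step only if the host check succeeds (./my_mpi_app is a placeholder for your own application). Passing a third argument to run_with_debug raises the debug level when more detail is needed.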