Overview
Intel® MPI Library 2019 uses libfabric exclusively for communications. The libfabric infrastructure is built around providers that implement message transfer for hardware from different vendors. The MLX provider enables the use of InfiniBand* hardware through the Mellanox UCX* framework.
Rationale
Stability and performance on InfiniBand* were suboptimal in the initial release and early updates of Intel® MPI Library 2019. The MLX provider in libfabric addresses these concerns.
Availability
The MLX provider is available in Intel® MPI Library 2019 Update 5 for Linux* as a technical preview, and as a full feature in Intel® MPI Library 2019 Update 6 for Linux*.
Requirements
- Intel® MPI Library 2019 Update 5 or higher
- Mellanox UCX* Framework v1.4 or higher (see the version check below)
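To verify the installed UCX version, you can query the ucx_info utility that ships with UCX (a minimal check, assuming ucx_info is on your PATH). The output should report the UCX library version and build configuration.
$ ucx_info -v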
Basic Usage
Ensure you are using the libfabric version provided with Intel® MPI Library. In Intel® MPI Library 2019 Update 5, the MLX provider is a technical preview and is not selected by default. To enable it, set FI_PROVIDER=mlx.
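For example, a run under Update 5 might look like the following sketch. The installation path and the application name ./myapp are placeholders, and mpivars.sh is assumed to set up the environment for the bundled libfabric:
$ source <impi_installdir>/intel64/bin/mpivars.sh   # set up the Intel MPI environment, including the bundled libfabric
$ export FI_PROVIDER=mlx                            # request the MLX provider explicitly
$ mpirun -n 4 ./myapp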
Intel® MPI Library 2019 Update 6 and later use the MLX provider by default if InfiniBand* is detected at runtime.
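To confirm which provider was actually selected, you can inspect the startup debug output. This sketch assumes that I_MPI_DEBUG=1 prints a line naming the libfabric provider (the exact wording may vary between updates); a line such as "libfabric provider: mlx" indicates that MLX is in use. ./myapp is a placeholder:
$ I_MPI_DEBUG=1 mpirun -n 2 ./myapp | grep -i provider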
Performance Tuning Options
| Option | Usage | Reference |
|---|---|---|
| I_MPI_COLL_EXTERNAL | Set to 1 to enable external collective operations (HCOLL). | I_MPI_ADJUST Family Environment Variables |
| Autotuner | Automatically tunes the application at the beginning of the run. | Autotuning |
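As an illustrative sketch, both options can be combined in a single run. I_MPI_TUNING_MODE=auto is assumed here to be the variable that enables the autotuner, and ./myapp is a placeholder:
$ export I_MPI_COLL_EXTERNAL=1     # enable HCOLL-based external collective operations
$ export I_MPI_TUNING_MODE=auto    # enable the autotuner (assumed variable name)
$ mpirun -n 16 ./myapp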
Limitations
- Dynamic process management is not yet supported as of Intel® MPI Library 2019 Update 6. Support will be implemented in a future release.
- Older InfiniBand* hardware does not support all of the expected transports. To check which transports are available:
$ ucx_info -d | grep Transport
The output should include the dc, rc, and ud transports. On older hardware, the dc transport is likely to be missing. As a workaround, set
UCX_TLS=rc,ud,sm,self
A complete sequence is sketched below this list. If none of the required transports are present, the cause is usually a driver misconfiguration, missing libraries, or another fabric software problem. Recheck your fabric configuration with one of the following commands:
$ ibv_devinfo
$ lspci | grep Mellanox
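Putting the transport check and the workaround together, a typical sequence on older hardware might look like this sketch (./myapp is a placeholder):
$ ucx_info -d | grep Transport     # list the transports available on this system
$ export UCX_TLS=rc,ud,sm,self     # restrict UCX to transports the hardware supports
$ mpirun -n 4 ./myapp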