Recommendation to rebuild Performance Scaled Messaging V3 (PSM3) to resolve error with RHEL* 8.3, Intel® Ethernet Fabric Suite (Intel® EFS)
- Performance Scaled Messaging V3 (PSM3) was rebuilt on an out-of-box Red Hat Enterprise Linux* 8.3 (RHEL* 8.3) installation.
- The E810 software was installed, which downgraded rdma-core, and the NIC drivers were rebuilt.
- A Performance Scaled Messaging V3 (PSM3) failure occurred when testing the freshly provisioned RHEL* 8.3 fabric with Intel® Ethernet Fabric Suite (Intel® EFS) v11.1.0.0.20:
[0] MPI startup(): Intel(R) MPI Library, Version 2019 Update 10 Build 20210120 (id: d7908565b)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.11.0-impi
Abort(1091215) on node 1 (rank 1 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(138)........:
MPID_Init(1139)..............:
MPIDI_OFI_mpi_init_hook(1207): OFI addrinfo() failed (ofi_init.c:1207:MPIDI_OFI_mpi_init_hook:No data available)
Performance Scaled Messaging V3 (PSM3) must be rebuilt against a version of rdma-core that is the same as or lower than the version installed on the system.
Any installation that modifies the version of rdma-core can cause this issue. This is not a bug; it is a matter of proper Linux* configuration and runtime maintenance.
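The version rule above can be sketched as a small check. This is an illustrative sketch only, not part of Intel® EFS: the helper names (`parse_version`, `needs_rebuild`, `installed_rdma_core_version`) are hypothetical, the version strings in the usage note are made-up examples, and the code assumes an RPM-based system where rdma-core versions are simple dotted numbers such as 29.0.

```python
import subprocess


def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '29.0' into a tuple of ints, (29, 0)."""
    return tuple(int(part) for part in version.strip().split("."))


def needs_rebuild(built_against: str, installed: str) -> bool:
    """Return True when PSM3 was built against a newer rdma-core than is installed.

    Hypothetical helper: it encodes the rule that the build-time rdma-core
    version must be the same as or lower than the installed version.
    """
    return parse_version(built_against) > parse_version(installed)


def installed_rdma_core_version() -> str:
    """Query rpm for the installed rdma-core version (RPM-based systems only)."""
    result = subprocess.run(
        ["rpm", "-q", "--queryformat", "%{VERSION}", "rdma-core"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if rdma-core is not installed
    )
    return result.stdout.strip()
```

For example, with the hypothetical values `needs_rebuild("32.0", "29.0")`, the check reports that a rebuild is needed, because the build-time version is higher than the installed one. On a live system, the second argument could come from `installed_rdma_core_version()`; the build-time version has to be recorded separately when PSM3 is built.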