Best Known Methods: Firewall Blocks MPI Communication among Nodes

ID 659077
Updated 12/12/2017
Version Latest
Public

This article shares three methods you can use when a firewall blocks Message Passing Interface (MPI) communication among machines. For example, when running an MPI program between two machines, you might see a communication error like this:

[proxy:0:1@knl-sb0] HYDU_sock_connect (../../utils/sock/sock.c:268): unable to connect from "knl-sb0" to "knc4" (No route to host)
[proxy:0:1@knl-sb0] main (../../pm/pmiserv/pmip.c:461): unable to connect to server knc4 at port 39652 (check for firewalls!)

This symptom suggests the MPI ranks cannot communicate with each other, because the firewall blocks the MPI communication.
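
When the error appears in a longer log, it can help to pull out which host and port the proxy failed to reach, so you know which machine's firewall to inspect. The following is a minimal sketch that assumes the message format shown above; extract_target is a hypothetical helper name, not part of Intel MPI.

```shell
#!/bin/sh
# Sketch only: extract_target is a hypothetical helper, not an Intel MPI tool.
# It assumes the log line has the form shown in this article:
#   "... unable to connect to server <host> at port <port> ..."

extract_target() {
    # Reads log text on stdin and prints "<host> <port>" for each matching line.
    sed -n 's/.*unable to connect to server \([^ ]*\) at port \([0-9][0-9]*\).*/\1 \2/p'
}

# Example usage (mpirun.log is a made-up file name):
#   extract_target < mpirun.log
```

In the example error, the host is knc4, so that is the machine whose firewall needs attention. The port number in the message can differ from run to run, which is one reason the methods below allow a whole peer host rather than a single port.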

Below are three methods to help you solve this problem.

First Method: Stop the firewalld daemon

The first and simplest method is to stop the firewall on the machine where you run the MPI program. First, check the status of the firewalld daemon on a Red Hat Enterprise Linux* (RHEL*) or CentOS* system.

$ systemctl status firewalld
firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2017-12-05 21:36:10 PST; 12min ago
 Main PID: 47030 (firewalld)
   CGroup: /system.slice/firewalld.service
           47030 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid

The output shows that firewalld is running. You can stop it and verify its status with the following command lines:

$ sudo systemctl stop firewalld
$ systemctl status firewalld
firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Tue 2017-12-05 21:51:19 PST; 4s ago
  Process: 48062 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 48062 (code=exited, status=0/SUCCESS)

With firewalld now stopped, you should be able to run your MPI program between the two machines (this example uses IMB-MPI1 from the Intel® MPI Benchmarks as the MPI program).

$ mpirun -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : -host 10.23.3.61 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
#------------------------------------------------------------
# Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date : Tue Dec 5 21:51:45 2017
# Machine : x86_64
# System : Linux
# Release : 3.10.0-327.el7.x86_64
# Version : #1 SMP Thu Nov 19 22:10:57 UTC 2015
# MPI Version : 3.1
# MPI Thread Environment:
# Calling sequence was:
# /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# Sendrecv
#---------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#---------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        16.57        16.57        16.57         0.00
            1         1000        16.57        16.57        16.57         0.12
            2         1000        16.52        16.53        16.53         0.24
            4         1000        16.58        16.58        16.58         0.48
            8         1000        16.51        16.51        16.51         0.97
           16         1000        16.20        16.20        16.20         1.98
           32         1000        16.32        16.32        16.32         3.92
           64         1000        16.55        16.55        16.55         7.73
          128         1000        16.65        16.65        16.65        15.37
          256         1000        29.07        29.09        29.08        17.60
          512         1000        30.75        30.76        30.76        33.29
         1024         1000        31.13        31.15        31.14        65.75
         2048         1000        33.58        33.58        33.58       121.98
         4096         1000        34.79        34.80        34.80       235.38

However, this method leaves the machine without firewall protection, so it may not be acceptable in some environments. In that case, start the firewalld daemon again, and then try the second method.

$ sudo systemctl start firewalld
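
If you only want the firewall down for the duration of a single test run, the stop, run, and start steps above can be wrapped in one function so the restart is not forgotten. This is a minimal sketch, not a hardened tool: with_firewalld_stopped and the SYSTEMCTL variable are hypothetical names introduced here, and the function does not handle interruption by a signal.

```shell
#!/bin/sh
# Sketch only: with_firewalld_stopped and SYSTEMCTL are hypothetical names.
# SYSTEMCTL can be set to "echo" to dry-run the sequence without touching the system.
SYSTEMCTL="${SYSTEMCTL:-sudo systemctl}"

with_firewalld_stopped() {
    # Stop firewalld; give up if that fails.
    $SYSTEMCTL stop firewalld || return 1
    # Run the command passed as arguments and remember its exit status.
    "$@"
    status=$?
    # Start firewalld again whether or not the command succeeded.
    $SYSTEMCTL start firewalld
    return "$status"
}

# Example usage with the benchmark command from this article:
#   with_firewalld_stopped mpirun -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : \
#       -host 10.23.3.61 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
```

The function returns the exit status of the wrapped command, so a failed MPI run is still visible to the caller after the firewall has been started again.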

Second Method: Use Rich Rule in firewalld

This method uses the Rich Rule feature in firewalld to accept IPv4 packets from the other machine (IP address 10.23.3.61), while leaving the firewall active for all other traffic.

$ sudo firewall-cmd --add-rich-rule='rule family="ipv4" source address="10.23.3.61" accept'
success

Verify the rule you just added.

$ firewall-cmd --list-rich-rules
rule family="ipv4" source address="10.23.3.61" accept

Run the MPI program.

$ mpirun -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : -host 10.23.3.61 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
#------------------------------------------------------------
# Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date : Tue Dec 5 22:01:17 2017
# Machine : x86_64
# System : Linux
# Release : 3.10.0-327.el7.x86_64
# Version : #1 SMP Thu Nov 19 22:10:57 UTC 2015
# MPI Version : 3.1
# MPI Thread Environment:
# Calling sequence was:
# /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# Sendrecv
#---------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#---------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        16.88        16.88        16.88         0.00
            1         1000        16.86        16.86        16.86         0.12
            2         1000        16.57        16.57        16.57         0.24
            4         1000        16.55        16.55        16.55         0.48
            8         1000        16.40        16.40        16.40         0.98
           16         1000        16.29        16.29        16.29         1.96
           32         1000        16.63        16.63        16.63         3.85
           64         1000        16.87        16.87        16.87         7.59
          128         1000        17.03        17.04        17.03        15.03
          256         1000        27.58        27.60        27.59        18.55
          512         1000        27.52        27.54        27.53        37.18
         1024         1000        26.87        26.89        26.88        76.16
         2048         1000        28.62        28.64        28.63       143.02
         4096         1000        30.27        30.27        30.27       270.62
^C[mpiexec@knc4] Sending Ctrl-C to processes as requested
[mpiexec@knc4] Press Ctrl-C again to force abort

You can remove a Rich Rule that you defined by entering the following command:

$ sudo firewall-cmd --remove-rich-rule='rule family="ipv4" source address="10.23.3.61" accept'
success
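
When more than two machines take part in the MPI job, each node needs a rule of this form for every peer. The following sketch only prints the firewall-cmd commands so you can review them before running anything; rich_rule_for and print_rich_rule_commands are hypothetical helper names, and the second address in the usage comment is made up for illustration.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers for this article.

rich_rule_for() {
    # Prints the rich rule text that accepts IPv4 traffic from the address in $1.
    printf 'rule family="ipv4" source address="%s" accept\n' "$1"
}

print_rich_rule_commands() {
    # Prints one firewall-cmd command per peer address given as an argument.
    # Nothing is executed; the output is meant to be reviewed first.
    for peer in "$@"; do
        printf "sudo firewall-cmd --add-rich-rule='%s'\n" "$(rich_rule_for "$peer")"
    done
}

# Example usage (10.23.3.62 is a made-up second peer):
#   print_rich_rule_commands 10.23.3.61 10.23.3.62
```

Rules added this way apply to the running configuration only. firewall-cmd also accepts a --permanent option if the rule should survive a reload or reboot; whether that is appropriate depends on your site's policy.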

Third Method: Add a Rule in iptables-services to Accept Packets from Other Machines

In addition to firewalld, the iptables-services package can also be used to manage the firewall on a RHEL or CentOS system. In this method, you add a rule that allows traffic from the other machine.

First, download and install the iptables-services package.

$ sudo yum install iptables-services

Next, start the iptables service.

$ sudo systemctl start iptables
$ systemctl status iptables
iptables.service - IPv4 firewall with iptables
   Loaded: loaded (/usr/lib/systemd/system/iptables.service; disabled; vendor preset: disabled)
   Active: active (exited) since Tue 2017-12-05 21:53:41 PST; 55s ago
  Process: 49042 ExecStart=/usr/libexec/iptables/iptables.init start (code=exited, status=0/SUCCESS)
 Main PID: 49042 (code=exited, status=0/SUCCESS)
Dec 05 21:53:41 knc4-jf-intel-com systemd[1]: Starting IPv4 firewall with iptables...
Dec 05 21:53:41 knc4-jf-intel-com iptables.init[49042]: iptables: Applying firewall rules: [ OK ]
Dec 05 21:53:41 knc4-jf-intel-com systemd[1]: Started IPv4 firewall with iptables.

The firewall rules are defined in the /etc/sysconfig/iptables file.

$ sudo cat /etc/sysconfig/iptables
# sample configuration for iptables service
# you can edit this manually or use system-config-firewall
# please do not ask us to add additional ports/services to this default configuration
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT

Display the currently defined direct rules; there should not be any yet. Then add a rule that accepts packets from the other machine, specifying its IP address.

$ firewall-cmd --direct --get-all-rules
$ sudo firewall-cmd --direct --add-rule ipv4 filter INPUT 0 -s 10.23.3.61 -j ACCEPT
success
$ firewall-cmd --direct --get-all-rules
ipv4 filter INPUT 0 -s 10.23.3.61 -j ACCEPT

The second --get-all-rules call confirms that the new rule has been added. Run the MPI program again to verify it works.

$ mpirun -host localhost -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv : -host 10.23.3.61 -n 1 /opt/intel/impi/2018.0.128/bin64/IMB-MPI1
#------------------------------------------------------------
# Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date : Tue Dec 5 21:58:20 2017
# Machine : x86_64
# System : Linux
# Release : 3.10.0-327.el7.x86_64
# Version : #1 SMP Thu Nov 19 22:10:57 UTC 2015
# MPI Version : 3.1
# MPI Thread Environment:
# Calling sequence was:
# /opt/intel/impi/2018.0.128/bin64/IMB-MPI1 Sendrecv
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# Sendrecv
#---------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
#---------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        16.49        16.49        16.49         0.00
            1         1000        16.40        16.40        16.40         0.12
            2         1000        16.40        16.40        16.40         0.24
            4         1000        16.86        16.86        16.86         0.47
            8         1000        16.43        16.43        16.43         0.97
           16         1000        16.32        16.32        16.32         1.96
           32         1000        16.64        16.64        16.64         3.85
           64         1000        16.90        16.90        16.90         7.57
          128         1000        16.86        16.86        16.86        15.18
          256         1000        29.58        29.60        29.59        17.30
          512         1000        27.73        27.74        27.74        36.91
         1024         1000        28.07        28.09        28.08        72.91
         2048         1000        34.95        34.97        34.96       117.15
         4096         1000        36.22        36.23        36.22       226.12
^C[mpiexec@knc4] Sending Ctrl-C to processes as requested
[mpiexec@knc4] Press Ctrl-C again to force abort

To remove this rule, use the following command.

$ sudo firewall-cmd --direct --remove-rule ipv4 filter INPUT 0 -s 10.23.3.61 -j ACCEPT
success
$ firewall-cmd --direct --get-all-rules
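
To check from a script whether an accept rule for a given peer is present (after adding it) or gone (after removing it), you can search the --get-all-rules output. This is a minimal sketch that assumes the output format shown in this article; has_accept_rule_for is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: has_accept_rule_for is a hypothetical helper name.
# It assumes rules are listed as in this article, for example:
#   ipv4 filter INPUT 0 -s 10.23.3.61 -j ACCEPT

has_accept_rule_for() {
    # $1: source IP address to look for.
    # stdin: output of "firewall-cmd --direct --get-all-rules".
    # Exit status 0 if a matching ACCEPT rule is listed, non-zero otherwise.
    grep -F -q -e "-s $1 -j ACCEPT"
}

# Example usage:
#   if firewall-cmd --direct --get-all-rules | has_accept_rule_for 10.23.3.61; then
#       echo "rule present"
#   else
#       echo "rule missing"
#   fi
```

The fixed-string match includes the space after the address, so a rule for 10.23.3.6 is not mistaken for one covering 10.23.3.61.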

Conclusion

Firewalls can block MPI communication among the nodes. This article shared three methods you can use to allow communication among MPI ranks.
