Parallel job with Intel MPI hangs

Category: 
Resolution: Resolved

Some commercial packages, for example Fluent or StarCCM, use Intel MPI with SSH as the default bootstrap mechanism to launch the Hydra process manager. Since a recent Slurm upgrade, these packages can fail at startup, which causes the job to hang. Starting with Slurm version 23.11, the environment variable I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS=--external-launcher is added to the job environment because Slurm is set as the default bootstrap system (I_MPI_HYDRA_BOOTSTRAP=slurm). However, this setting causes problems when SSH is used as the bootstrap system instead.

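As a workaround, add export -n I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS to the job script before the command that launches the application. In bash, export -n removes the variable from the environment passed to child processes, so the launcher no longer inherits it.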
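
The workaround might look like the following in a bash job script. This is a minimal sketch, not an official OSC template: the #SBATCH values are placeholder assumptions, the is_exported helper is a hypothetical function added only to confirm the effect, and the application launch line is left as a comment to be replaced with your own command.

```shell
#!/bin/bash
#SBATCH --nodes=2               # placeholder resource request (assumption)
#SBATCH --ntasks-per-node=4     # placeholder resource request (assumption)
#SBATCH --time=00:10:00         # placeholder walltime (assumption)

# Hypothetical helper: succeeds if the named variable is visible
# in the environment that child processes would inherit.
is_exported() {
    env | grep -q "^$1="
}

# Workaround from this notice: drop the export attribute so the
# application's MPI launcher does not inherit the variable.
# (export -n is a bash builtin option; the value stays set in this shell.)
export -n I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS

# Optional sanity check before starting the application.
if is_exported I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS; then
    echo "I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS is still exported" >&2
else
    echo "I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS is not exported"
fi

# Launch the application here (replace with your own command).
```

The export -n line must appear before the launch command, since a process inherits its environment at the moment it starts.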