Rolling reboot of compute and login nodes on all clusters, starting Wednesday morning, March 22, 2017
4:56PM 3/28/2017 Update: The rolling reboots of all systems are complete.
Some MVAPICH2 MPI installations on Oakley, Ruby, and Owens, including the default module mvapich2/2.2 and mvapich2/2.1, appear to have a bug that certain programs trigger. An affected program either 1) hangs or 2) fails with an error related to Allreduce or Bcast.
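
To help tell the two symptoms apart, a job can be run under a time limit and its output scanned for the collective names. The following is a minimal sketch, not an official diagnostic. The function name `classify_run`, the 60-second default timeout, and the search pattern are all assumptions made for illustration, since the exact error text is not given here. The demo line uses the Python interpreter as a stand-in for a real MPI launch command.

```python
import re
import subprocess
import sys

# Assumed pattern: the notice only says the error is "related to Allreduce
# or Bcast", so the real message text may differ.
COLLECTIVE_PATTERN = re.compile(r"allreduce|bcast", re.IGNORECASE)


def classify_run(cmd, timeout=60.0):
    """Run cmd (a list of arguments) and classify the outcome.

    Returns one of: "hang", "collective_error", "other_failure", "ok".
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # The child is killed once the timeout passes; treat this as a hang.
        return "hang"

    if result.returncode == 0:
        return "ok"

    combined = result.stdout + result.stderr
    if COLLECTIVE_PATTERN.search(combined):
        return "collective_error"
    return "other_failure"


# Stand-in command for demonstration only; replace it with the actual
# launch command for the program being checked.
print(classify_run([sys.executable, "-c", "pass"]))
```

A timeout alone cannot distinguish a true hang from a job that simply needs more time, so the limit should be set well above the program's normal run time.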