Rolling reboot of Ascend, Owens and Pitzer starting from Oct 25 2023

Update on Nov 8 2023:

Rolling reboots of all clusters have been completed.

Update on Nov 3 2023:

Rolling reboots of the Ascend and Pitzer clusters have been completed.

Original Post:

We will perform rolling reboots of the Ascend, Owens and Pitzer clusters, including login and compute nodes, starting at 9 AM on Wednesday, October 25, 2023, to apply NVIDIA driver and Slurm upgrades.

Missing shared library in some mvapich2 modules

Update on Feb 25 2022:

This issue is fixed. 

Original Post:

Users may encounter a missing shared library error with some mvapich2 modules on Pitzer and Owens. The error looks like this:

<path_to_executable>: error while loading shared libraries: libim_client.so.0: cannot open shared object file: No such file or directory

We are rebuilding the affected mvapich2 versions.
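To check whether an executable is affected before submitting a job, one option is to inspect the output of ldd, which lists unresolved dependencies as "<name> => not found". The following is a minimal sketch in Python, not an official OSC tool; the function name missing_libraries and the sample ldd output are hypothetical and used only for illustration.

```python
import re


def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # ldd prints an unresolved dependency as "<name> => not found"
        match = re.match(r"\s*(\S+)\s+=>\s+not found", line)
        if match:
            missing.append(match.group(1))
    return missing


# Hypothetical ldd output for an executable built with an affected module
sample = """\
\tlinux-vdso.so.1 (0x00007ffd0a5f2000)
\tlibim_client.so.0 => not found
\tlibc.so.6 => /lib64/libc.so.6 (0x00007f3c1a200000)
"""

print(missing_libraries(sample))  # prints ['libim_client.so.0']
```

In practice, the text passed to the function would come from running ldd <path_to_executable> with the same modules loaded that the job will use. An empty result means the dynamic loader can resolve every listed library in that environment.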

CP2K 6.1 fails on Pitzer Cascade Lake (48-core) nodes

CP2K 6.1 fails with the following error when running on a Pitzer Cascade Lake (48-core) node:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:

This could be a bug in libxsmm 1.9.0, which was released on Mar 15, 2018, before Cascade Lake launched in 2019. The bug has been fixed in cp2k/7.1.