Rolling reboot of Pitzer cluster, starting from Feb 03, 2021
We will have a rolling reboot of the Pitzer cluster. Users can expect a ~10-minute outage of the login nodes at about 9 AM on Feb 05, 2021.
We have found that some types of CP2K jobs fail or show poor performance when using cp2k.popt and cp2k.psmp from the MVAPICH2 build (gnu/4.8.5 mvapich2/2.3). This version will be removed on December 15th, 2020. Please switch to the Intel MPI build (gnu/7.3.0 intelmpi/2018.3).
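For example, switching to the Intel MPI build before loading CP2K might look like the following (the compiler and MPI module names are taken from this notice; the exact cp2k module name is assumed and may differ):
module load gnu/7.3.0 intelmpi/2018.3
module load cp2k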
A partial-node MPI job may fail to start when using mpiexec from intelmpi/2019.3 or intelmpi/2019.7, with error messages like
OSC is currently experiencing problems with its internal network. Interactive sessions may be slow or unresponsive, but running jobs should not be affected.
Users may encounter MPI job failures with openmpi/3.1.0-hpcx on Owens and Pitzer. The job stops with an error like "There are not enough slots available in the system to satisfy the slots". Please switch to openmpi/3.1.4-hpcx. The buggy version openmpi/3.1.0-hpcx will be removed on August 18, 2020.
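As an illustrative sketch (module names taken from this notice), the switch could be done with:
module unload openmpi/3.1.0-hpcx
module load openmpi/3.1.4-hpcx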
==========
Resolved: We removed openmpi/3.1.0-hpcx on August 18, 2020.
Users may be unable to unload the Intel software stack via module rm intel after switching between intel and other compilers. This is a known issue with the current versions of Lmod on Pitzer and Ruby. The issue will be fixed by upgrading Lmod during the system-wide downtime on May 19th.
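Until then, a possible workaround (an assumption, not part of the original notice) is to clear all loaded modules and reload only what you need, for example the gnu/7.3.0 compiler mentioned elsewhere on this page:
module purge
module load gnu/7.3.0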
The CUDA debugger, cuda-gdb, can raise a segmentation fault immediately upon execution. A workaround before executing cuda-gdb is to unload the xalt module, e.g.:
module unload xalt
This issue affects most cuda modules on Pitzer and Owens.
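For instance, a debugging session would then start like this (./a.out is a placeholder for your own CUDA application):
module unload xalt
cuda-gdb ./a.out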
Users may encounter an error like 'libim_client.so: undefined reference to `uuid_unparse@UUID_1.0' while compiling MPI applications with mvapich2 in some Conda environments. We found that the pre-installed libuuid package from Conda conflicts with the system libuuid libraries. The affected Conda packages are python/2.7-conda5.2, python/3.6-conda5.2 and python/3.7-2019.10.
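One way to check whether a Conda environment you own pulls in its own libuuid, and to remove it, might be the following (an assumption, not an official fix; it does not apply to the centrally installed python modules themselves):
conda list libuuid
conda remove --force libuuid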
Users may encounter under-performing MPI jobs or failures when compiling MPI applications if they are using the system-provided Conda. We found that the pre-installed mpich2 package in some Conda environments overrides the default MPI path. The affected Conda packages are python/2.7-conda5.2 and python/3.6-conda5.2. If you experience these issues, please re-load the MPI module, e.g. module load mvapich2/2.3.2, after setting up your Conda environment.
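For example, in a job script the order might look like this (the environment name myenv is hypothetical; mvapich2/2.3.2 is the module quoted above):
module load python/3.6-conda5.2
source activate myenv
module load mvapich2/2.3.2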
Batch job output may contain warnings about excessive memory usage.