Users may have been experiencing job failures on the Owens cluster since April 16, 2018.
The CUDA debugger, cuda-gdb, can raise a segmentation fault immediately upon execution. A workaround before executing cuda-gdb is to unload the xalt module, e.g.:
module unload xalt
This issue affects most cuda modules on Pitzer and Owens.
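For example, a debugging session with this workaround might look like the following, where my_app is only a placeholder for your executable:
module unload xalt
cuda-gdb ./my_app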
Users may encounter an error like 'libim_client.so: undefined reference to `uuid_unparse@UUID_1.0' while compiling MPI applications with mvapich2 in some Conda environments. We found that the pre-installed libuuid package from Conda conflicts with the system libuuid libraries. The affected Conda packages are
Users may encounter under-performing MPI jobs or failures when compiling MPI applications if they are using the system-provided Conda. We found that the pre-installed mpich2 package in some Conda environments overrides the default MPI path. The affected Conda packages are
python/3.6-conda5.2. If users experience these issues, please re-load the MPI module after setting up the Conda environment, e.g.
module load mvapich2/2.3.2
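As a sketch of the recommended order, assuming the system Conda module python/3.6-conda5.2 and an environment named myenv (the environment name is only a placeholder):
module load python/3.6-conda5.2
source activate myenv
# re-load the MPI module last, so it takes precedence over the Conda-provided mpich2
module load mvapich2/2.3.2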
A batch job's output may contain warnings about excessive memory usage.
We will have rolling reboots of the Owens and Pitzer clusters, including login and compute nodes, starting on Monday, February 3, 2020.
We have found that large MPI jobs may hang at startup with
mvapich2/2.3.1 (on any compiler dependency) due to a known bug that has been fixed in release 2.3.2. If users experience this issue, please switch to mvapich2/2.3.2.
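For example, a job script that currently loads the affected version could switch to the fixed release like so:
module unload mvapich2/2.3.1
module load mvapich2/2.3.2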
We are having rolling reboots of the Owens cluster, including login and compute nodes, starting on October 10, 2019.
NCBI blocks any connection from compute nodes because they are behind firewalls. As a result, OSC users cannot use SRA tools to download data "on the fly" at runtime on compute nodes, e.g. 'fastq-dump -X 5 SRR390728'. OSC users must instead download SRA data on a login node using the command 'prefetch' before any sequence analysis. Please see the section 'Download SRA Data' on the SRA Toolkit software page for more detail.
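As an illustration, using the SRR390728 accession from the example above, the workflow is to prefetch on a login node and then run the conversion inside the batch job:
# on a login node, before submitting the job
prefetch SRR390728
# inside the batch job, fastq-dump now reads from the prefetched local copy
fastq-dump -X 5 SRR390728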
We have found that recent MPI jobs using openmpi/1.10-hpcx and openmpi/2.0-hpcx on Owens may receive a segmentation fault; the job may appear to complete, or may hang until it is killed. Applications that may be affected, if run with the openmpi versions mentioned above, include orca, openfoam and lammps. OSC users can use other compatible versions, e.g. openmpi/1.10.7-hpcx and openmpi/2.1.6-hpcx. Please check the available versions on the software page.
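For example, a job currently using one of the affected builds could be switched to a compatible one like this (please confirm the exact versions on the software page first):
module unload openmpi/1.10-hpcx
module load openmpi/1.10.7-hpcx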