Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Title Category Resolution Description Posted Updated
OSC OnDemand is not responsive OnDemand Resolved

OSC OnDemand is currently not responsive, and we are investigating the problem. Please use another method, such as SSH, to connect to OSC HPC systems.

We apologize for any inconvenience this may cause...

5 years 10 months ago 5 years 10 months ago
Scratch filesystem is down filesystem, OnDemand Resolved

Updated at 2:30pm on Feb 1st:

Scratch filesystem is back. OnDemand is also available now. 

Original Post:

Scratch filesystem is down now...

6 years 11 months ago 6 years 11 months ago
MVAPICH2 and/or STAR-CCM+ MPI job failure and workaround Cardinal, Software Resolved (workaround)

...

1 year 3 months ago 6 months 3 weeks ago
GPFS hang Issue on 09/08/2016 filesystem Resolved

On Thursday, Sept 8, starting at 19:37, we had a bad interaction that appears to have been caused by the backup client and the GPFS servers. This resulted in a GPFS hang that propagated I/O...

9 years 4 months ago 9 years 4 months ago
CP2K 6.1 would fail on Pitzer Cascade Lakes (48-core) node: Pitzer Resolved (workaround)

CP2K 6.1 would fail with the following error when running on a Pitzer Cascade Lakes (48-core) node:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic...
4 years 6 months ago 8 months 1 week ago
NCCL hang on Ascend dual-GPU nodes Ascend, GPU, Software Resolved (workaround)

Users may encounter the following message and experience NCCL hangs if the first operation is a barrier when running multi-GPU training. We have identified...

7 months 3 weeks ago 7 months 2 weeks ago
8AM 9/11/13 - Brief network disruption to reboot a switch Network Resolved

At 8AM on September 11, 2013, we will be rebooting a network switch to replace a failed card in the switch. The network will be disrupted for 10 to 15 minutes while the work is done. Filesystem mounts...

12 years 4 months ago 12 years 3 months ago
Owens batch is down Owens Resolved

Updated at 9:07PM on Dec 20, 2017:

Owens batch was restored by updating the Torque resource manager at 6:37pm on Dec 19, 2017.

Original Post at 4:45PM on Dec 19...

8 years 1 month ago 8 years 1 month ago
openmpi/4.1.1 is deprecated Resolved

openmpi/4.1.1-hpcx will be removed on November 29th, 2022 due to an InfiniBand driver (MOFED) update. Please use the compatible, bug-fixed version 'openmpi/4.1.2-hpcx' to run ORCA or your MPI...

3 years 1 month ago 3 years 1 month ago
MVAPICH 3.0 hang due to PMI mismatch with Slurm Software Resolved (workaround)

Applications such as Quantum ESPRESSO, LAMMPS, and NWChem experienced hangs with MVAPICH 3.0 due to a PMI mismatch. MVAPICH 3.0 was built with PMI-1, while newer Slurm versions on RHEL 9...

3 months 1 week ago 1 month 2 days ago