Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem for which a permanent workaround is available; the workaround may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title: CP package hang with MVAPICH3 in Quantum-Espresso
Category: Software
Resolution: Resolved (workaround)
Description: While using MVAPICH3 builds of Quantum ESPRESSO (QE), users may encounter hangs when running the CP package, which can lead to job timeouts. We recommend switching to the OpenMPI build of any QE...
Posted: 1 week 1 day ago
Updated: 1 day 19 hours ago

Title: MVAPICH2 and/or STAR-CCM+ MPI job failure and workaround
Category: Cardinal, Software
Resolution: Resolved (workaround)
Description: ...
Posted: 8 months 3 weeks ago
Updated: 6 days 15 hours ago

Title: OneDrive Connector File Transfer Issue with Globus
Resolution: Resolved
Description: Update: We deployed the fix from Globus: https://docs.globus.org/globus-connect-server/...
Posted: 3 weeks 2 days ago
Updated: 2 weeks 1 day ago

Title: OpenMPI 4 and NVHPC MPI Compatibility Issues with SLURM HWLOC
Category: Ascend, Cardinal, Software
Resolution: Resolved (workaround)
Description: A pure MPI application using mpirun or mpiexec with more ranks than the number of NUMA nodes may encounter an error similar to the following:...
Posted: 3 months 2 weeks ago
Updated: 2 weeks 2 days ago

Title: Cannot use mpiexec/mpirun from OpenMPI in an interactive session
Category: Owens, Pitzer, Software
Resolution: Resolved (workaround)
Description: We found mpiexec/mpirun from OpenMPI can not be used in an interactive session (launched by sinteractive) after upgrading Pitzer and...
Posted: 4 years 3 months ago
Updated: 2 weeks 2 days ago

Title: LS-DYNA mpp-dyna Cardinal: Remote access error on mlx5_0:1, RDMA_READ
Category: Cardinal, Software
Resolution: Resolved (workaround)
Description: You may encounter the following error while running mpp-dyna jobs with multiple nodes:
[c0054:22206:0:22206] ib_mlx5_log.c:179  Remote access error on mlx5_0:1/IB (synd 0x13 vend 0x88...
Posted: 5 months 1 week ago
Updated: 2 weeks 2 days ago

Title: OSC Service Outage Notification
Category: Outage
Resolution: Resolved
Description: We are currently experiencing outages affecting multiple services, including OnDemand (ondemand.osc.edu) and login nodes of HPC systems. Our team is actively investigating and working to resolve...
Posted: 3 weeks 13 hours ago
Updated: 2 weeks 6 days ago

Title: Instability on Clusters after May 13 Downtime
Resolution: Resolved
Description: We've been experiencing some instability on the clusters (particularly Cardinal and Ascend) following the recent May 13 downtime, especially with parallel job processing. If you notice any unusual...
Posted: 1 month 2 weeks ago
Updated: 1 month 18 hours ago

Title: Resolved: Home directory space Issue with MATLAB 2024a
Category: Software
Resolution: Resolved
Description: Users may experience their home directory running out of space after executing multiple MATLAB 2024a jobs. This issue is caused by the accumulation of multiple copies of the MathWorks Service...
Posted: 1 month 19 hours ago
Updated: 1 month 19 hours ago

Title: NCCL hang on Ascend dual-GPU nodes
Category: Ascend, GPU, Software
Resolution: Resolved (workaround)
Description: Users may encounter the following message and experience NCCL hangs if the first operation is a barrier when running multi-GPU training. We have identified...
Posted: 1 month 1 week ago
Updated: 1 month 1 day ago
