Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
A partial-node MPI job failed to start using Intel MPI mpiexec Owens, Pitzer, Software Resolved (workaround)

A partial-node MPI job may fail to start using mpiexec from intelmpi/2019.3 and intelmpi/2019.7 with error messages like

[mpiexec@o0439.ten.osc....
5 years 3 months ago 9 months 1 day ago
MVAPICH2 build of CP2K 6.1 Pitzer Resolved

We have found that some types of CP2K jobs fail or perform poorly when using cp2k.popt and cp2k.psmp from the MVAPICH2 build (gnu/4.8.5 mvapich2/2.3). This version will be removed on December 15th...

5 years 2 months ago 4 years 11 months ago
XDMOD outage: 9AM-4PM June 2 2021 Outage Resolved

There is a scheduled outage for the XDMOD tool (xdmod.osc.edu) between 9AM-4PM June 2 2021 for upgrading to 9.5.0. During the outage, XDMOD will be in maintenance mode and not accessible by OSC users. ...

4 years 8 months ago 4 years 8 months ago
ORCA Bind to CORE Failure Software Resolved (workaround)

The default CPU binding for ORCA jobs can fail sporadically.  The failure is almost immediate and produces a cryptic error message, e.g.:

...
2 years 9 months ago 9 months 1 day ago
PyTorch hangs on dual-GPU node on Ascend Ascend, GPU Resolved (workaround)

PyTorch can hang on Ascend on dual-GPU nodes

Through internal testing, we have confirmed that the hang issue only occurs on Ascend dual-GPU (nextgen) nodes. We’re still unsure why...

9 months 2 weeks ago 9 months 1 week ago
STAR error bgzf_open: Assertion failed Cardinal, Software Resolved (workaround)

You may encounter errors that look similar to these when running STAR 2.7.10b:

STAR: bgzf.c:158: bgzf_open: Assertion `compressBound(0xff00) < 0x10000' failed.

Cause...

8 months 4 weeks ago 8 months 2 weeks ago
Instability on Clusters after May 13 Downtime Resolved

We've been experiencing some instability on the clusters (particularly Cardinal and Ascend) following the recent May 13 downtime, especially with parallel job processing. If you notice any unusual...

8 months 4 weeks ago 8 months 1 week ago
Python version mismatch in Jupyter + Spark instance Software Resolved (workaround)

You may encounter the following error message when running a Spark instance using a custom kernel in the Jupyter + Spark app:

25/04/25 10:49:01 WARN TaskSetManager:...
8 months 4 weeks ago 2 months 11 hours ago
NCCL hang on Ascend dual-GPU nodes Ascend, GPU, Software Resolved (workaround)

Users may encounter the following message and experience NCCL hangs if the first operation is a barrier when running multi-GPU training. We have identified...

8 months 2 weeks ago 8 months 1 week ago
Resolved: Home directory space Issue with MATLAB 2024a Software Resolved

Users may experience their home directory running out of space after executing multiple MATLAB 2024a jobs. This issue is caused by the accumulation of multiple copies of the MathWorks Service...

8 months 1 week ago 8 months 1 week ago
