Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title: PyTorch hangs on dual-gpu node on Ascend
Category: Ascend, GPU
Resolution: Resolved (workaround)
Description: PyTorch can hang on Ascend on dual-GPU nodes
Through internal testing, we have confirmed that the hang issue only occurs on Ascend dual-GPU (nextgen) nodes. We’re still unsure why...
Posted: 2 months 2 days ago
Updated: 1 month 3 weeks ago

Title: MPI_THREAD_MULTIPLE is not supported with OpenMPI-HPCX 4.x
Category: Owens, Software
Resolution: Resolved
Description: Threaded MPI code that calls MPI_Init_thread with MPI_THREAD_MULTIPLE will fail because UCX from the HPCX package is built without multi-threading enabled. UCX is the...
Posted: 2 years 4 months ago
Updated: 2 months 12 hours ago

Title: HCOLL-related failures in OpenMPI applications
Category: Cardinal, Software
Resolution: Resolved (workaround)
Description: Several applications using OpenMPI, including HDF5, Boost, Rmpi, ORCA, and CP2K, may fail with errors such as
mca_coll_hcoll_module_enable() coll_hcol: mca_coll_hcoll_save_coll_handlers...
Posted: 8 months 2 weeks ago
Updated: 2 months 12 hours ago

Title: Handling full-node MPI warnings with MVAPICH 3.0
Category: Ascend, Cardinal
Resolution: Resolved (workaround)
Description: When running a full-node MPI job with MVAPICH 3.0, you may encounter the following warning message:
[][mvp_generate_implicit_cpu_mapping] WARNING: You appear to be running at full...
Posted: 8 months 2 days ago
Updated: 2 months 12 hours ago

Title: HWloc warning: Failed with: intersection without inclusion
Category: Ascend, Cardinal
Resolution: Resolved (workaround)
Description: When running MPI+OpenMP hybrid code with the Intel Classic Compiler and MVAPICH 3.0, you may encounter the following warning message from hwloc:
...
Posted: 8 months 2 days ago
Updated: 2 months 12 hours ago

Title: Abaqus Parallel Job Failure with PMPI Due to Out-of-Memory (OOM) Error
Category: Cardinal
Resolution: Resolved (workaround)
Description: You may encounter the following error while running an Abaqus parallel job with PMPI:
Traceback (most recent call last):
 File "SMAPylModules/SMAPylDriverPy.m/src/driverAnalysis.py",...
Posted: 6 months 1 week ago
Updated: 2 months 18 hours ago

Title: Ascend desktop including lightweight is not working
Category: (none)
Resolution: Resolved
Description: Update: this is fixed.
Original Post:
Ascend Desktop, including...
Posted: 3 months 1 day ago
Updated: 3 months 15 hours ago

Title: Upcoming Expiration of Intel Compiler Licenses on Pitzer and State-wide Licensing
Category: (none)
Resolution: Resolved
Description: Old Intel compiler licenses on Pitzer and for state-wide access with versions 19.1.3 and earlier will no longer be available from March 31, 2025. We are currently...
Posted: 3 months 4 weeks ago
Updated: 3 months 1 day ago

Title: Core label on OnDemand app is incorrect
Category: OnDemand
Resolution: Resolved
Description: The core label on the OnDemand app is incorrectly displayed as '1', regardless of the requested number of cores for a job. While this label is incorrect, the job is still allocated the correct number...
Posted: 5 months 3 weeks ago
Updated: 5 months 3 days ago

Title: Core and Node labels on Classroom app are incorrect
Category: (none)
Resolution: Resolved
Description: The core and node labels on the Classroom app (class.osc.edu) are incorrectly displayed as '0', regardless of the requested number of cores for a job. While this label is incorrect, the job is still...
Posted: 5 months 3 weeks ago
Updated: 5 months 3 days ago
