We are currently experiencing temporary instability on the Ascend login nodes.

A rolling reboot is in progress to address CVE-2026-23111 for all clusters, including Ascend, Cardinal, and Pitzer.

Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
Rolling reboots on all HPC systems starting Oct 31 2024 Owens, Pitzer Resolved

Updates on Nov 13 2024:

Pitzer is completed. 

Updates...

1 year 7 months ago 1 year 7 months ago
HWloc warning: Failed with: intersection without inclusion Ascend, Cardinal Resolved
(workaround)

When running MPI+OpenMP hybrid code with the Intel Classic Compiler and MVAPICH 3.0, you may encounter the following warning message from hwloc:

...
1 year 7 months ago 1 year 1 month ago
Handling full-node MPI warnings with MVAPICH 3.0 Ascend, Cardinal, Pitzer, Software Resolved
(workaround)

When running a full-node MPI job with MVAPICH 3.0, you may encounter the following warning message:

[][mvp_generate_implicit_cpu_mapping] WARNING: You appear to be...
1 year 7 months ago 4 months 1 week ago
Addressing CP2K 7.1 Memory Issues on Pitzer and Owens Clusters Owens, Pitzer Resolved
(workaround)

According to https://github.com/cp2k/cp2k/issues/1830 and user feedback, you may encounter Out-of-Memory (OOM) errors during long molecular...

1 year 6 months ago 1 year 1 month ago
Abaqus Parallel Job Failure with PMPI Due to Out-of-Memory (OOM) Error Cardinal Resolved
(workaround)

You may encounter the following error while running an Abaqus parallel job with PMPI:

Traceback (most recent call last):
 File "SMAPylModules/SMAPylDriverPy.m/src/driverAnalysis.py",...
1 year 5 months ago 1 year 1 month ago
- --gpus-per-task is not working Batch Resolved

Updated: This is fixed. 

Original Post:

After the recent Slurm upgrade, the option --gpus-per-task is currently not functioning as...

1 year 5 months ago 1 year 5 months ago
Core label on OnDemand app is incorrect OnDemand Resolved

The core label on the OnDemand app incorrectly displays as '1', regardless of the requested number of cores for a job. While this label is incorrect, the job is still allocated the correct number...

1 year 5 months ago 1 year 4 months ago
Core and Node labels on Classroom app are incorrect Resolved

The core and node labels on the Classroom app (class.osc.edu) incorrectly display as '0', regardless of the requested number of cores for a job. While this label is incorrect, the job is still...

1 year 5 months ago 1 year 4 months ago
Ansys OMP: System error #22: Invalid argument Cardinal Resolved
(workaround)

You may encounter the following error while running Ansys on Cardinal:

OMP: Error #100: Fatal system error detected.
OMP: System error #22: Invalid argument
forrtl: error (76): Abort...
1 year 5 months ago 1 year 1 month ago
LS-DYNA mpp-dyna Cardinal: Remote access error on mlx5_0:1, RDMA_READ Cardinal, Software Resolved
(workaround)

You may encounter the following error while running mpp-dyna jobs with multiple nodes:

[c0054:22206:0:22206] ib_mlx5_log.c:179  Remote access error on mlx5_0:1/IB (synd 0x13 vend 0x88...
1 year 4 months ago 12 months 4 days ago
