Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
Oakley and Owens queue issue Batch Resolved

We are experiencing a problem with the queuing system on Oakley and Owens that is delaying or preventing new jobs from running. Our systems staff is investigating.

8 years 4 months ago 8 years 4 months ago
HCOLL-related failures in OpenMPI applications Cardinal, Software Resolved (workaround)

Several applications using OpenMPI, including HDF5, Boost, Rmpi, ORCA, and CP2K, may fail with errors such as

mca_coll_hcoll_module_enable() coll_hcol: mca_coll_hcoll_save_coll_handlers...
1 year 7 months ago 1 year 1 week ago
Oakley login node problems Resolved

One of the Oakley login nodes (oakley01) has experienced some hardware failures and is temporarily out of service while repairs are ongoing.

Please limit your interactive use of the...

11 years 5 months ago 11 years 5 months ago
module spider/avail/show not showing MPI dependent modules Ruby Resolved

On Ruby, the commands:

  • module spider
  • module avail
  • module show...
11 years 2 weeks ago 10 years 7 months ago
ondemand gpu request error Nov 2021 Batch, OnDemand, Pitzer Resolved

When requesting an interactive session in OnDemand with GPU resources, users may see an error similar to "sbatch: error: Invalid generic resource (gres) specification"

...

4 years 5 months ago 4 years 5 months ago
Data on /fs/scratch is not accessible filesystem Resolved

Updated at 10:30 AM on July 3rd, 2019:

Data on /fs/scratch is accessible now. We are working with the vendor to find the root cause and apologize for any inconvenience. ...

6 years 10 months ago 6 years 10 months ago
MPI fails with UCX 1.18 Software Resolved (workaround)

After the downtime on August 19, 2025, users may encounter UCX errors such as:

UCX ERROR no active messages transport to <no debug data>: self/memory -...
8 months 3 weeks ago 7 months 1 week ago
Critical change about using $PFSDIR directory at OSC Batch Resolved

Starting from Thursday, Feb 2nd, the $PFSDIR directory on scratch (/fs/scratch) will no longer be created by the job prologue. For example, if you simply use the command cd $PFSDIR,...

9 years 3 months ago 9 years 3 months ago
MOE license server down Licensing Resolved

The MOE license server is experiencing an unknown issue and may be down. We are working to resolve the issue.

2 years 7 months ago 2 years 7 months ago
OnDemand with Safari Web Services Resolved

Currently, if you have popups blocked in Safari, some services in OSC OnDemand (most notably the HPC terminal) will silently fail to work. We are working on a solution, as well as a workaround for...

12 years 3 weeks ago 10 years 6 months ago
