We've been experiencing some instability on the clusters (particularly Cardinal and Ascend). 

Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
Oakley login node instability Operations Resolved

Oakley login nodes are seeing some instability related to Lustre. We will reboot the nodes on Thursday, October 2nd 2014 to resolve the issue. If a login node crashes before then and we have the...

10 years 7 months ago 10 years 6 months ago
myosc outage - May 17, 2021 client portal Resolved

Resolution: Access to myosc was restored.

myosc is currently unavailable.

3 years 12 months ago 3 years 12 months ago
Rolling reboots of Owens and Pitzer, starting from Tuesday, Jan 22, 2019 Batch, login, Owens Resolved

6 years 4 months ago 6 years 3 months ago
Singularity: failed to run a container directly or pull an image from Singularity or Docker hub Software Resolved (workaround)

You might encounter an error while running a container directly from a hub:

[pitzer-login01]$ apptainer run shub://vsoch/hello-world
Progress |===================================| 100.0%...
2 days 21 hours ago 2 days 21 hours ago
Globus Online Transfers Failing Connectivity, filesystem, Web Services Resolved

We are currently investigating multiple reports that Globus Online transfers between OSC and other sites are failing. Transfers to/from Globus Personal Endpoints do not seem to be affected.

9 years 1 month ago 6 years 11 months ago
Possible performance degradation after August 9th's downtime filesystem Resolved

Updates on May 20 2023:

verbsRDMA is enabled on Pitzer. 

Updates on Dec 14 2022:

verbsRDMA was enabled on Owens during the December 13 downtime...

2 years 9 months ago 1 year 11 months ago
Network card re-seat Network Resolved

At 8AM on Tuesday, July 9th 2013, we will be re-seating a network card in a switch at our operations center. A brief (~10 minute) outage may occur. Jobs will pause for the...

11 years 10 months ago 11 years 10 months ago
GPFS problems on Owens filesystem Resolved

Owens is experiencing a disruption of GPFS availability. At about 4:17PM today (January 6th), OSC monitoring noticed a problem with mounts of Project on the Owens supercomputer. Jobs may have been...

5 years 4 months ago 5 years 4 months ago
Rolling reboot of Oakley and Ruby clusters, starting from 8:30AM October 9, 2017 Batch, login, Ruby Resolved

Updates on 1:00PM October 16, 2017: 

The rolling reboots of Oakley and Ruby are completed. 

7 years 7 months ago 7 years 7 months ago
Schrodinger license check in issue Licensing Resolved

Schrödinger is an application that uses a FlexNet license server. To run a job, the application needs to check out licenses from the server and check them back in once the job is completed. However...

10 months 2 weeks ago 5 months 1 week ago
