Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
PyTorch jobs timeout and hanging GPU Resolved

We have observed that many PyTorch users frequently encounter random timeouts, which result in the termination of their jobs but leave the process running on the node....

1 year 2 months ago 8 months 1 week ago
A bug in the trigger that sends automated emails from client portal client portal Resolved

We deployed a new version to OSC Client Portal (my.osc.edu) at 3 pm Tuesday, July 9th, which introduced a bug in the trigger that sends automated emails to some OSC clients with the subject 'Your...

5 years 2 months ago 5 years 2 months ago
Spurious warnings about balance being exhausted client portal Resolved

Due to the price changes and some specifics about MyOSC, you may get warnings...

4 years 2 months ago 4 years 1 month ago
Rolling reboot of compute and login nodes of all clusters, starting from Wednesday morning, March 22, 2017 login, Owens, Ruby Resolved

4:56PM 3/28/2017 Update: The rolling reboots of all systems are completed. 

All compute nodes and login nodes of Owens, Oakley, and Ruby clusters will need to be rebooted...

7 years 6 months ago 7 years 5 months ago
OnDemand has NOT been working with external providers since 08/22 OnDemand Resolved

Update at 9:40AM August 23, 2017: This issue has been resolved.

Issue:

Users can NOT log in to OnDemand (ondemand.osc.edu)...

7 years 3 weeks ago 7 years 2 weeks ago
Lustre Updates filesystem Resolved

9/10/14 - We have not seen any additional crashes of the Lustre servers since making this change.

8/26/14 
- Lustre jobs are being accepted as of 10AM this...

10 years 2 weeks ago 10 years 3 days ago
ondemand outage OnDemand Resolved

Resolution notes

The problems with ondemand.osc.edu are now resolved.

Users will encounter errors using...

2 years 4 months ago 2 years 4 months ago
Stale File Handles on GPFS clients filesystem Resolved

OSC is experiencing some problems with the Project and Scratch filesystems that are resulting in some jobs seeing "stale file handles". We are investigating the problem and will provide updates as...

5 years 8 months ago 5 years 8 months ago
Large MPI job startup hang with mvapich2/2.3 and mvapich2/2.3.1 Owens, Pitzer, Software Resolved (workaround)

We have found that large MPI jobs may hang at startup with mvapich2/2.3 and mvapich2/2.3.1 (on any compiler dependency) due to a known bug that has been fixed in release 2...

4 years 10 months ago 2 years 4 months ago
Problems with MVAPICH2 Owens, Ruby, Software Resolved

Some MVAPICH2 MPI installations on Oakley, Ruby, and Owens, such as the default module mvapich2/2.2 as well as mvapich2/2.1, appear to have a bug that is triggered by certain programs. The...

8 years 7 months ago 2 years 4 months ago
