Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
PyTorch jobs timeout and hanging GPU Resolved

We have observed that many PyTorch users frequently encounter random timeouts that terminate their jobs but leave the process running on the node...

1 year 1 month ago 7 months 3 weeks ago
Negative Balance Emails client portal Resolved

Negative balance emails continue to be sent once an application is submitted.

To confirm whether or not you have truly submitted an application for additional resources and that you can...

5 years 3 months ago 4 years 12 months ago
Spurious warnings about balance being exhausted client portal Resolved

Due to the price changes and some specifics about MyOSC, you may get warnings...

4 years 1 month ago 4 years 4 weeks ago
Performance Regression of GPU Nodes on Ruby GPU, Ruby Resolved

We are currently seeing a performance regression on Ruby's GPU nodes. Some of the GPU nodes on Ruby will remain in a power-saving state even after an application starts using them, resulting in...

7 years 9 months ago 6 years 3 months ago
2/13/2014 0730 - Reboot of login nodes Outage Resolved

We need to reboot all of the login nodes on our production clusters to fix a minor issue from the downtime. We will be conducting this reboot at 7:30 AM on Thursday, February 13th, 2014. We expect...

10 years 6 months ago 10 years 6 months ago
Very little free space for metadata on the scratch storage /fs/scratch filesystem Resolved

Updated 15:30 October 19:

The issue of little space for metadata on scratch storage is resolved. If you have any questions, please contact...

2 years 10 months ago 2 years 10 months ago
Issue with submitting job array Batch, Owens Resolved

3:30 PM 5/10/2018 Original Post:

Users may have been getting the following error message when trying to submit a PBS job using job arrays:

qsub: submit error (Maximum number of...
6 years 3 months ago 2 years 8 months ago
LS-DYNA License problems on all clusters Software Resolved

We are experiencing problems with the LS-DYNA license server. We will resume working on restoring access to this software on October 24th, 2018.

==

It has been fixed and working...

5 years 10 months ago 5 years 10 months ago
Large MPI job startup hang with mvapich2/2.3 and mvapich2/2.3.1 Owens, Pitzer, Software Resolved (workaround)

We have found that large MPI jobs may hang at startup with mvapich2/2.3 and mvapich2/2.3.1 (with any compiler dependency) due to a known bug that has been fixed in release 2...

4 years 9 months ago 2 years 4 months ago
Estimated charging for serial jobs on Oakley is incorrect Batch Resolved

Currently, the estimated RU charge reported at the end of a serial job on Oakley is incorrect: it shows the charge for the entire node. Jobs are being charged the correct amount in the official...

8 years 11 months ago 6 years 2 months ago
