Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
NCCL hang on Ascend dual-GPU nodes Ascend, GPU, Software Resolved (workaround)

Users may encounter the following message and experience NCCL hangs if the first operation is a barrier when running multi-GPU training. We have identified...

1 month 2 weeks ago 1 month 1 week ago
Nvidia drivers on Oakley GPU Resolved

We upgraded the drivers for the Nvidia GPUs on all of our clusters during the downtime this week. Unfortunately, we are noticing some subtle problems with the GPUs on Oakley. We will be rolling...

8 years 9 months ago 7 years 1 month ago
openmpi/4.1.1 is deprecated Resolved

openmpi/4.1.1-hpcx will be removed on November 29th, 2022 due to an InfiniBand driver (MOFED) update. Please use the compatible, bug-fixed version 'openmpi/4.1.2-hpcx' to run ORCA or your MPI...

2 years 7 months ago 2 years 7 months ago
Brief disruption to external network, 2013/11/27 Connectivity Resolved

This maintenance was cancelled and will be rescheduled for a later date that has not yet been determined.

Between 12:01AM and 2:00AM Eastern on Wednesday, November 27th 2013, OARnet will be...

11 years 8 months ago 11 years 7 months ago
Occasional failures in file permissions filesystem Resolved

Users may experience occasional failures in file permissions with our filesystem. We've opened a case with the vendor for further investigation. If you get a 'permission denied' message when you...

7 years 3 months ago 3 years 7 months ago
OSC OnDemand is not responsive OnDemand Resolved

OSC OnDemand is currently not responsive, and we are investigating the problem. Please use other methods, such as ssh, to connect to OSC HPC systems.

We apologize for any inconvenience this may cause...

5 years 4 months ago 5 years 4 months ago
Core and Node labels on Classroom app are incorrect Resolved

The core and node labels on the Classroom app (class.osc.edu) incorrectly display as '0', regardless of the requested number of cores for a job. While this label is incorrect, the job is still...

6 months 3 days ago 5 months 2 weeks ago
Unscheduled GPFS Outage filesystem Resolved

As of 11:30PM on June 16th, we have removed the GPFS filesystem from service due to a number of hardware failures. At this point, further hardware failures would put a large portion of the entire...

10 years 3 weeks ago 10 years 3 weeks ago
Group membership discrepancies Account Management, client portal Resolved

Group changes may not always propagate to our HPC systems, even though they appear in the Client Portal (my.osc.edu).

Issue: if you are added to a project that is still in a REQUESTED...

6 years 3 days ago 5 years 10 months ago
Performance degradation of ESS filesystem filesystem Resolved

Updated 2:10PM November 25, 2021:

The...

3 years 7 months ago 3 years 7 months ago
