The Ohio Supercomputer Center (OSC) is experiencing an email delivery problem with several types of messages from MyOSC. 

OSC is preparing to update Slurm on its production systems to version 23.11.4 on March 27.

Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
Rolling reboot of Owens cluster, starting from 9AM June 28, 2017 Owens Resolved

Update posted on July 7, 2017 at 2:00PM:

Rolling reboot of login and compute nodes of Owens cluster is completed. 

...
6 years 8 months ago 6 years 8 months ago
Lustre jobs suspended filesystem Resolved

The Lustre filesystem ($PFSDIR and /fs/lustre) crashed several times Friday evening (8/15). We have temporarily degraded this service while we work to isolate the actions that are triggering...

9 years 7 months ago 9 years 6 months ago
ORCA Bind to CORE Failure Software Resolved (workaround)

The default CPU binding for ORCA jobs can fail sporadically. The failure is almost immediate and produces a cryptic error message, e.g.:

...
10 months 3 weeks ago 10 months 3 weeks ago
A reboot of the NetApp as part of an upgrade, starting from Monday, November 19, 2018 Maintenance Resolved

Updated on 12:49 PM Nov...

5 years 4 months ago 5 years 4 months ago
Unable to unload Intel software stack Pitzer, Ruby, Software Resolved

Users may be unable to unload the Intel software stack via module rm intel after switching between Intel and other compilers. This is a known issue with current versions of Lmod on Pitzer and...

3 years 10 months ago 3 years 10 months ago
Incorrect RU Balances client portal Resolved

RESOLVED 2/20/2019

We deployed a new version of the Client Portal during our downtime on Tuesday, 2/5, and a bug was introduced.

The Client Portal (my.osc.edu) and OSCUsage...

5 years 1 month ago 5 years 1 week ago
GPFS errors on compute nodes filesystem Resolved

We've seen an increase in transient problems that result in compute nodes losing access to the GPFS file systems for ~5 minutes.

Any jobs running on these nodes accessing files on GPFS may...

3 years 3 months ago 2 years 3 months ago
libibumad.so.2 missing on Oakley Software Resolved

Update: We think this is fixed. Please submit a ticket if you encounter further problems.

As a result of updates made during yesterday's downtime, software built with mvapich2/...

7 years 5 months ago 7 years 5 months ago
System Downtime 9/29/13 Outage Resolved

OSC systems will be offline on September 29th, 2013 for maintenance. Please visit osc.edu/n for more information.

10 years 5 months ago 10 years 5 months ago
Missing shared library of some mvapich2 modules Owens, Pitzer Resolved

Update on Feb 25, 2022:

This issue is fixed. 

Original Post:

Users may encounter a missing shared library with some mvapich2 modules...

2 years 3 weeks ago 2 years 3 weeks ago
