The Ohio Supercomputer Center (OSC) is experiencing an email delivery problem with several types of messages from MyOSC. 

 OSC is preparing to update Slurm on its production systems to version 23.11.4 on March, 27. 

Known issues

Unresolved known issues

Known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available which may include using different software or hardware.

A known issue with Resolved Resolution state has been corrected.

Known Issues

Titlesort descending Category Resolution Description Posted Updated
Rolling reboots on owens and pitzer starting 18 Aug 2021 Batch, Connectivity, Maintenance Resolved

We will have rolling reboots of Owens and Pitzer cluster, including login and compute nodes, starting from 9am on August 18, 2021. The rolling reboot is for urgent security updates.

The... Read more

2 years 7 months ago 2 years 6 months ago
Ruby is offline Operations Resolved

The Ruby Transitional Cluster (only open to select research groups) is currently offline due to network problems. We expect it will return to service some time after the downtime.

10 years 6 months ago 10 years 1 month ago
Ruby Rolling Reboot Resolved

2015/02/16 RUBY Rolling Reboot starting Today

 

A rolling reboot is required on Ruby to update a critical... Read more

9 years 1 month ago 9 years 1 month ago
Running jobs requeued on all clusters Owens, Pitzer Resolved

The Slurm upgrades during rolling reboots of Ascend, Owens and Pitzer we performed today (Oct 25 2023) cause all running jobs on the systems requeued around 8:45am. You will not be billed for the... Read more

5 months 5 days ago 5 months 5 days ago
Scheduling suspended Batch Resolved

We have temporarily suspended scheduling due to some problems with the parallel scratch file system.

9 years 6 months ago 9 years 6 months ago
Scheduling temporarily suspended on Oakley Batch Resolved

We are migrating the batch scheduler on Oakley to a new virtual machine. In order to accomplish this, the scheduler will be temporarily offline for about four hours on December 16th. Running jobs... Read more

8 years 3 months ago 8 years 3 months ago
Scratch and Project are hung; schedulings have been paused Batch, filesystem Resolved

1:00PM 4/6/2017 Update:  The Scratch and Project file systems are back to normal service. Scheduling on systems are resumed. We are still investigating the causes to this problem... Read more

6 years 11 months ago 6 years 11 months ago
Scratch filesystem Errors on October 1st, 2019 filesystem Resolved

Updated on 12:40 PM Oct 1st, 2019: 

OSC had experienced an unexpected problem with the Scratch filesystem that gives errors "No space left on device" on all clusters since... Read more

4 years 6 months ago 4 years 6 months ago
Scratch filesystem is down filesystem, OnDemand Resolved

Updated on 2:30pm Feb 1st:

Scratch filesystem is back. OnDemand is also available now. 

Original Post:

Scratch filesystem is down now.... Read more

5 years 1 month ago 5 years 1 month ago
Security Vulnerability for GPFS filesystem Resolved

Update: The fix was deployed during May 19 Downtime. 

Clients are not able to use mm* commands to manipulate GPFS ACLs on most OSC systems, due to a security vulnerability... Read more

3 years 11 months ago 3 years 10 months ago

Pages