The Ohio Supercomputer Center (OSC) is experiencing an email delivery problem with several types of messages from MyOSC. 

OSC is preparing to update Slurm on its production systems to version 23.11.4 on March 27.

Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
Torque module on Oakley improperly setting environment variables Resolved

Intel library paths are being added to the environment variable LD_LIBRARY_PATH incorrectly when loading torque. Additionally, the Intel paths remain when the torque...

9 years 1 month ago 5 years 9 months ago
pdsh -j broken on Oakley Batch, system software Resolved

pdsh -j is broken on Oakley.  It was broken by updates during the September downtime.  We are currently working on resolving the issue.

Users who require...

8 years 3 months ago 5 years 9 months ago
Performance Regression of GPU Nodes on Ruby GPU, Ruby Resolved

We currently have a performance regression on Ruby's GPU nodes. Some of the GPU nodes on Ruby will remain in a power-saving state even after an application starts using them, resulting in...

7 years 3 months ago 5 years 9 months ago
Globus Online Transfers Failing Connectivity, filesystem, Web Services Resolved

We are currently investigating multiple reports that Globus Online transfers between OSC and other sites are failing. Transfers to/from Globus Personal Endpoints do not seem to be affected.

...

7 years 11 months ago 5 years 9 months ago
Rolling reboot of Owens cluster, starting from Monday, April 16, 2018 Owens Resolved

12:00 PM 5/7/2018 Update:

The rolling reboot of Owens has been completed. 

Posted on April 11, 2018, at 3:45...

5 years 11 months ago 5 years 10 months ago
Job failures on some rolling-rebooted nodes on Owens since April 16, 2018 Owens Resolved

3:35 PM 4/30/2018 Update:

The cause is that NFSv4.1 was not configured correctly after the OS on Owens was updated from RHEL 7.3 to 7.4. We re-rebooted the Owens compute nodes...

5 years 11 months ago 5 years 11 months ago
Rolling reboots of all clusters starting from Monday Feb 5, 2018 Batch, Owens, Ruby Resolved

Posted on Feb 22 at 1:25PM:

The rolling reboots have been completed. 

Posted on Jan 30, 2018 at 4:00PM:

We will have rolling reboots of...

6 years 1 month ago 6 years 1 month ago
Oakley and Owens queue issue Batch Resolved

We are experiencing a problem with the queuing system on Oakley and Owens that is delaying or preventing new jobs from running. Our systems staff is investigating.

 

6 years 3 months ago 6 years 3 months ago
Owens batch is down Owens Resolved

Updated at 9:07PM on Dec 20, 2017:

Owens batch was restored by updating Torque resource manager at 6:37pm Dec 19, 2017. 

Original Post at 4:45PM on Dec 19...

6 years 3 months ago 6 years 3 months ago
Rolling reboot of login nodes of clusters at 7:00AM Dec 19, 2017 login Resolved

We will have a rolling reboot of the login nodes of all clusters at 7:00AM Dec 19, 2017 for a GPFS version upgrade. It should be completed in a short period of time. If you encounter any login issues,...

6 years 3 months ago 6 years 3 months ago
