The Ohio Supercomputer Center (OSC) is experiencing an email delivery problem with several types of messages from MyOSC. 

OSC is preparing to update Slurm on its production systems to version 23.11.4 on March 27.

Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
Nsight GPU profiler not working due to DCGM conflict GPU, Infrastructure Resolved

UPDATE (Mar 15, 2023)

After the downtime on Mar. 14, 2023, OSC enabled a new Slurm option --gres=nsight. DCGM will be disabled on the nodes for the job with the Slurm option,...

1 year 3 weeks ago 1 year 2 weeks ago
Nvidia drivers on Oakley GPU Resolved

We upgraded the drivers for the Nvidia GPUs on all of our clusters during the downtime this week. Unfortunately, we are noticing some subtle problems with the GPUs on Oakley. We will be rolling...

7 years 5 months ago 5 years 9 months ago
Oakley and Owens queue issue Batch Resolved

We are experiencing a problem with the queuing system on Oakley and Owens that is delaying or preventing new jobs from running. Our systems staff is investigating.

6 years 3 months ago 6 years 3 months ago
Oakley login node down login Resolved

One of the Oakley login nodes is down. We are currently working on bringing it back online. SSH connections to oakley.osc.edu may time out. A workaround is to connect directly to oakley01.osc.edu...

10 years 2 weeks ago 10 years 2 weeks ago
Oakley login node down Resolved

One of the Oakley login...

9 years 2 months ago 9 years 1 month ago
Oakley login node instability Operations Resolved

Oakley login nodes are seeing some instability related to Lustre. We will reboot the nodes on Thursday, October 2nd 2014 to resolve the issue. If a login node crashes before then and we have the...

9 years 6 months ago 9 years 5 months ago
Oakley Login Node Issues Login Problems Resolved

Currently, users connecting via SSH to Oakley may receive "connection refused" or "connection failed" errors if they were not logged in before this occurred. Glenn is currently functioning...

9 years 9 months ago 9 years 9 months ago
Oakley login node problems Resolved

One of the Oakley login nodes (oakley01) has experienced some hardware failures and is temporarily out of service while repairs are ongoing.

Please limit your interactive use of the...

9 years 3 months ago 9 years 3 months ago
Oakley login nodes and ruby02 will not be accessible between 9:00-9:30am on 10/18/2016 login Resolved

We upgraded to RHEL 6.8 on both the Oakley and Ruby clusters during the October 12th downtime. Unfortunately, we are noticing an NFS problem that has been causing rsh or ssh sessions to hang on...

7 years 5 months ago 7 years 5 months ago
Occasional failures in file permissions filesystem Resolved

Users may experience occasional failures in file permissions with our filesystem. We've opened a case with the vendor for further investigation. If you get a 'permission denied' message when you...

6 years 1 week ago 2 years 3 months ago
