Known Issues

Title Cat. Res. Description Post Upd.
Ruby is offline Operations Resolved

The Ruby Transitional Cluster (only open to select research groups) is currently offline due to network problems. We expect it will return to service some time after the downtime.

6 months 4 weeks ago 2 months 1 week
8AM 9/11/13 - Brief network disruption to reboot a switch Network Resolved

At 8AM on September 11, 2013, we will be rebooting a network switch to replace a failed card in the switch. Network will be disrupted for 10 to 15 minutes while the work is done. Filesystem mounts... (Read more)

7 months 2 weeks ago 6 months 4 weeks
Brief disruption of GPFS on 8/28/2013 filesystem Resolved

On the morning of August 28th, 2013, we will briefly disrupt the GPFS filesystem to reboot servers. This is necessary to upgrade the GPFS system. The in-place upgrade should only briefly interrupt... (Read more)

7 months 4 weeks ago 7 months 4 weeks
Brief disruption on 8/1/2013 at 8AM Network Resolved

At 8AM on the morning of 8/1/2013, we will be replacing some faulty hardware in our network infrastructure. Unfortunately, this work cannot be delayed until the next downtime, and the replacement... (Read more)

8 months 3 weeks ago 8 months 2 weeks
Poor network performance on some filesystems filesystem Resolved

We are experiencing some network performance issues on a cluster of servers involved with providing GPFS and some project filesystems. GPFS appears to be functioning acceptably, but proj01, proj02... (Read more)

9 months 3 days ago 9 months 3 days
Network card re-seat Network Resolved

At 8AM on Tuesday, July 9th, 2013, we will be re-seating a network card in a switch at our operations center. A brief (~10 minute) outage may occur. Jobs will pause for the... (Read more)

9 months 2 weeks ago 9 months 2 weeks
6/4/13 Scheduled Downtime Outage Resolved

HPC systems are currently offline for scheduled maintenance. See osc.edu/n for more information.

10 months 3 weeks ago 10 months 3 weeks
Brief interruption of services for some users filesystem Resolved

Today, May 14, 2013, at 12:45PM, we will be temporarily removing one of the home directory servers from service to address some reliability issues. Users with home... (Read more)

11 months 2 weeks ago 11 months 2 weeks
Backups of /nfs/gpfs Backups Resolved

Changes to files on /nfs/gpfs may not be backed up during the following evening's backup, as would normally be expected. The backup software is attempting to recreate a full backup... (Read more)

1 year 12 hours ago 11 months 3 weeks
Brief interruption of batch services on 4/17 Batch Resolved

On April 17th, 2013, at roughly 2PM, we will be rebooting the batch server on the Oakley cluster. Running jobs will not be affected, but there will be a brief disruption in scheduling, as well as... (Read more)

1 year 6 days ago 1 year 6 days
