02/24/17 3:50PM Update: All services have been restored, including:
8/24/16 3:57PM: All HPC systems are available, including:
- Oakley cluster for general access
- Ruby cluster for restricted access
- Owens cluster for early users
- Home directory and scratch file systems
- OnDemand and other web portals
- Project file system (/fs/project)
All jobs held before the downtime have been released by the batch scheduler. If your jobs are still held or you have any questions, please contact email@example.com.
15 July 2016, 5:00PM update: some additional issues we are facing
- We are experiencing periodic hangs of the GPFS client file system software used with the new storage environment. We have an open support case with the vendor, but no solution at this time. This may affect access to the /fs/project and /fs/scratch file systems. Transfer failures to these file systems through scp.osc.edu and sftp.osc.edu have been reported.
- Symlinks transferred from /nfs/gpfs to /fs/project are lost (fixed)
Update: Downtime completed at 6:30PM, June 7th.
The June 7th downtime is now slated to be completed at 6:30PM. The previous estimate was 5PM.
All systems and services will continue to be unavailable until that time.
Thank you for your cooperation.
ARMSTRONG is experiencing an unexpected outage. We are working on a resolution.
We need to reboot all of the login nodes on our production clusters to fix a minor issue from the downtime. We will conduct this reboot at 7:30AM on Thursday, February 13th, 2014. We expect systems to be available again by approximately 7:40AM. Running jobs will not be impacted.