Possible Owens Outage

Category: 
Resolution: Unresolved

Posted on September 14, 2018, at 03:04 PM:

We are experiencing an issue with the Ethernet switches in the Owens cluster that could kill running jobs on Owens. We are monitoring the issue closely and reserving nodes that show switch errors for emergency maintenance. So far, our monitoring shows that no running job has been killed by this issue. The Oakley and Ruby clusters are not affected.

An Owens outage may be required to apply the permanent fix for this issue. We will provide updates as we learn more from the vendor.

We apologize for any inconvenience this may cause you. Please contact oschelp@osc.edu if you have any questions.