Updated on December 7, 2018, at 03:48 PM:
OSC will replace the Ethernet switches in the Owens cluster starting Dec. 14. We do not expect the work to have any user-visible impact. Owens will have slightly reduced capacity on the following dates, as we temporarily shut down 2 or 3 racks each day to replace their switches.
- Friday, Dec. 14
- Monday, Dec. 17
- Friday, Jan. 11
- Tuesday, Jan. 15
The Oakley, Ruby, and Pitzer clusters are not affected.
Posted on September 14, 2018, at 03:04 PM:
We have been experiencing an issue with the Ethernet switches in the Owens cluster that could kill running jobs on Owens. We are monitoring the issue closely and reserving nodes that show switch errors for emergency maintenance. So far, our monitoring shows that no running job has been killed by this issue. The Oakley and Ruby clusters are not affected.
An Owens outage may be required to apply the permanent fix for this issue. We will provide updates as we learn more from the vendor.
We apologize for any inconvenience this may cause you. Please contact email@example.com if you have any questions.