We will perform a rolling reboot of the login and compute nodes of the Owens cluster starting Monday, April 16, 2018.

Users may have experienced job failures on the Owens cluster since April 16, 2018.

Poor network performance on some filesystems


We are experiencing some network performance issues on a cluster of servers involved with providing GPFS and some project filesystems. GPFS appears to be functioning acceptably, but proj01, proj02, proj03, proj08, and proj09 are not. Compute nodes attempting to write to these filesystems will see very slow write speeds.

The root cause has been identified as a damaged fiber optic cable. We will replace this cable and expect an outage of less than one minute for the affected hosts.

UPDATE: The cable has been replaced, and performance has returned to normal.