We are experiencing network performance issues on a cluster of servers that provide GPFS and some project filesystems. GPFS appears to be functioning acceptably, but proj01, proj02, proj03, proj08, and proj09 are not. Compute nodes attempting to write to these filesystems will see very slow write speeds.
The root cause has been identified as a damaged fiber optic cable. We will replace this cable and expect an outage of less than one minute for the affected hosts.
UPDATE: The cable has been replaced, and performance has returned to normal.
We will perform rolling reboots of the Oakley, Ruby, and Owens clusters starting Monday, February 5, 2018.