Users may have been getting the following error message when trying to submit a PBS job using job arrays
We will have rolling reboots of the Oakley, Ruby and Owens clusters starting Monday, Feb 5, 2018.
We are experiencing a problem with the queuing system on Oakley and Owens that is delaying or preventing new jobs from running. Our systems staff is investigating.
The error "qstat: cannot connect to server oak-batch-test.osc.edu" appeared on Oakley from around 3:00 to 3:30PM on Nov 21, 2017.
Rolling reboot of Owens cluster, starting from 8:30AM Oct 30, 2017
We will have rolling reboots of Oakley and Ruby clusters starting from 8:30AM on Monday October 9, 2017.
We will have a rolling reboot of Owens starting from 9AM on Monday, September 11, 2017.
All PBS commands on Owens are working now
3:10PM 4/18/2017 Update: Rolling reboots on Owens have started to address this GPFS issue.
We have had issues with GPFS mounts on Owens Cluster since Friday afternoon, April 14, 2017. The affected nodes have been marked offline to be restarted or rebooted to fix this issue. Jobs may have been negatively impacted by this issue since April 14. If you experience any 'stale file handle' or file not found errors, please let us know.