We fixed the problem with both the Project and Scratch filesystems, and service has been restored.
We fixed the problem with both the Project and Scratch filesystems.
Updated at 2:30pm on Feb 1st:
The Scratch filesystem is back. OnDemand is also available now.
The Scratch filesystem is currently down. Users cannot access OSC OnDemand either. Please use other methods to connect to OSC systems. We are working to fix this issue and apologize for any inconvenience. Please contact OSC Help if you have any questions.
NFS outage on Thursday, Jan 17 from 7 am to 8 am
OSC is experiencing some problems with the Project and Scratch filesystems that are resulting in some jobs seeing "stale file handles". We are investigating the problem and will provide updates as we have them. Job scheduling is currently paused on all filesystems.
We are working with vendor engineers to resolve the issue, but you can currently still log in via SSH and access your files in your home directories. OnDemand logins will likely hang, and attempts to access files on Project or Scratch will likely fail.
All services are back.
ACEs of a file in the user's home directory will be lost using
Users may experience occasional file permission failures on our filesystem. We've opened a case with the vendor for further investigation. If you get a 'permission denied' message when you try to access a file or directory that you believe you have the right permissions for, please contact OSC Help.
3:10PM 4/18/2017 Update: Rolling reboots on Owens have started to address this GPFS issue.
We have had issues with GPFS mounts on the Owens cluster since Friday afternoon, April 14, 2017. The affected nodes have been marked offline so they can be restarted or rebooted to fix this issue. Jobs may have been negatively impacted by this issue since April 14. If you experience any 'stale file handle' or 'file not found' errors, please let us know.
1:00PM 4/6/2017 Update: The Scratch and Project file systems are back to normal service. Scheduling on the systems has resumed. We are still investigating the cause of this problem and will keep you updated as we learn more.
The Scratch and Project file systems are currently hung. Scheduling on all three clusters (Owens, Ruby, and Oakley) has been paused while we investigate this problem. We will update this page when we know more.