Updated at 12:40 PM on October 1, 2019:
OSC experienced an unexpected problem with the Scratch filesystem that produced "No space left on device" errors on all clusters beginning yesterday, September 30. The outage was caused by the rapid exhaustion of metadata storage space; the reason for the spike in metadata space utilization is not yet known. The Scratch filesystem has been returned to service after we worked with the vendor to implement a temporary workaround.
We are working with the vendor to find the root cause and a permanent fix for the underlying problem. We apologize for any inconvenience this may have caused. We will provide updates if there is a risk of further outages. Please contact oschelp@osc.edu if you have any questions.
Original Post:
We are aware of a problem with the Scratch filesystem that has produced "No space left on device" errors since around 3 PM yesterday, September 30. The cause is that the metadata subsystem is nearly full. This issue is causing users' jobs to fail.
We have opened a ticket with the vendor and are working to fix this issue as soon as possible. We apologize for any inconvenience this may cause and will keep you posted. Please contact oschelp@osc.edu if you have any questions.