Update, Tue Feb 10th, 11:30am: This issue is resolved.

The changes we made to part of our batch software during the downtime introduced a bug. The bug is affecting some users when they submit jobs to our system.

Lustre is still offline. HPC systems are back up.

Resolution: Resolved

Day One of the scheduled downtime has been completed, and HPC operations have resumed. As planned, Lustre work will extend into Day Two. Jobs using /fs/lustre or $PFSDIR cannot run until this work is completed, but all other jobs can run.

UPDATE: Performance problems with Lustre have prevented us from bringing up the filesystem. We are working on a resolution.

UPDATE: Lustre returned to service on the afternoon of July 12th, 2014.
