Update Tue Feb 10th 11:30am -- This issue is resolved.  

The changes we made to part of our batch software during the downtime introduced a bug. The bug affects some users when they submit jobs to our system.

Lustre is still offline. HPC systems are back up.


Day One of the scheduled downtime has been completed, and HPC operations have resumed. As planned, Lustre work will extend into Day Two. Jobs using /fs/lustre or $PFSDIR cannot run until this work is completed, but all other jobs can run.

UPDATE: Performance problems with Lustre have prevented us from bringing up the filesystem. We are working on a resolution.

UPDATE: Lustre returned to service on the afternoon of July 12th, 2014.
