Update Tue Feb 10th 11:30am -- This issue is resolved.  

A change we made to part of our batch software during the downtime introduced a bug. The bug affects some users when they submit jobs to our system.

Lustre, InfiniBand Operational and Being Monitored Closely


UPDATE: Most users should no longer see any issues with Lustre.

Again, please continue to notify OSC Help of any errors you see in job output. For example, you might see "IBV_EVENT_PORT_ERR" in your job output. Notifying the helpdesk quickly will help the Operations staff to reduce the effects of any issues.
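
If you would like to check your job output for this error before contacting OSC Help, a short script along these lines may help. This is only a sketch: the file pattern "*.o*" and the function name find_marker are assumptions made for illustration, not OSC-provided tooling, so adjust them to match where your job output is actually written.

```python
from pathlib import Path

# The error string quoted in this notice.
ERROR_MARKER = "IBV_EVENT_PORT_ERR"


def find_marker(paths, marker=ERROR_MARKER):
    """Return (filename, line_number, line) for every line containing marker."""
    hits = []
    for path in paths:
        # errors="replace" avoids failing on stray non-UTF-8 bytes in job output.
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if marker in line:
                    hits.append((str(path), number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Assumption: job output files sit in the current directory and match "*.o*".
    candidates = sorted(p for p in Path(".").glob("*.o*") if p.is_file())
    for name, number, line in find_marker(candidates):
        print(f"{name}:{number}: {line}")
```

Including the matching file names and lines in your message to OSC Help should make the report easier to act on.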

We apologize for the disruption. We work hard to avoid these incidents, but sometimes they do happen. We appreciate your patience.
