OSC experienced errors with NFS services the morning of June 29 between 08:37 and 09:12 that may have caused

Known Issues

Title Cat. Res. Description Post Upd.
Lustre jobs suspended filesystem Resolved

The Lustre filesystem ($PFSDIR and /fs/lustre) crashed several times Friday evening (8/15). We have temporarily degraded this service while we work to isolate the actions that are triggering...

1 year 10 months ago 1 year 10 months
Armstrong offline until Noon Armstrong Resolved

Armstrong will be offline today until noon. In the meantime, contact OSCHelp (OSCHelp@osc.edu) for account assistance.

1 year 10 months ago 1 year 10 months
Lustre, InfiniBand Operational and Being Monitored Closely filesystem Resolved

UPDATE: Most users should no longer see any issues with Lustre.


Again, please continue to notify OSC Help of any errors you see in job output. For example, you might see "...

1 year 11 months ago 1 year 10 months
Issue with OnDemand 6:09 - 8:39 pm Resolved

OnDemand, epi accounting queries, the Viper DB, the Medline DB, the Eweld DB,...

1 year 10 months ago 1 year 10 months
my.osc.edu logins failing Account Management Resolved

Logins to my.osc.edu are failing. This is unrelated to our InfiniBand issue; a router change at OARnet is believed to be the cause. They are working on re-establishing the necessary routing.

1 year 11 months ago 1 year 11 months
Emergency InfiniBand Shutdown (All systems) Network Resolved

We have returned to service. It appears that we have resolved the networking issues enough to allow jobs to run safely. We will continue working with our vendors to fix any remaining hardware...

1 year 11 months ago 1 year 11 months
Lustre is still offline. HPC systems back up Maintenance Resolved

Day One of the scheduled downtime has been completed, and HPC operations have resumed. As planned, Lustre work will extend into Day Two. Jobs using /fs/lustre or $PFSDIR cannot run until this work...

1 year 11 months ago 1 year 11 months
Certain modules not accessible Software Resolved

Certain modules have not been working on all clusters since the downtime. We have reports specifically that Amber, Gaussian, and Turbomole are not working. We are working to resolve the issue, but...

1 year 11 months ago 1 year 11 months
Login Shell Issues on Oakley Account/Shell Resolved

UPDATE: The shells have all been switched back for affected users, and you can submit jobs normally again. Additionally, if you are still logged in and have the incorrect shell, logging back out...

2 years 1 week ago 2 years 1 week
Account changes temporarily suspended Account Management Resolved

We are still experiencing some account problems related to Thursday's issue. As a result, we have taken my.osc.edu offline and cannot process email changes or password resets, either via self-...

2 years 2 weeks ago 2 years 1 week
