This email notifies you that one or more of your jobs were running on a compute node that crashed due to a hardware problem.
Failure of job(s) 919137 due to a hardware problem at OSC
OSC Help <OSCHelp@osc.edu>
Your job failed because of the hardware crash, not because of anything in the job itself. You should resubmit the job.
These emails are sent by a systems administrator after a node crashes.
We don’t have a mechanism to turn off these emails. If they really bother you, contact OSC Help and we’ll try to accommodate you.
Hardware crashes are quite rare, and in most cases there’s nothing you can do to prevent them. Certain types of bus errors on Glenn correlate strongly with certain applications, which suggests that they’re not really hardware errors. If you encounter this type of error, you may be advised to use Oakley rather than Glenn.