Over the past two weeks we have experienced Oakley login node crashes potentially caused by a Lustre bug.

Supercomputers

We currently operate three major systems:

  • Ruby Cluster, a 4,800 core Intel Xeon machine
    • 20 nodes have Intel Xeon Phi accelerators
    • 20 nodes have Nvidia Tesla K40 GPUs
    • One node has 1 TB of RAM and 32 cores, for large SMP style jobs
  • Oakley Cluster, an 8,300+ core HP Intel Xeon machine
    • One in every 10 nodes has 2 Nvidia Tesla GPU accelerators
    • One node has 1 TB of RAM and 32 cores, for large SMP style jobs
  • Glenn Cluster, a 3,500+ core IBM AMD Opteron machine

Our clusters share a common environment, and we have several guides available.

OSC also provides more than 2 PB of storage, and another 2 PB of tape backup.

  • Learn how that space is made available to users, and how to best utilize the resources, in our storage environment guide.

System Notices are available online.

Finally, you can keep up to date with any known issues on our systems (and the available workarounds). An archive of resolved issues can be found here.
