Decommissioned Supercomputers
OSC has operated a number of supercomputer systems over the years. Here is a list of previous machines and their specifications.
The July 2014 HPC Tech Talk (Tuesday, July 22nd from 4-5PM) will feature a talk on the OSC Roadmap, covering the OSC business model and service catalog, the "Condo" pilot project, the Ruby cluster, and the FY15 capital budget. To get the WebEX information and add a calendar entry, go here. Slides are available below.
The April 2014 HPC Tech Talk (Tuesday, April 22nd from 4-5PM) will provide some brief OSC updates, have a user-driven Q&A session, and will close with an invited talk about MPI-3 from the MVAPICH developers at The Ohio State University. To get the WebEX information and add a calendar entry, go here. Slides are available below.
The March 2014 HPC Tech Talk (Tuesday, March 18th from 4-5PM) will provide some brief OSC updates, have a user-driven Q&A session, and will close with a live demonstration of OSC's OnDemand service. You can register for the WebEX session here. Slides are available below.
The February 2014 SUG HPC Tech Talk focused on using the NVIDIA GPUs for computational chemistry. Slides are attached.
Here are the queues available on Glenn. Please note that you will be routed to the appropriate queue based on your walltime and job size request.
Name | Nodes available | Max walltime | Max job size | Notes
---|---|---|---|---
Serial | Available minus reservations | 168 hours | 1 node |
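As a rough sketch (not an official job template), a batch script like the one below would fall within the serial queue limits shown above: one node and a walltime under 168 hours. The job name, core count, and executable are placeholders, and the exact directives you need depend on the cluster and your application.

```bash
#!/bin/bash
# Hypothetical PBS/Torque batch script sized to fit the serial queue
# limits above (one node, walltime no greater than 168 hours).
# Placeholder job name.
#PBS -N serial_example
# One node; the ppn (cores per node) value is a placeholder.
#PBS -l nodes=1:ppn=8
# Well under the 168-hour serial queue limit.
#PBS -l walltime=48:00:00
# Merge stdout and stderr into a single output file.
#PBS -j oe

# Run from the directory the job was submitted from.
cd $PBS_O_WORKDIR
# Placeholder executable.
./my_program
```

Submitted with `qsub`, a request like this would be routed to the serial queue automatically, since it asks for a single node and stays below the 168-hour walltime limit.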
Here are the queues available on Oakley. Please note that you will be routed to the appropriate queue based on your walltime and job size request.
Name | Nodes available | Max walltime | Max job size | Notes
---|---|---|---|---
Serial | Available minus reservations | 168 hours | 1 node |
A common problem on our systems is a user's job causing a node to run out of memory, or using more memory than its allocation when the node is shared with other jobs.
If a job exhausts both the physical memory and the swap space on a node, it causes the node to crash. With a parallel job, many nodes may crash. When a node crashes, OSC staff have to manually reboot and clean up the node. If other jobs were running on the same node, those users have to be notified that their jobs failed.
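One way to reduce this risk, sketched below under the assumption of a PBS/Torque-style scheduler, is to request memory explicitly in the batch script so the scheduler accounts for it when packing jobs onto shared nodes. The values shown are placeholders, not recommended settings.

```bash
#!/bin/bash
# Hypothetical batch script that states its memory needs up front so the
# job is only placed where that much memory is available (placeholder values).
#PBS -l nodes=1:ppn=4
# Total memory requested for the job.
#PBS -l mem=8gb
#PBS -l walltime=12:00:00

cd $PBS_O_WORKDIR
# Placeholder executable; if it grows past the requested memory, it should
# be rerun with a larger request rather than spilling into swap.
./my_memory_hungry_program
```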
SSHing directly to a compute node at OSC - even if that node has been assigned to you in a current batch job - and starting VNC is an "unsafe" thing to do. When your batch job ends (and the node is assigned to other users), stray processes will be left behind and negatively impact other users. However, it is possible to use VNC on compute nodes safely.
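A minimal sketch of the safe approach, assuming a TigerVNC-style `vncserver` is installed on the compute node and a VNC password has already been set with `vncpasswd`: start the VNC server from inside the batch job itself and shut it down before the job script exits, so nothing outlives the job. The display number, geometry, and walltime below are assumptions for illustration, not OSC-specific instructions.

```bash
#!/bin/bash
# Sketch: tie a VNC server's lifetime to a batch job so no stray
# processes remain on the node after the job ends.
#PBS -l nodes=1:ppn=1
#PBS -l walltime=04:00:00
#PBS -j oe

# Start a VNC server on display :1 (an assumption; in practice you would
# pick or detect a free display number).
vncserver :1 -geometry 1280x1024

# Keep the job alive while the session is in use, leaving a margin
# before the walltime expires so cleanup still runs.
sleep 3h 50m

# Shut the server down before the job ends.
vncserver -kill :1
```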
Ruby was named after the Ohio-born actress Ruby Dee. An HP-built, Intel® Xeon® processor-based supercomputer, Ruby provided almost the same total computing power (~125 TF, formerly ~144 TF with Intel® Xeon® Phi coprocessors) as our former flagship system Oakley on less than half the number of nodes (240 nodes). Ruby had 20 nodes outfitted with NVIDIA® Tesla K40 accelerators; it originally featured two distinct sets of hardware accelerators, with 20 nodes outfitted with NVIDIA® Tesla K40 GPUs and another 20 nodes featuring Intel® Xeon® Phi coprocessors.