Oakley is an HP-built, Intel® Xeon® processor-based supercomputer, featuring more cores (8,328) on half as many nodes (694) as the center’s former flagship system, the IBM Opteron 1350 Glenn Cluster. The Oakley Cluster can achieve 88 teraflops, tech-speak for performing 88 trillion floating point operations per second, or, with acceleration from 128 NVIDIA® Tesla graphics processing units (GPUs), a total peak performance of just over 154 teraflops.



Photo: OSC Oakley HP Intel Xeon Cluster

Detailed system specifications:

  • 8,328 total cores
    • 12 cores/node and 48 gigabytes of memory/node
  • Intel Xeon x5650 CPUs
  • HP SL390 G7 Nodes
  • 128 NVIDIA Tesla M2070 GPUs
  • 873 GB of local disk space in '/tmp'
  • QDR IB Interconnect
    • Low latency
    • High throughput
    • High quality-of-service.
  • Theoretical system peak performance
    • 88.6 teraflops
  • GPU acceleration
    • Additional 65.5 teraflops
  • Total peak performance
    • 154.1 teraflops
  • Memory Increase
    • Increases memory from 2.5 gigabytes per core to 4.0 gigabytes per core.
  • Storage Expansion
    • Adds 600 terabytes of DataDirect Networks Lustre storage for a total of nearly two petabytes of available disk storage.
  • System Efficiency
    • 1.5x the performance of the former system at just 60 percent of current power consumption.
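
The headline figures in the list above are consistent with one another, and a few lines of shell arithmetic can confirm that. This is only a sketch: the node count (694) comes from the overview paragraph, every other input comes from the list, and the variable names are illustrative.

```shell
#!/bin/bash
# All inputs are taken from the specifications above; variable names are illustrative.
NODES=694
CORES_PER_NODE=12
MEM_PER_NODE_GB=48

# 694 nodes x 12 cores/node gives the total core count.
TOTAL_CORES=$(( NODES * CORES_PER_NODE ))

# 48 GB/node spread over 12 cores/node gives the memory per core.
MEM_PER_CORE_GB=$(( MEM_PER_NODE_GB / CORES_PER_NODE ))

# Shell arithmetic is integer-only, so awk handles the decimal teraflops sum
# (CPU peak 88.6 plus GPU acceleration 65.5).
TOTAL_PEAK_TF=$(LC_ALL=C awk 'BEGIN { printf "%.1f", 88.6 + 65.5 }')

echo "Total cores:     ${TOTAL_CORES}"
echo "Memory per core: ${MEM_PER_CORE_GB} GB"
echo "Total peak:      ${TOTAL_PEAK_TF} teraflops"
```

The three results match the 8,328 total cores, 4.0 gigabytes per core, and 154.1 teraflops quoted in the list.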

How to Connect

To connect to Oakley, ssh to oakley.osc.edu.

Batch Specifics

We have recently updated qsub to provide more information to clients about the job they just submitted, including both informational (NOTE) and ERROR messages. To better understand these messages, please visit the messages from qsub page.

Refer to the documentation for our batch environment to understand how to use PBS on OSC hardware. Some specifics you will need to know to create well-formed batch scripts:

  • Compute nodes on Oakley have 12 cores/processors per node (ppn). Parallel jobs must use ppn=12.
  • If you need more than 48 GB of RAM per node, you may run on the 8 large memory (192 GB) nodes on Oakley ("bigmem"). You can request a large memory node on Oakley by using the following directive in your batch script: #PBS -l nodes=XX:ppn=12:bigmem, where XX can be 1-8.
  • We have a single huge memory node ("hugemem"), with 1 TB of RAM and 32 cores. You can schedule this node by adding the following directive to your batch script: #PBS -l nodes=1:ppn=32. This node is only for serial jobs, and can only have one job running on it at a time, so you must request the entire node to be scheduled on it. In addition, there is a walltime limit of 48 hours for jobs on this node.
    Requesting fewer than 32 cores together with a memory requirement greater than 192 GB will not schedule the 1 TB node! Just request nodes=1:ppn=32 with a walltime of 48 hours or less, and the scheduler will put you on the 1 TB node.
  • GPU jobs may request any number of cores and either 1 or 2 GPUs. Request 2 GPUs per node by adding the following directive to your batch script: #PBS -l nodes=1:ppn=12:gpus=2
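
The requests in the list above can be put together into a batch script along the following lines. This is a minimal sketch, not an official OSC template: the job name, the one-hour walltime, the two-node count, and the fallback values used when the script runs outside the batch system are all assumptions chosen for illustration.

```shell
#!/bin/bash
# Job name and walltime are placeholder values, not OSC recommendations.
#PBS -N example_job
#PBS -l walltime=01:00:00
# Standard parallel request: whole nodes, 12 cores each.
#PBS -l nodes=2:ppn=12
#
# To target other node types, replace the nodes= line above with ONE of:
#   large memory (192 GB), XX = 1-8:   -l nodes=XX:ppn=12:bigmem
#   huge memory (1 TB), whole node:    -l nodes=1:ppn=32   (walltime at most 48:00:00)
#   two GPUs on one node:              -l nodes=1:ppn=12:gpus=2

# PBS sets PBS_O_WORKDIR to the submission directory; fall back to the
# current directory so the script also runs outside the batch system.
cd "${PBS_O_WORKDIR:-$PWD}"

# PBS_NODEFILE lists one line per allocated core.
if [ -n "${PBS_NODEFILE:-}" ]; then
    NPROCS=$(wc -l < "$PBS_NODEFILE")
else
    NPROCS=24   # assumed 2 nodes x 12 cores when run outside PBS
fi
echo "Running with ${NPROCS} cores"
```

Assuming the script is saved under a hypothetical name such as job.pbs, it would be submitted with qsub job.pbs.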

Using OSC Resources

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.


Oakley Changelog

Nov 30 2016 - 4:07pm

The LAMMPS 14May16 known issue wherein parallel lammps spawned too many threads has been fixed on all clusters. No user action is required; if a user had applied the OMP_NUM_THREADS workaround then it may be removed, but it will not cause problems if left in place. The corrected executables were made the defaults for module lammps/14may16 at these times:

Wed Nov 23 20:55:24 EST 2016

Mon Nov 28 21:46:24 EST 2016

Jul 29 2016 - 5:17pm

Amber 16 has been installed on the OSC clusters; usage is via the module amber/16. For information on available executables and installation details see the software page for Amber or the output of the module help command, e.g.: module help amber/16.  On August 15, 2016 Amber 16 will be made the default amber module.

Jun 28 2016 - 11:40pm

LAMMPS stable version 14May16 has been installed on Oakley. Usage is via the module lammps/14May16. For information on installation details, such as available packages, see the output of the module help command, e.g.: module help lammps/14May16

Jun 14 2016 - 3:01pm

All ACLs set within the Home Directory filesystem (/nfs/##) were lost during the 6/7 downtime.  This was caused by the migration to a new server that does not support the old POSIX ACLs.

Migrating the ACLs was not possible for two reasons: POSIX ACLs are not easily translatable to NFSv4 ACLs, and none of our tools supported such a migration.