Supercomputers

We currently operate three major systems:

  • Owens Cluster, a 23,000+ core Dell Intel Xeon machine available to clients later in 2016
  • Ruby Cluster, a 4800-core HP Intel Xeon machine
    • 20 nodes have Nvidia Tesla K40 GPUs
    • One node has 1 TB of RAM and 32 cores, for large SMP style jobs.
  • Oakley Cluster, an 8,300+ core HP Intel Xeon machine
    • One in every 10 nodes has 2 Nvidia Tesla GPU accelerators
    • One node has 1 TB of RAM and 32 cores, for large SMP style jobs

Our clusters share a common environment, and we have several guides available.

OSC also provides more than 5 PB of storage, and another 5.5 PB of tape backup.

  • Learn how that space is made available to users, and how to best utilize the resources, in our storage environment guide.

System Notices are available online.

Finally, you can keep up to date with any known issues on our systems (and the available workarounds). An archive of resolved issues can be found here.

Oakley

TIP: Remember to check the menu to the right of the page for related pages with more information about Oakley's specifics.

Oakley is an HP-built, Intel® Xeon® processor-based supercomputer, featuring more cores (8,328) on half as many nodes (694) as the center’s former flagship system, the IBM Opteron 1350 Glenn Cluster. The Oakley Cluster can achieve 88 teraflops, tech-speak for performing 88 trillion floating point operations per second, or, with acceleration from 128 NVIDIA® Tesla graphics processing units (GPUs), a total peak performance of just over 154 teraflops.

 

Hardware

Detailed system specifications:

  • 8,328 total cores
    • 12 cores/node  & 48 gigabytes of memory/node
  • Intel Xeon x5650 CPUs
  • HP SL390 G7 Nodes
  • 128 NVIDIA Tesla M2070 GPUs
  • 873 GB of local disk space in '/tmp'
  • QDR IB Interconnect
    • Low latency
    • High throughput
    • High quality-of-service.
  • Theoretical system peak performance
    • 88.6 teraflops
  • GPU acceleration
    • Additional 65.5 teraflops
  • Total peak performance
    • 154.1 teraflops
  • Memory Increase
    • Increases memory from 2.5 gigabytes per core to 4.0 gigabytes per core.
  • Storage Expansion
    • Adds 600 terabytes of DataDirect Networks Lustre storage for a total of nearly two petabytes of available disk storage.
  • System Efficiency
    • 1.5x the performance of the former system at just 60 percent of its power consumption.

How to Connect

To connect to Oakley, ssh to oakley.osc.edu.
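
For example, using an ssh client from a terminal (substitute your own OSC username):

  ssh <username>@oakley.osc.edu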

Batch Specifics

We have recently updated qsub to provide more information to clients about the job they just submitted, including both informational (NOTE) and ERROR messages. To better understand these messages, please visit the messages from qsub page.

Refer to the documentation for our batch environment to understand how to use PBS on OSC hardware. Some specifics you will need to know to create well-formed batch scripts:

  • Compute nodes on Oakley have 12 cores/processors per node (ppn). Parallel jobs must use ppn=12.
  • If you need more than 48 GB of RAM per node, you may run on the 8 large memory (192 GB) nodes  on Oakley ("bigmem"). You can request a large memory node on Oakley by using the following directive in your batch script: nodes=XX:ppn=12:bigmem , where XX can be 1-8.

  • We have a single huge memory node ("hugemem"), with 1 TB of RAM and 32 cores. You can schedule this node by adding the following directive to your batch script: #PBS -l nodes=1:ppn=32 . This node is only for serial jobs, and can only have one job running on it at a time, so you must request the entire node to be scheduled on it. In addition, there is a walltime limit of 48 hours for jobs on this node.
Requesting less than 32 cores but a memory requirement greater than 192 GB will not schedule the 1 TB node! Just request nodes=1:ppn=32 with a walltime of 48 hours or less, and the scheduler will put you on the 1 TB node.
  • GPU jobs may request any number of cores and either 1 or 2 GPUs. Request 2 GPUs per node by adding the following directive to your batch script: #PBS -l nodes=1:ppn=12:gpus=2 (example directives combining these options are sketched after this list).
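
As a rough sketch, the directives above can be combined in a batch script header like the following (the walltime shown is only a placeholder; pick whichever nodes line matches your job):

  #PBS -l walltime=4:00:00
  #PBS -l nodes=1:ppn=12
  ## For a large memory (192 GB) node, use instead:   #PBS -l nodes=1:ppn=12:bigmem
  ## For the 1 TB huge memory node (48-hour limit):   #PBS -l nodes=1:ppn=32
  ## For a GPU node with 2 GPUs:                      #PBS -l nodes=1:ppn=12:gpus=2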

Using OSC Resources

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.


Batch Limit Rules

Memory Limit:

It is strongly suggested that you consider your job's memory use relative to the available per-core memory when requesting OSC resources. On Oakley, this equates to 4GB/core and 48GB/node.

If your job requests less than a full node ( ppn< 12), it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4GB/core).  For example, without any memory request ( mem=XX ), a job that requests  nodes=1:ppn=1  will be assigned one core and should use no more than 4GB of RAM, a job that requests  nodes=1:ppn=3  will be assigned 3 cores and should use no more than 12GB of RAM, and a job that requests  nodes=1:ppn=12  will be assigned the whole node (12 cores) with 48GB of RAM.  However, a job that requests  nodes=1:ppn=1,mem=12GB  will be assigned one core but have access to 12GB of RAM, and charged for 3 cores worth of Resource Units (RU).  See Charging for memory use for more details.
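
For instance, a minimal example of the partial-node request with an explicit memory limit described above:

  ## one core, but 12 GB of RAM; charged as 3 cores worth of RU
  #PBS -l nodes=1:ppn=1,mem=12GB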

A multi-node job ( nodes>1 ) will be assigned whole nodes with 48GB/node and charged for the whole nodes regardless of the ppn request. For example, a job that requests nodes=10:ppn=1 will be charged for 10 whole nodes (12 cores/node * 10 nodes, which is 120 cores worth of RU). A job that requests a large-memory node ( nodes=XX:ppn=12:bigmem, where XX can be 1-8) will be allocated entire large-memory nodes with 192GB of RAM each and charged for the whole node (12 cores worth of RU). A job that requests the huge-memory node ( nodes=1:ppn=32 ) will be allocated the entire huge-memory node with 1TB of RAM and charged for the whole node (32 cores worth of RU).

To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.

GPU Limit:

On Oakley, GPU jobs may request any number of cores and either 1 or 2 GPUs ( nodes=XX:ppn=XX:gpus=1 or gpus=2 ). The memory limit depends on the ppn request and follows the rules in Memory Limit.

Walltime Limit

Here are the queues available on Oakley:

NAME          MAX WALLTIME   MAX JOB SIZE   NOTES
Serial        168 hours      1 node
Longserial    336 hours      1 node         Restricted access
Parallel      96 hours       125 nodes
Longparallel  250 hours      230 nodes      Restricted access
Hugemem       48 hours       1 node         32 core with 1 TB RAM; nodes=1:ppn=32
Debug         1 hour         12 nodes

Job Limit

An individual user can have up to 128 concurrently running jobs and/or up to 1500 processors/cores in use. All the users in a particular group/project can among them have up to 192 concurrently running jobs and/or up to 1500 processors/cores in use. Jobs submitted in excess of these limits are queued but blocked by the scheduler until other jobs exit and free up resources.

A user may have no more than 1000 jobs submitted to each of the parallel and serial job queues. Jobs submitted in excess of this limit will be rejected.


Citation

For more information about citations of OSC, visit https://www.osc.edu/citation.

To cite Oakley, please use the following Archival Resource Key:

ark:/19495/hpc0cvqn

Please adjust this citation to fit the citation style guidelines required.

Ohio Supercomputer Center. 2012. Oakley Supercomputer. Columbus, OH: Ohio Supercomputer Center. http://osc.edu/ark:/19495/hpc0cvqn

Here is the citation in BibTeX format:

@misc{Oakley2012,
ark = {ark:/19495/hpc0cvqn},
howpublished = {\url{http://osc.edu/ark:/19495/hpc0cvqn}},
year  = {2012},
author = {Ohio Supercomputer Center},
title = {Oakley Supercomputer}
}

And in EndNote format:

%0 Generic
%T Oakley Supercomputer
%A Ohio Supercomputer Center
%R ark:/19495/hpc0cvqn
%U http://osc.edu/ark:/19495/hpc0cvqn
%D 2012

Here is an .ris file to better suit your needs. Please change the import option to .ris.


Queues and Reservations

Here are the queues available on Oakley. Please note that you will be routed to the appropriate queue based on your walltime and job size request.

Name          Nodes available               Max walltime   Max job size   Notes
Serial        Available minus reservations  168 hours      1 node
Longserial    Available minus reservations  336 hours      1 node         Restricted access
Parallel      Available minus reservations  96 hours       125 nodes
Longparallel  Available minus reservations  250 hours      230 nodes      Restricted access
Hugemem       1                             48 hours       1 node

"Available minus reservations" means all nodes in the cluster currently operational (this will fluctuate slightly), less the reservations listed below. To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.

In addition, there are a few standing reservations.

Name    Times              Nodes available   Max walltime   Max job size   Notes
Debug   8AM-6PM Weekdays   12                1 hour         12 nodes       For small interactive and test jobs.
GPU     ALL                62                336 hours      62 nodes       Small jobs not requiring GPUs from the serial and parallel queues will backfill on this reservation.
OneTB   ALL                1                 48 hours       1 node         Holds the 32 core, 1 TB RAM node aside for the hugemem queue.

 

Occasionally, reservations will be created for specific projects that will not be reflected in these tables.


Ruby

Ruby is unavailable for general access. Please follow this link to request access.
TIP: Remember to check the menu to the right of the page for related pages with more information about Ruby's specifics.
On 10/13/2016, Intel Xeon Phi coprocessors on Ruby were removed from service. Please contact OSC Help if you have any questions or want help getting access to alternative resources. 

Ruby, named after the Ohio native actress Ruby Dee, is the Ohio Supercomputer Center's newest cluster.  An HP-built, Intel® Xeon® processor-based supercomputer, Ruby provides almost the same amount of total computing power (~144 TF) as our former flagship system Oakley on less than half the number of nodes (240 nodes).  Twenty of Ruby's nodes are outfitted with NVIDIA® Tesla K40 accelerators. (Ruby previously featured two distinct sets of hardware accelerators: 20 nodes with NVIDIA® Tesla K40 GPUs and another 20 nodes with Intel® Xeon® Phi coprocessors.)

Hardware

Detailed system specifications:

  • 4800 total cores
    • 20 cores/node  & 64 gigabytes of memory/node
  • Intel Xeon E5 2670 V2 (Ivy Bridge) CPUs
  • HP SL250 Nodes
  • 20 Intel Xeon Phi 5110p coprocessors (removed from service on 10/13/2016)
  • 20 NVIDIA Tesla K40 GPUs
  • 2 NVIDIA Tesla K20X GPUs
    • Both installed in a single "debug" queue node
  • 1 TB of local disk space in '/tmp'
  • FDR IB Interconnect
    • Low latency
    • High throughput
    • High quality-of-service.
  • Theoretical system peak performance
    • 96 teraflops
  • NVIDIA GPU performance
    • 28.6 additional teraflops
  • Intel Xeon Phi performance
    • 20 additional teraflops
  • Total peak performance
    • ~144 teraflops

Ruby has one huge memory node.

  • 32 cores (Intel Xeon E5 4640 CPUs)
  • 1 TB of memory
  • 483 GB of local disk space in '/tmp'

Ruby is configured with two login nodes.

  • Intel Xeon E5-2670 (Sandy Bridge) CPUs
  • 16 cores/node & 128 gigabytes of memory/node

Connecting

To login to Ruby at OSC, ssh to the following hostname:

ruby.osc.edu 

You can either use an ssh client application or execute ssh on the command line in a terminal window as follows:

ssh <username>@ruby.osc.edu

From there, you have access to the compilers and other software development tools. You can run programs interactively or through batch requests. See the following sections for details.

File Systems

Ruby accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Oakley cluster. Full details of the storage environment are available in our storage environment guide.

Software Environment

The module system on Ruby is the same as on the Oakley system. Use  module load <package>  to add a software package to your environment. Use  module list  to see what modules are currently loaded and  module avail  to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use  module spider . By default, you will have the batch scheduling software modules, the Intel compiler and an appropriate version of mvapich2 loaded.
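
For example, a typical sequence of module commands (the package name is a placeholder):

  module list               # show currently loaded modules
  module avail              # show modules available to load
  module spider <package>   # search, including modules hidden by dependencies or conflicts
  module load <package>     # add the package to your environment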

You can keep up to date on the software packages that have been made available on Ruby by viewing the Software by System page and selecting the Ruby system.

Understanding the Xeon Phi

Guidance on what the Phis are, how they can be utilized, and other general information can be found on our Ruby Phi FAQ.

Compiling for the Xeon Phis

For information on compiling for and running software on our Phi coprocessors, see our Phi Compiling Guide.

Batch Specifics

We have recently updated qsub to provide more information to clients about the job they just submitted, including both informational (NOTE) and ERROR messages. To better understand these messages, please visit the messages from qsub page.

Refer to the documentation for our batch environment to understand how to use PBS on OSC hardware. Some specifics you will need to know to create well-formed batch scripts:

  • Compute nodes on Ruby have 20 cores/processors per node (ppn).  
  • If you need more than 64 GB of RAM per node you may run on Ruby's huge memory node ("hugemem").  This node has four Intel Xeon E5-4640 CPUs (8 cores/CPU) for a total of 32 cores.  The node also has 1TB of RAM.  You can schedule this node by adding the following directive to your batch script: #PBS -l nodes=1:ppn=32 .  This node is only for serial jobs, and can only have one job running on it at a time, so you must request the entire node to be scheduled on it.  In addition, there is a walltime limit of 48 hours for jobs on this node.
  • 20 nodes on Ruby are equipped with a single NVIDIA Tesla K40 GPU.  These nodes can be requested by adding gpus=1 to your nodes request, like so: #PBS -l nodes=1:ppn=20:gpus=1 (a complete example script is sketched after this list).
    • By default a GPU is set to the Exclusive Process and Thread compute mode at the beginning of each job.  To request the GPU be set to Default compute mode, add default to your nodes request, like so: #PBS -l nodes=1:ppn=20:gpus=1:default .
  • Ruby has 5 debug nodes which are specifically configured for short (< 1 hour) debugging type work.  These nodes have a walltime limit of 1 hour and are equipped with E5-2670 V1 CPUs with 16 cores per node. 
    • To schedule a debug node:
      #PBS -l nodes=1:ppn=16 -q debug
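
Putting these pieces together, a minimal sketch of a single-node GPU job script on Ruby (program name, walltime, and file paths are placeholders):

  #PBS -l walltime=1:00:00
  #PBS -l nodes=1:ppn=20:gpus=1
  #PBS -N my_gpu_job
  #PBS -j oe

  cd $TMPDIR
  cp $HOME/science/my_gpu_program .
  ./my_gpu_program > my_results
  cp my_results $HOME/science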

Using OSC Resources

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.

 


Technical Specifications

The following are technical specifications for Ruby.  We hope these may be of use to the advanced user.

  Ruby System (2014)
Number of Nodes 240 nodes
Number of CPU Sockets 480 (2 sockets/node)
Number of CPU Cores 4800 (20 cores/node)
Cores per Node 20 cores/node
Local Disk Space per Node ~800GB in /tmp, SATA
Compute CPU Specifications

Intel Xeon E5-2670 V2

  • 2.5 GHz 
  • 10 cores per processor
Computer Server Specifications

200 HP SL230

40 HP SL250 (for accelerator nodes)

Accelerator Specifications

20 NVIDIA Tesla K40 

  • 1.43 TF peak double-precision performance
  • 1 GK110B GPU 
  • 2880 CUDA cores
  • 12GB memory

20 Intel Xeon Phi 5110p 

  • 1.011 TF peak performance
  • 60 cores
  • 1.053 GHz
  • 8GB memory
Number of accelerator Nodes

40 total 

  • 20 Xeon Phi equipped nodes
  • 20 NVIDIA Tesla K40 equipped nodes
Total Memory ~16TB
Memory Per Node

64GB

Memory Per Core 3.2GB
Interconnect  FDR/EN Infiniband (56 Gbps)
Login Specifications

2 Intel Xeon E5-2670

  • 2.6 GHz
  • 16 cores
  • 132GB memory
Special Nodes

Huge Memory (1)

  • Dell PowerEdge R820 Server
  • 4 Intel Xeon E5-4640 CPUs
    • 2.4 GHz
  • 32 cores (8 cores/CPU)
  • 1 TB Memory

 


Programming Environment

Compilers

C, C++ and Fortran are supported on the Ruby cluster. Intel, PGI and GNU compiler suites are available. The Intel development tool chain is loaded by default. Compiler commands and recommended options for serial programs are listed in the table below. See also our compilation guide.

LANGUAGE INTEL EXAMPLE PGI EXAMPLE GNU EXAMPLE
C icc -O2 -xHost hello.c pgcc -fast hello.c gcc -O2 -march=native hello.c
Fortran 90 ifort -O2 -xHost hello.f90 pgf90 -fast hello.f90 gfortran -O2 -march=native hello.f90

Parallel Programming

MPI

The system uses the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect. MPI is a standard library for performing parallel processing using a distributed-memory model. For more information on building your MPI codes, please visit the MPI Library documentation.

Ruby uses a different version of mpiexec than Oakley. This is necessary because of changes in Torque. All OSC systems use the mpiexec command, but the underlying code on Ruby is mpiexec.hydra while the code on Oakley was developed at OSC. They are largely compatible, but a few differences should be noted.

Caution: There are many variations on mpiexec and mpiexec.hydra. Information found on non-OSC websites may not be applicable to our installation.
Note: Oakley has been updated to use the same mpiexec as Ruby.

The table below shows some commonly used options. Use mpiexec -help for more information.

OAKLEY (old)                     RUBY                             COMMENT
mpiexec                          mpiexec                          Same command on both systems
mpiexec a.out                    mpiexec ./a.out                  Program must be in path on Ruby, not necessary on Oakley.
-pernode                         -ppn 1                           One process per node
-npernode procs                  -ppn procs                       procs processes per node
-n totalprocs / -np totalprocs   -n totalprocs / -np totalprocs   At most totalprocs processes per node (same on both systems)
-comm none                       (omit)                           Omit for simple cases. If using $MPIEXEC_RANK, consider using pbsdsh with $PBS_VNODENUM.
-comm anything_else              (omit)                           Omit. Ignored on Oakley, will fail on Ruby.
(not available)                  -prepend-rank                    Prepend rank to output
-help                            -help                            Get a list of available options

mpiexec will normally spawn one MPI process per CPU core requested in a batch job. The -pernode option is not supported by mpiexec on Ruby, instead use -ppn 1 as mentioned in the table above.
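
For example, to run one MPI process per node on Ruby:

  mpiexec -ppn 1 ./a.out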

OpenMP

The Intel, PGI and gnu compilers understand the OpenMP set of directives, which give the programmer a finer control over the parallelization. For more information on building OpenMP codes on OSC systems, please visit the OpenMP documentation.
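
As a sketch, the usual OpenMP switches for each suite are shown below (verify against the man page for the compiler version you have loaded):

  icc -O2 -xHost -openmp hello.c           # Intel
  pgcc -fast -mp hello.c                   # PGI
  gcc -O2 -march=native -fopenmp hello.c   # GNU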

GPU Programming

To request the GPU node on Ruby, use nodes=1:ppn=20:gpus=1. For GPU programming with CUDA, please refer to CUDA documentation. Also refer to the page of each software to check whether it is GPU enabled.
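
A minimal compile sketch, assuming a CUDA module is available under the name cuda (check module avail and the CUDA documentation page for the exact name and versions):

  module load cuda       # assumed module name; confirm with module avail
  nvcc -O2 -o my_gpu_program my_gpu_program.cu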


Executing Programs

Batch Requests

Batch requests are handled by the TORQUE resource manager and Moab Scheduler as on the Oakley system. Use the qsub command to submit a batch request, qstat to view the status of your requests, and qdel to delete unwanted requests. For more information, see the manual pages for each command.

There are some changes for Ruby; they are listed here:

  • Ruby nodes have 20 cores per node, and 64 GB of memory per node. This is less memory per core than on Oakley.
  • Ruby will be allocated on the basis of whole nodes even for jobs using less than 20 cores.
  • The amount of local disk space available on a node is approximately 800 GB.
  • MPI Parallel Programs should be run with mpiexec, as on Oakley, but the underlying program is mpiexec.hydra instead of OSC's mpiexec. Type mpiexec --help for information on the command line options.

Example Serial Job

This particular example uses OpenMP.

  #PBS -l walltime=1:00:00
  #PBS -l nodes=1:ppn=20
  #PBS -N my_job
  #PBS -j oe

  cd $TMPDIR
  cp $HOME/science/my_program.f .
  ifort -O2 -openmp my_program.f
  export OMP_NUM_THREADS=20
  ./a.out > my_results
  cp my_results $HOME/science

Please remember that jobs on Ruby must use a complete node.

Example Parallel Job

    #PBS -l walltime=1:00:00
    #PBS -l nodes=4:ppn=20
    #PBS -N my_job
    #PBS -j oe

    cd $HOME/science
    mpif90 -O3 mpiprogram.f
    cp a.out $TMPDIR
    cd $TMPDIR
    mpiexec ./a.out > my_results
    cp my_results $HOME/science

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.


Queues and Reservations

Here are the queues available on Ruby. Please note that you will be routed to the appropriate queue based on your walltime and job size request.

Name      Nodes available               Max walltime   Max job size   Notes
Serial    Available minus reservations  168 hours      1 node
Parallel  Available minus reservations  96 hours       40 nodes
Hugemem   1                             48 hours       1 node         32 core with 1 TB RAM
Debug     5                             1 hour         2 nodes        16 core with 128GB RAM. For small interactive and test jobs. Use "-q debug" to request it.

"Available minus reservations" means all nodes in the cluster currently operational (this will fluctuate slightly), less the reservations. To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.

Occasionally, reservations will be created for specific projects.

Approximately half of the Ruby nodes are a part of client condo reservations. Only jobs of short duration are eligible to run on these nodes, and only when they are not in use by the condo clients. As a result, your job(s) may have to wait for eligible resources to come available while it appears that much of the cluster is idle.

Batch Limit Rules

Full Node Charging Policy

On Ruby, we always allocate whole nodes to jobs and charge for the whole node. If a job requests less than a full node (nodes=1:ppn<20), the job execution environment is what is requested (the job only has access to the number of cores given by the ppn request) with 64GB of RAM; however, the job will be allocated a whole node and charged for the whole node. A job that requests nodes>1 will be assigned whole nodes with 64GB/node and charged for the whole nodes regardless of the ppn request. A job that requests the huge-memory node (nodes=1:ppn=32) will be allocated the entire huge-memory node with 1TB of RAM and charged for the whole node (32 cores worth of RU).

To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.

Queue Default

Please keep in mind that if you submit a job with no node specification, the default is nodes=1:ppn=20, while if you submit a job with no ppn specified, the default is nodes=N:ppn=1.

Debug Node

Ruby has 5 debug nodes which are specifically configured for short (< 1 hour) debugging type work. These nodes have a walltime limit of 1 hour and are equipped with E5-2670 V1 CPUs with 16 cores per node. To schedule a debug node, use nodes=1:ppn=16 -q debug

GPU Node

On Ruby, 20 nodes are equipped with NVIDIA Tesla K40 GPUs (one GPU per node).  These nodes can be requested by adding gpus=1 to your nodes request (nodes=1:ppn=20:gpus=1). 

Walltime Limit

Here are the queues available on Ruby:

NAME      MAX WALLTIME   MAX JOB SIZE   NOTES
Serial    168 hours      1 node
Parallel  96 hours       40 nodes
Hugemem   48 hours       1 node         32 core with 1 TB RAM
Debug     1 hour         6 nodes        16 core with 128GB RAM

Job Limit

An individual user can have up to 40 concurrently running jobs and/or up to 800 processors/cores in use. All the users in a particular group/project can among them have up to 80 concurrently running jobs and/or up to 1600 processors/cores in use if the system is busy. The debug queue is limited to 1 job at a time per user. For Condo users, please contact OSC Help for more instructions.


Citation

For more information about citations of OSC, visit https://www.osc.edu/citation.

To cite Ruby, please use the following Archival Resource Key:

ark:/19495/hpc93fc8

Please adjust this citation to fit the citation style guidelines required.

Ohio Supercomputer Center. 2015. Ruby Supercomputer. Columbus, OH: Ohio Supercomputer Center. http://osc.edu/ark:/19495/hpc93fc8

Here is the citation in BibTeX format:

@article{Ruby2015,
ark = {ark:/19495/hpc93fc8},
url = {http://osc.edu/ark:/19495/hpc93fc8},
year  = {2015},
author = {Ohio Supercomputer Center},
title = {Ruby Supercomputer}
}

And in EndNote format:

%0 Generic
%T Ruby Supercomputer
%A Ohio Supercomputer Center
%R ark:/19495/hpc93fc8
%U http://osc.edu/ark:/19495/hpc93fc8
%D 2015

Here is an .ris file to better suit your needs. Please change the import option to .ris.


Request Access

Projects that would like to use the Ruby cluster will need to request access. This is because of the particulars of the Ruby environment, which include its size, GPUs, and scheduling policies.

Motivation

Access to Ruby is done on a case by case basis because:

  • It is a smaller machine than Oakley, and thus has limited space for users
    • Oakley has 694 nodes, while Ruby only has 240 nodes.
  • Its CPUs are less general-purpose, and therefore more consideration is required to get optimal performance
  • Scheduling is done on a per-node basis, and therefore jobs must scale to this level at a bare minimum 
  • Additional consideration is required to get full performance out of its GPUs

Good Ruby Workload Characteristics

Those interested in using Ruby should check that their work is well suited for it by using the following list.  Ideal workloads will exhibit one or more of the following characteristics:

  • Work scales well to large core counts
    • No single core jobs
    • Scales well past 2 nodes on Oakley
  • Needs access to Ruby specific hardware (GPUs)
  • Memory bound work
  • Software:
    • Supports GPUs
    • Takes advantage of:
      • Long vector length
      • Higher core count
      • Improved Memory Bandwidth

Applying for Access

Those who would like to be considered for Ruby access should send the following in a email to OSC Help:

  • Name
  • Project ID
  • Plan for using Ruby
  • Evidence of workload being well suited for Ruby

Owens

TIP: Remember to check the menu to the right of the page for related pages with more information about Owens' specifics.

OSC's Owens cluster, being installed in 2016, is a Dell-built, Intel® Xeon® processor-based supercomputer. 

Hardware

Detailed system specifications:

  • 824 Dell Nodes
  • Dense Compute
    • 648 compute nodes (Dell PowerEdge C6320 two-socket servers with Intel Xeon E5-2680 v4 (Broadwell, 14 cores, 2.40GHz) processors, 128GB memory)

  • GPU Compute (not yet available)

    • 160 ‘GPU ready’ compute nodes (Dell PowerEdge R730 two-socket servers with Intel Xeon E5-2680 v4 (Broadwell, 14 cores, 2.40GHz) processors, 128GB memory) – we’ll be adding NVIDIA’s next gen ‘Pascal’ GPUs when they come out much later this year

  • Analytics

    • 16 big memory nodes (Dell PowerEdge R930 four-socket servers with Intel Xeon E5-4830 v3 (Haswell 12 core, 2.10GHz) processors, 1,536GB memory, 12 x 2TB drives)

  • 23,392 total cores
    • 28 cores/node  & 128 gigabytes of memory/node
  • Mellanox EDR (100Gbps) Infiniband networking
  • Theoretical system peak performance
    • ~750 teraflops (CPU only)
  • Owens is configured with four login nodes:
    • Intel Xeon E5-2680 (Broadwell) CPUs
    • 28 cores/node and 256 GB of memory/node

How to Connect

To login to Owens at OSC, ssh to the following hostname:

owens.osc.edu 

You can either use an ssh client application or execute ssh on the command line in a terminal window as follows:

ssh <username>@owens.osc.edu

From there, you have access to the compilers and other software development tools. You can run programs interactively or through batch requests. See the following sections for details.

File Systems

Owens accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Oakley and Ruby clusters. Full details of the storage environment are available in our storage environment guide.

Home directories should be accessed through either the $HOME environment variable or the tilde notation (~username). Project directories are located at /fs/project. Scratch storage is located at /fs/scratch.

Owens will not have symlinks allowing use of the old file system paths. This is in contrast to Oakley and Ruby, which will have the symlinks.

Software Environment

The module system on Owens is the same as on the Oakley and Ruby systems. Use  module load <package>  to add a software package to your environment. Use  module list  to see what modules are currently loaded and  module avail  to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use  module spider . By default, you will have the batch scheduling software modules, the Intel compiler and an appropriate version of mvapich2 loaded.

You can keep up to date on the software packages that have been made available on Owens by viewing the Software by System page and selecting the Owens system.

Compiling Code to Use Advanced Vector Extensions (AVX2)

The Haswell and Broadwell processors that make up Owens support the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. AVX2 has the potential to speed up your code by a factor of 4 or more, depending on the compiler and options you would otherwise use.

In our experience, the Intel and PGI compilers do a much better job than the gnu compilers at optimizing HPC code.

With the Intel compilers, use -xHost and -O2 or higher. With the gnu compilers, use -march=native and -O3 . The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.
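
For example, compiling a serial C program with AVX2 enabled (hello.c is a placeholder source file):

  icc -O2 -xHost hello.c            # Intel
  gcc -O3 -march=native hello.c     # GNU
  pgcc -fast hello.c                # PGI (highest available instruction set by default)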

This advice assumes that you are building and running your code on Owens. The executables will not be portable.

See the Owens Programming Environment page for details.

Batch Specifics

Refer to the documentation for our batch environment to understand how to use PBS on OSC hardware. Some specifics you will need to know to create well-formed batch scripts:

  • The qsub syntax for node requests is the same on Owens as on Ruby and Oakley (contrary to earlier reports).
  • Most compute nodes on Owens have 28 cores/processors per node (ppn).  Big-memory (analytics) nodes have 24 processors per node.
  • Jobs on Owens may request partial nodes.  This is in contrast to Ruby but similar to Oakley.
  • (Owens info TBD) Ruby has 5 debug nodes which are specifically configured for short (< 1 hour) debugging type work.  These nodes have a walltime limit of 1 hour.
    • To schedule a debug node:
      #PBS -l nodes=1:ppn=16 -q debug

Using OSC Resources

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.


Owens Programming Environment

Compilers

C, C++ and Fortran are supported on the Owens cluster. Intel, PGI and GNU compiler suites are available. The Intel development tool chain is loaded by default. Compiler commands and recommended options for serial programs are listed in the table below. See also our compilation guide.

The Haswell and Broadwell processors that make up Owens support the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. AVX2 has the potential to speed up your code by a factor of 4 or more, depending on the compiler and options you would otherwise use.

In our experience, the Intel and PGI compilers do a much better job than the gnu compilers at optimizing HPC code.

With the Intel compilers, use -xHost and -O2 or higher. With the gnu compilers, use -march=native and -O3. The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Owens. The executables will not be portable.

LANGUAGE INTEL EXAMPLE PGI EXAMPLE GNU EXAMPLE
C icc -O2 -xHost hello.c pgcc -fast hello.c gcc -O3 -march=native hello.c
Fortran 90 ifort -O2 -xHost hello.f90 pgf90 -fast hello.f90 gfortran -O3 -march=native hello.f90
C++ icpc -O2 -xHost hello.cpp pgc++ -fast hello.cpp g++ -O3 -march=native hello.cpp

Parallel Programming

MPI

OSC systems use the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect. MPI is a standard library for performing parallel processing using a distributed-memory model. For more information on building your MPI codes, please visit the MPI Library documentation.

Parallel programs are started with the mpiexec command. For example,

mpiexec ./myprog

The program to be run must either be in your path or have its path specified.

The mpiexec command will normally spawn one MPI process per CPU core requested in a batch job. Use the -n and/or -ppn option to change that behavior.

The table below shows some commonly used options. Use mpiexec -help for more information.

MPIEXEC Option COMMENT
-ppn 1 One process per node
-ppn procs procs processes per node
-n totalprocs
-np totalprocs
At most totalprocs processes per node
-prepend-rank Prepend rank to output
-help Get a list of available options

 

Caution: There are many variations on mpiexec and mpiexec.hydra. Information found on non-OSC websites may not be applicable to our installation.

OpenMP

The Intel, PGI and gnu compilers understand the OpenMP set of directives, which support multithreaded programming. For more information on building OpenMP codes on OSC systems, please visit the OpenMP documentation.

GPU Programming

GPUs are not yet available on Owens.


Technical Specifications

The following are technical specifications for Owens.  We hope these may be of use to the advanced user.

  OWENS SYSTEM EARLY ACCESS (2016)
NUMBER OF NODES 340 nodes
NUMBER OF CPU SOCKETS 680 (2 sockets/node)
NUMBER OF CPU CORES 9520 (28 cores/node)
CORES PER NODE 28 cores/node
LOCAL DISK SPACE PER NODE ~850GB in /tmp, SATA
COMPUTE CPU SPECIFICATIONS

Intel Xeon E5-2680 v4

  • 2.4 GHz 
  • 14 cores per processor
COMPUTER SERVER SPECIFICATIONS

340 Dell PowerEdge C6320

160 Dell PowerEdge R730 (for accelerator nodes)

ACCELERATOR SPECIFICATIONS

NVIDIA's next gen 'Pascal' GPUs later this year

NUMBER OF ACCELERATOR NODES

160 total 

  • NVIDIA's next gen 'Pascal' GPUs later this year
TOTAL MEMORY ~67 TB
MEMORY PER NODE

128GB (1536 GB for Big Mem Nodes)

MEMORY PER CORE 9.1 GB
INTERCONNECT  Mellanox EDR Infiniband Networking (100Gbps)
LOGIN SPECIFICATIONS

2 Intel (Haswell)

  • 28 cores
SPECIAL NODES

 

Big Memory (16)

  • Dell PowerEdge R930 
  • 4 Intel Xeon E5-4830 v3
    • 12 Cores
    • 2.1 GHz
  • 48 cores (12 cores/CPU)
  • 1,536 GB Memory
  • 12 x 2 TB drive

 

The following are full technical specifications for Owens.

  Owens SYSTEM (2016)
NUMBER OF NODES 824 nodes
NUMBER OF CPU SOCKETS 1648 (2 sockets/node)
NUMBER OF CPU CORES 23,392 (28 cores/node)
CORES PER NODE 28 cores/node
LOCAL DISK SPACE PER NODE ~850GB in /tmp, SATA
COMPUTE CPU SPECIFICATIONS

Intel Xeon E5-2680 v4 (for compute)

  • 2.4 GHz 
  • 14 cores per processor
COMPUTER SERVER SPECIFICATIONS

648 Dell PowerEdge C6320

160 Dell PowerEdge R730 (for accelerator nodes)

ACCELERATOR SPECIFICATIONS

NVIDIA's next gen 'Pascal' GPUs later this year

NUMBER OF ACCELERATOR NODES

160 total 

  • NVIDIA's next gen 'Pascal' GPUs later this year
TOTAL MEMORY ~ 127 TB
MEMORY PER NODE

128 GB (1.5 TB for Big Mem Nodes)

MEMORY PER CORE 9.1 GB
INTERCONNECT  Mellanox EDR Infiniband Networking (100Gbps)
LOGIN SPECIFICATIONS

2 Intel (Haswell)

  • 28 cores
SPECIAL NODES

 

Big Memory (16)

  • Dell PowerEdge R930 
  • 4 Intel Xeon E5-4830 v3
    • 12 Cores
    • 2.1 GHz
  • 48 cores (12 cores/CPU)
  • 1.5 TB Memory
  • 12 x 2 TB Drive

 


Batch Limit Rules

Memory Limit:

It is strongly suggested that you consider your job's memory use relative to the available per-core memory when requesting OSC resources. On Owens, this equates to 4GB/core or 124GB/node.

If your job requests less than a full node ( ppn< 28 ), it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4GB/core).  For example, without any memory request ( mem=XX ), a job that requests  nodes=1:ppn=1  will be assigned one core and should use no more than 4GB of RAM, a job that requests  nodes=1:ppn=3  will be assigned 3 cores and should use no more than 12GB of RAM, and a job that requests  nodes=1:ppn=28  will be assigned the whole node (28 cores) with 124GB of RAM.  

Please be careful if you include memory request (mem=XX ) in your job. A job that requests  nodes=1:ppn=1,mem=12GB  will be assigned one core and have access to 12GB of RAM, and charged for 3 cores worth of Resource Units (RU).  However, a job that requests  nodes=1:ppn=5,mem=12GB  will be assigned 5 cores but have access to only 12GB of RAM, and charged for 5 cores worth of Resource Units (RU).  See Charging for memory use for more details

A multi-node job ( nodes>1 ) will be assigned whole nodes with 124 GB/node and charged for the whole nodes regardless of the ppn request. For example, a job that requests nodes=10:ppn=1 will be charged for 10 whole nodes (28 cores/node * 10 nodes, which is 280 cores worth of RU).  

A job that requests the huge-memory node ( nodes=1:ppn=48 ) will be allocated the entire huge-memory node with 1.5 TB of RAM and charged for the whole node (48 cores worth of RU).
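
For example, the corresponding batch directive for the huge-memory node:

  #PBS -l nodes=1:ppn=48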

To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.

Walltime Limit

Here are the queues available on Owens:

NAME           MAX WALLTIME   MAX JOB SIZE   NOTES
Serial         168 hours      1 node
Parallel       96 hours       27 nodes
Largeparallel  96 hours       81 nodes
Hugemem        96 hours       1 node         16 nodes in this class
Debug          1 hour         2 nodes        6 nodes in this class. Use "-q debug" to request it.

Job Limit

An individual user can have up to 256 concurrently running jobs and/or up to 3080 processors/cores in use. All the users in a particular group/project can among them have up to 384 concurrently running jobs and/or up to 3080 processors/cores in use. Jobs submitted in excess of these limits are queued but blocked by the scheduler until other jobs exit and free up resources.

A user may have no more than 1000 jobs submitted to each of the parallel and serial job queues. 


Citation

For more information about citations of OSC, visit https://www.osc.edu/citation.

To cite Owens, please use the following Archival Resource Key:

ark:/19495/hpc6h5b1

Please adjust this citation to fit the citation style guidelines required.

Ohio Supercomputer Center. 2016. Owens Supercomputer. Columbus, OH: Ohio Supercomputer Center. http://osc.edu/ark:/19495/hpc6h5b1

Here is the citation in BibTeX format:

@article{Owens2016,
ark = {ark:/19495/hpc6h5b1},
url = {http://osc.edu/ark:/19495/hpc6h5b1},
year  = {2016},
author = {Ohio Supercomputer Center},
title = {Owens supercomputer}
}

And in EndNote format:

%0 Generic
%T Owens supercomputer
%A Ohio Supercomputer Center
%R ark:/19495/hpc6h5b1
%U http://osc.edu/ark:/19495/hpc6h5b1
%D 2016

Here is an .ris file to better suit your needs. Please change the import option to .ris.


Migrating jobs from Oakley or Ruby to Owens

This page includes a summary of differences to keep in mind when migrating jobs from Oakley or Ruby to Owens.

Guidance for Oakley Users

Hardware Specifications

                   Owens (per node)                                                       Oakley (per node)
Most compute node  28 cores and 125GB of RAM                                              12 cores and 48GB of RAM
Large memory node  (none)                                                                 12 cores and 192GB of RAM (8 nodes in this class)
Huge memory node   48 cores and 1.5 TB of RAM, 12 x 2TB drives (16 nodes in this class)   32 cores and 1TB of RAM (1 node in this class)

File Systems

Owens accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Oakley cluster.

Home directories: Accessed through either the $HOME environment variable or the tilde notation (~username) on both systems.
  • Owens does NOT have symlinks allowing use of the old file system paths. Please modify your scripts with the new paths before you submit jobs to the Owens cluster.
  • Oakley has the symlinks allowing use of the old file system paths. No action is required on your part to continue using your existing job scripts on the Oakley cluster.
Project directories: Located at /fs/project
Scratch storage: Located at /fs/scratch

See the 2016 Storage Service Upgrades page for details. 

Software Environment

Owens uses the same module system as Oakley.

Use  module load <package>  to add a software package to your environment. Use  module list  to see what modules are currently loaded and  module avail  to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use  module spider .

You can keep up to date on the software packages that have been made available on Owens by viewing the Software by System page and selecting the Owens system.

Programming Environment

Like Oakley, Owens supports three compilers: Intel, PGI, and gnu. The default is Intel. To switch to a different compiler, use  module swap intel gnu  or  module swap intel pgi

Owens also uses the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect.

In addition, Owens supports the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. In our experience, the Intel and PGI compilers do a much better job than the gnu compilers at optimizing HPC code.

See the Owens Programming Environment page for details.

PBS Batch-Related Command

The qpeek command is not needed on Owens. 

On Oakley, a job’s stdout and stderr data streams, which normally show up on the screen, are written to log files. These log files are stored on a server until the job ends, so you can’t look at them directly. The  qpeek  command allows you to peek at their contents. If you used the PBS header line to join the stdout and stderr streams ( #PBS -j oe ), the two streams are combined in the output log.

On Owens, a job’s stdout and stderr data streams are written to log files stored in the current working directory, i.e. $PBS_O_WORKDIR . You will see the log files immediately after your job starts. 

Accounting

The Owens cluster will be charged at a rate of 1 RU per 10 core-hours.

The Oakley cluster will be charged at a rate of 1 RU per 20 core-hours.

Like Oakley, Owens will accept partial-node jobs and charge you for the number of cores proportional to the amount of memory your job requests.

Below is a comparison of job limits between Owens and Oakley:

  owens oakley
Per User Up to 256 concurrently running jobs and/or up to 3080 processors/cores in use  Up to 128 concurrently running jobs and/or up to 1500 processors/cores in use
Per group Up to 384 concurrently running jobs and/or up to 3080 processors/cores in use Up to 192 concurrently running jobs and/or up to 1500 processors/cores in use

 

Please see Queues and Reservations for Owens for more details.

Guidance for Ruby Users

Hardware Specifications

                   OWENS (per node)                                                       Ruby (per node)
Most compute node  28 cores and 125GB of RAM                                              20 cores and 64GB of RAM
Huge memory node   48 cores and 1.5 TB of RAM, 12 x 2TB drives (16 nodes in this class)   32 cores and 1TB of RAM (1 node in this class)

File Systems

Owens accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Ruby cluster.

Home directories: Accessed through either the $HOME environment variable or the tilde notation (~username) on both systems.
  • Owens does NOT have symlinks allowing use of the old file system paths. Please modify your scripts with the new paths before you submit jobs to the Owens cluster.
  • Ruby has the symlinks allowing use of the old file system paths. No action is required on your part to continue using your existing job scripts on the Ruby cluster.
Project directories: Located at /fs/project
Scratch storage: Located at /fs/scratch

See the 2016 Storage Service Upgrades page for details. 

Software Environment

Owens uses the same module system as Ruby.

Use  module load <package>  to add a software package to your environment. Use  module list  to see what modules are currently loaded and  module avail  to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use  module spider .

You can keep up to date on the software packages that have been made available on Owens by viewing the Software by System page and selecting the Owens system.

Programming Environment

Like Ruby, Owens supports three compilers: Intel, PGI, and gnu. The default is Intel. To switch to a different compiler, use  module swap intel gnu  or  module swap intel pgi

Owens also uses the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect.

In addition, Owens supports the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. In our experience, the Intel and PGI compilers do a much better job than the gnu compilers at optimizing HPC code.

See the Owens Programming Environment page for details.

PBS Batch-Related Command

The qpeek command is not needed on Owens. 

On Ruby, a job’s stdout and stderr data streams, which normally show up on the screen, are written to log files. These log files are stored on a server until the job ends, so you can’t look at them directly. The   qpeek  command allows you to peek at their contents. If you used the PBS header line to join the stdout and stderr streams ( #PBS -j oe ), the two streams are combined in the output log.

On Owens, a job’s stdout and stderr data streams are written to log files stored in the current working directory, i.e. $PBS_O_WORKDIR . You will see the log files immediately after your job starts. 

Accounting

The Owens cluster will be charged at a rate of 1 RU per 10 core-hours.

The Ruby cluster will be charged at a rate of 1 RU per 20 core-hours.

However, Owens will accept partial-node jobs and charge you for the number of cores proportional to the amount of memory your job requests. By contrast, Ruby only accepts full-node jobs and charges for the whole node. 

Below is a comparison of job limits between Owens and Ruby:

  OWENS Ruby
Per User Up to 256 concurrently running jobs and/or up to 3080 processors/cores in use  Up to 40 concurrently running jobs and/or up to 800 processors/cores in use
Per group Up to 384 concurrently running jobs and/or up to 3080 processors/cores in use Up to 80 concurrently running jobs and/or up to 1600 processors/cores in use

 

Please see Queues and Reservations for Owens for more details.

 


Queues and Reservations

Here are the queues available on Owens. Please note that you will be routed to the appropriate queue based on your walltime and job size request.

Name           Nodes available               Max walltime   Max job size   Notes
Serial         Available minus reservations  168 hours      1 node
Parallel       Available minus reservations  96 hours       27 nodes
Largeparallel  Available minus reservations  96 hours       81 nodes
Hugemem        16                            96 hours       1 node
Debug          8                             1 hour         2 nodes        For small interactive and test jobs. Use "-q debug" to request it.

"Available minus reservations" means all nodes in the cluster currently operational (this will fluctuate slightly), less the reservations listed below. To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.

 

Occasionally, reservations will be created for specific projects that will not be reflected in these tables.


Owens Early User Information

Early Access Period

Owens is expected to be available to all OSC clients on a later date to be announced. A small number of projects will be given access during the preceding several weeks to help us with testing and to provide feedback. Early access is by application only; the application deadline has passed.

Early access will be granted in several stages beginning August 22, 2016. Applicants will receive notification of their access date via ServiceNow.

During the early access period there will be no charges for Owens jobs. Charging will begin when Owens is opened up for general access. The rate for Owens will be 1 RU per 10 core-hours when it opens for general access.

System Instability – Warning

Please be aware that the system may go down with little or no warning during the early access period. If your work won’t tolerate this level of instability, we recommend that you use Oakley or Ruby instead.

Connecting to Owens

Early access to Owens will be granted to all members of selected projects. Access is controlled by the linux secondary group “owens.” If your project is selected for early access you will be added to this group.

ssh owens.osc.edu

Changes to qsub

The qsub syntax for node requests is the same on Owens as on Ruby and Oakley, contrary to earlier announcements.

Job Performance Reports

Note:  You should run performance reports on only a small number of moderate-size jobs. We have a limited number of licenses.

We are requesting that all early users on Owens provide OSC with performance reports for a sampling of their jobs. The reports are single-page html documents generated by Allinea's perf-report tool. They provide high-level summaries that you can use to understand and improve the performance of your jobs.

OSC staff will review this information to help us understand the overall performance of the system. We will also provide assistance to individual users and projects to improve job efficiency.

Generating a performance report requires just a simple and minimally invasive modification to your job script. In all cases you must load the allinea module:

module load allinea
Applications started with mpiexec/mpirun

If you normally run your application as

mpiexec <mpi args> <program> <program args>

you should run it like this:

perf-report --np=<num procs> --mpiargs="<mpi args>" <program> <program args>

The mpiargs argument can be omitted if you aren't passing arguments to mpiexec. The np argument is required and is the total number of MPI processes to be started.

Serial and threaded applications

If your application does not use MPI, you should run it like this:

perf-report <program> <program args>
If it doesn't work -- special cases

1.  If your program is statically linked you may need to compile with extra flags to allow perf-report to work. Contact oschelp@osc.edu for assistance.

2.  If you have an MPI program but you don't explicitly use mpiexec or mpirun, try this:

perf-report <program> <program args>

If it doesn't work, contact oschelp@osc.edu for assistance.

Retrieving your report

These commands will generate html and plain text files for the report, for example,  wavec_20p_2016-02-05_12-46.html . You can open the report in html format using

firefox wavec_20p_2016-02-05_12-46.html

Note:  If your job runs in $TMPDIR you'll need to add a line to your script to copy the performance report back to your working directory. You can specify the name and/or location of the report files using the "--output=" option.
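
As a sketch, a job-script fragment that writes the report directly back to the submission directory (the report filename and process count are placeholders):

  module load allinea
  cd $TMPDIR
  ## ... copy inputs and run as usual ...
  perf-report --np=4 --output=$PBS_O_WORKDIR/my_job_report.html ./a.out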

For more information, see:

https://www.osc.edu/documentation/software_list/allinea_performance_reports

http://www.allinea.com/product-documentation

Getting Support

Please send your support requests to oschelp@osc.edu rather than to individual staff members. You can help us out by formatting the subject of your email as follows:

[owens][usrname] Informative description including software package if applicable

Include details of the problem with job IDs, complete error messages, and commands you executed leading up to the problem.

We know you’re going to discover problems that we didn’t encounter during our testing. We appreciate your patience as we work to fix them.

Hardware Availability

Full hardware availability (except GPUs) is planned for November 1, 2016. The following hardware will be available during the early access period:

340 Broadwell nodes, each with 28 cores and 128GB memory

8 big memory Haswell nodes, each with 24 cores and 1.5TB memory

8 big memory / big disk Haswell nodes, each with 24 cores, 1.5TB memory, 24 x 2TB drives

2 login nodes, Haswell, 28 cores each

Note:  160 nodes are “GPU ready” but GPUs won’t be purchased until NVIDIA’s next generation Pascal GPUs are available.

Software Availability

The table below shows software that will be available at the start of the early access period. Installation of other software will be ongoing. The supercomputing software pages on the OSC website will be kept up to date as new software is installed.

If you find that software is missing or misconfigured, please report it as described above under "Getting Support."

Software included with the system (no module necessary)

Software   Version            Notes
gcc        4.8.5
cmake      2.8.11
python     2.7.5
git        1.8.3.1
gsl        1.15
gnuplot    4.6 patchlevel 2
java       1.8.0_91
boost      1.53.0
papi       5.2.0
libpng     1.5.13
cairo      1.14.2

Compilers and general applications

Software    Version      Notes
gcc         6.1.0        MPI not yet installed
Python      2.7.11       Anaconda bundle
Python      3.5.1        Anaconda bundle
R           3.3
Intel       16.0.3
MKL         11.3.3
intelmpi    5.1.3
ncview      2.1.7
paraview    4.4.0
MATLAB      R2015b
Totalview   2016.04.08
Allinea     6.0.6

Libraries (built for Intel 16.0 and gcc 4.8.5)

Software    Version    Notes
mvapich2    2.2        Currently 2.2rc1
fftw3       3.3.4
scalapack   2.0.2
hdf5        1.8.17     Parallel and serial installations
netcdf      4.3.3.1    Parallel and serial installations
pnetcdf     1.7.0
openmpi     1.10.3
metis       5.1.0
parmetis    4.0.3

Bioinformatics software

Software        Version       Notes
Bedtools        2.25.0
Bowtie1         1.1.2
Bowtie2         2.2.9
BWA             0.7.13
GATK            3.5
SAMtools        1.3.1
Picard          2.3.0
MuTect          1.1.4
STAR            2.5.2a
VarScan         2.4.1
Bcftools        1.3.1
VCFtools        0.1.14
Subread         1.5.0-p2
BamTools        2.2.2
eXpress         1.5.1
RNA-SeQC        1.1.8
SRA Toolkit     2.6.3
SnpEff          4.2
FASTX-Toolkit   0.0.14
GMAP            2016-05-25
Bam2fastq       1.1.0
Trimmomatic     0.36
MIRA            4.0.2
HOMER           4.8
STAR-Fusion     0.7.0
miRDeep2        2.0.0.8

Chemistry software

Software    Version   Notes
LAMMPS      14May16
Gromacs     5.1.2
NAMD        2.11
Gaussian    g09e01
Amber       16
Turbomole   7.1

Other applications

Software   Version   Notes
warp3d     17.6.1
OpenFOAM   2.2.2
