Technical Support


OSC Help consists of technical support and consulting services for OSC's high performance computing resources. OSC Help is staffed by members of OSC's HPC Client Services group.

Before contacting OSC Help, please check to see if your question is answered in either the FAQ or the Knowledge Base. Many of the questions asked by both new and experienced OSC users are answered in these web pages.

If you still cannot solve your problem, please do not hesitate to contact OSC Help:

Toll Free: (800) 686-6472
Local: (614) 292-1800
Email: oschelp@osc.edu
Submit your issue online

OSC Help hours of operation:

Level 1 support is available 24x7x365
Level 2 advanced support is available Monday through Friday, 9 am - 5 pm, except OSU holidays

OSC users also have the ability to directly impact OSC operational decisions by participating in the Statewide Users Group. Activities include managing the allocation process and advising on software licensing and hardware acquisition.

We recommend following HPCNotices on Twitter to get up-to-the-minute information on system outages and important operations-related updates.

HPC Changelog

Changes to HPC systems are listed below, optionally filtered by system.

MVAPICH2 version 2.3 modules modified on Owens

Replace MV2_ENABLE_AFFINITY=0 with MV2_CPU_BINDING_POLICY=hybrid.
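
For example, if an existing Owens job script sets the old variable, the change might look like the following sketch (the surrounding script is illustrative):

  # Old setting (no longer recommended with the updated MVAPICH2 2.3 modules)
  # export MV2_ENABLE_AFFINITY=0

  # New setting: let MVAPICH2 manage hybrid MPI+OpenMP binding
  export MV2_CPU_BINDING_POLICY=hybrid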


Supercomputers

We currently operate three major systems:

  • Owens Cluster, a 23,000+ core Dell Intel Xeon machine
  • Ruby Cluster, a 4800-core HP Intel Xeon machine
    • 20 nodes have Nvidia Tesla K40 GPUs
    • One node has 1 TB of RAM and 32 cores, for large SMP style jobs.
  • Pitzer Cluster, a 10,500+ core Dell Intel Xeon machine

Our clusters share a common environment, and we have several guides available.

OSC also provides more than 5 PB of storage, and another 5.5 PB of tape backup.

  • Learn how that space is made available to users, and how to best utilize the resources, in our storage environment guide.

Finally, you can keep up to date with any known issues on our systems (and the available workarounds). An archive of resolved issues can be found here.


Ruby

Ruby is unavailable for general access. Please follow this link to request access.
TIP: Remember to check the menu to the right of the page for related pages with more information about Ruby's specifics.
On 10/13/2016, Intel Xeon Phi coprocessors on Ruby were removed from service. Please contact OSC Help if you have any questions or need help getting access to alternative resources. 

Ruby is named after the Ohio native actress Ruby Dee. An HP-built, Intel® Xeon® processor-based supercomputer, Ruby provides almost the same total computing power (~125 TF, formerly ~144 TF with the Intel® Xeon® Phi coprocessors) as our former flagship system Oakley on less than half the number of nodes (240 nodes). Twenty of Ruby's nodes are outfitted with NVIDIA® Tesla K40 accelerators (Ruby previously featured two distinct sets of hardware accelerators: 20 nodes with NVIDIA® Tesla K40 accelerators and another 20 nodes with Intel® Xeon® Phi coprocessors).


Hardware

Detailed system specifications:

  • 4800 total cores
    • 20 cores/node  & 64 gigabytes of memory/node
  • Intel Xeon E5 2670 V2 (Ivy Bridge) CPUs
  • HP SL250 Nodes
  • 20 Intel Xeon Phi 5110p coprocessors (removed from service on 10/13/2016)
  • 20 NVIDIA Tesla K40 GPUs
  • 2 NVIDIA Tesla K80 GPUs 
    • Both installed in a single "debug" queue node
  • 850 GB of local disk space in '/tmp'
  • FDR IB Interconnect
    • Low latency
    • High throughput
    • High quality-of-service.
  • Theoretical system peak performance
    • 96 teraflops
  • NVIDIA GPU performance
    • 28.6 additional teraflops
  • Intel Xeon Phi performance
    • 20 additional teraflops
  • Total peak performance
    • ~125 teraflops

Ruby has one huge memory node.

  • 32 cores (Intel Xeon E5 4640 CPUs)
  • 1 TB of memory
  • 483 GB of local disk space in '/tmp'

Ruby is configured with two login nodes.

  • Intel Xeon E5-2670 (Sandy Bridge) CPUs
  • 16 cores/node & 128 gigabytes of memory/node

How to Connect

  • SSH Method

To login to Ruby at OSC, ssh to the following hostname:

ruby.osc.edu 

You can either use an ssh client application or execute ssh on the command line in a terminal window as follows:

ssh <username>@ruby.osc.edu

You may see a warning message including an SSH key fingerprint. Verify that the fingerprint in the message matches one of the SSH key fingerprints listed here, then type yes.

From there, you are connected to a Ruby login node and have access to the compilers and other software development tools. You can run programs interactively or through batch requests. We use control groups on login nodes to keep the login nodes stable. Please use batch jobs for any compute-intensive or memory-intensive work. See the following sections for details.

  • OnDemand Method

You can also log in to Ruby at OSC with our OnDemand tool. The first step is to log in to OnDemand. Once logged in, you can access Ruby by clicking on "Clusters" and then selecting ">_Ruby Shell Access".

Instructions on how to connect to OnDemand can be found at the OnDemand documentation page.

File Systems

Ruby accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Oakley Cluster. Full details of the storage environment are available in our storage environment guide.

Software Environment

The module system on Ruby is the same as on the Oakley system. Use  module load <package>  to add a software package to your environment. Use  module list  to see what modules are currently loaded and  module avail  to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use  module spider . By default, you will have the batch scheduling software modules, the Intel compiler and an appropriate version of mvapich2 loaded.
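
For example, a typical sequence for inspecting and adjusting your environment might look like the following sketch (the package name in the  module spider  line is illustrative):

  module list               # show currently loaded modules
  module avail              # list modules available to load
  module spider hdf5        # search for a module hidden by dependencies or conflicts (illustrative name)
  module swap intel gnu     # switch from the default Intel compiler to GNU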

You can keep up to date on the software packages that have been made available on Ruby by viewing the Software by System page and selecting the Ruby system.

Understanding the Xeon Phi

Guidance on what the Phis are, how they can be utilized, and other general information can be found on our Ruby Phi FAQ.

Compiling for the Xeon Phis

For information on compiling for and running software on our Phi coprocessors, see our Phi Compiling Guide.

Batch Specifics

We have recently updated  qsub  to provide more information to clients about the job they just submitted, including both informational (NOTE) and ERROR messages. To better understand these messages, please visit the messages from  qsub  page.

Refer to the documentation for our batch environment to understand how to use PBS on OSC hardware. Some specifics you will need to know to create well-formed batch scripts:

  • Compute nodes on Ruby have 20 cores/processors per node (ppn).  
  • If you need more than 64 GB of RAM per node you may run on Ruby's huge memory node ("hugemem").  This node has four Intel Xeon E5-4640 CPUs (8 cores/CPU) for a total of 32 cores.  The node also has 1TB of RAM.  You can schedule this node by adding the following directive to your batch script: #PBS -l nodes=1:ppn=32 .  This node is only for serial jobs, and can only have one job running on it at a time, so you must request the entire node to be scheduled on it.  In addition, there is a walltime limit of 48 hours for jobs on this node.
  • 20 nodes on Ruby are equipped with a single NVIDIA Tesla K40 GPU each.  These nodes can be requested by adding gpus=1 to your nodes request, like so: #PBS -l nodes=1:ppn=20:gpus=1  (see the sketch after this list).
    • By default a GPU is set to the Exclusive Process and Thread compute mode at the beginning of each job.  To request the GPU be set to Default compute mode, add default to your nodes request, like so: #PBS -l nodes=1:ppn=20:gpus=1:default .
  • Ruby has 4 debug nodes (2 non-GPU nodes, as well as 2 GPU nodes, with 2 GPUs per node), which are specifically configured for short (< 1 hour) debugging type work.  These nodes have a walltime limit of 1 hour.  These nodes are equipped with E5-2670 V1 CPUs with 16 cores per node. 
    • To schedule a non-GPU debug node:
      #PBS -l nodes=1:ppn=16 -q debug
    • To schedule a GPU debug node:
      #PBS -l nodes=1:ppn=16:gpus=2 -q debug
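
Putting the directives above together, a minimal GPU batch script for Ruby might look like the following sketch (the module name  cuda  and the program  gpu_prog  are illustrative; check  module avail  for the exact module name):

  #PBS -l walltime=1:00:00
  #PBS -l nodes=1:ppn=20:gpus=1
  #PBS -N gpu_job
  #PBS -j oe

  cd $TMPDIR
  module load cuda                  # illustrative module name
  cp $HOME/science/gpu_prog .
  ./gpu_prog > gpu_results
  cp gpu_results $HOME/science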

Using OSC Resources

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.

 


Technical Specifications

The following are technical specifications for Ruby. 

  Ruby System (2014)
Number of Nodes 240 nodes
Number of CPU Sockets 480 (2 sockets/node)
Number of CPU Cores 4800 (20 cores/node)
Cores per Node 20 cores/node (32 cores/node for Huge Mem Node)
Local Disk Space per Node ~850GB in /tmp, SATA
Compute CPU Specifications

Intel Xeon E5-2670 V2 (Ivy Bridge) for compute

  • 2.5 GHz 
  • 10 cores per processor
Computer Server Specifications

200 HP SL230

40 HP SL250 (for accelerator nodes)

Accelerator Specifications

20 NVIDIA Tesla K40 

  • 1.43 TF peak double-precision performance
  • 1 GK110B GPU 
  • 2880 CUDA cores
  • 12GB memory
Number of Accelerator Nodes

20 total 

  • 20 NVIDIA Tesla K40 equipped nodes
Total Memory ~16TB
Memory Per Node

64GB 

Memory Per Core 3.2GB
Interconnect  FDR/EN Infiniband (56 Gbps)
Login Specifications

2 Intel Xeon E5-2670 (Sandy Bridge) CPUs

  • 2.6 GHz
  • 16 cores/node
  • 128GB of memory/node
Special Nodes

1 Huge Memory Node

  • Dell PowerEdge R820 Server
  • 4 Intel Xeon E5-4640 CPUs
    • 2.4 GHz
  • 32 cores (8 cores/CPU)
  • 1 TB Memory

 


Programming Environment

Compilers

C, C++ and Fortran are supported on the Ruby cluster. Intel, PGI and GNU compiler suites are available. The Intel development tool chain is loaded by default. Compiler commands and recommended options for serial programs are listed in the table below. See also our compilation guide.

LANGUAGE INTEL EXAMPLE PGI EXAMPLE GNU EXAMPLE
C icc -O2 -xHost hello.c pgcc -fast hello.c gcc -O2 -march=native hello.c
Fortran 90 ifort -O2 -xHost hello.f90 pgf90 -fast hello.f90 gfortran -O2 -march=native hello.f90

Parallel Programming

MPI

The system uses the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect. MPI is a standard library for performing parallel processing using a distributed-memory model. For more information on building your MPI codes, please visit the MPI Library documentation.

Ruby uses a different version of mpiexec than Oakley. This is necessary because of changes in Torque. All OSC systems use the mpiexec command, but the underlying code on Ruby is mpiexec.hydra while the code on Oakley was developed at OSC. They are largely compatible, but a few differences should be noted.

Caution: There are many variations on mpiexec and mpiexec.hydra. Information found on non-OSC websites may not be applicable to our installation.
Note: Oakley has been updated to use the same mpiexec as Ruby.

The table below shows some commonly used options. Use mpiexec -help for more information.

OAKLEY (old)          RUBY                COMMENT
mpiexec               mpiexec             Same command on both systems
mpiexec a.out         mpiexec ./a.out     Program must be in path on Ruby, not necessary on Oakley.
-pernode              -ppn 1              One process per node
-npernode procs       -ppn procs          procs processes per node
-n/-np totalprocs     -n/-np totalprocs   At most totalprocs processes per node (same on both systems)
-comm none            (omit)              Omit for simple cases. If using $MPIEXEC_RANK, consider using pbsdsh with $PBS_VNODENUM.
-comm anything_else   (omit)              Omit. Ignored on Oakley, will fail on Ruby.
(not available)       -prepend-rank       Prepend rank to output
-help                 -help               Get a list of available options

mpiexec will normally spawn one MPI process per CPU core requested in a batch job. The -pernode option is not supported by mpiexec on Ruby, instead use -ppn 1 as mentioned in the table above.
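
For example, to run one MPI process on each node allocated to a job (a minimal sketch; a.out is whatever executable you built):

  mpiexec -ppn 1 ./a.out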

OpenMP

The Intel, PGI and gnu compilers understand the OpenMP set of directives, which give the programmer a finer control over the parallelization. For more information on building OpenMP codes on OSC systems, please visit the OpenMP documentation.
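
As a minimal sketch, compiling and running a threaded program on a Ruby node could look like the following (the source file  hello_omp.c  is illustrative; the flags follow the serial examples above plus each compiler's OpenMP flag):

  # Intel compiler
  icc -O2 -xHost -qopenmp hello_omp.c -o hello_omp
  # GNU compiler (alternative)
  # gcc -O2 -march=native -fopenmp hello_omp.c -o hello_omp

  export OMP_NUM_THREADS=20   # match the 20 cores per Ruby node
  ./hello_omp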

GPU Programming

To request the GPU node on Ruby, use nodes=1:ppn=20:gpus=1. For GPU programming with CUDA, please refer to CUDA documentation. Also refer to the page of each software to check whether it is GPU enabled.


Executing Programs

Batch Requests

Batch requests are handled by the TORQUE resource manager and Moab Scheduler as on the Oakley system. Use the qsub command to submit a batch request, qstat to view the status of your requests, and qdel to delete unwanted requests. For more information, see the manual pages for each command.

Some changes specific to Ruby are listed here:

  • Ruby nodes have 20 cores per node, and 64 GB of memory per node. This is less memory per core than on Oakley.
  • Ruby will be allocated on the basis of whole nodes even for jobs using less than 20 cores.
  • The amount of local disk space available on a node is approximately 800 GB.
  • MPI Parallel Programs should be run with mpiexec, as on Oakley, but the underlying program is mpiexec.hydra instead of OSC's mpiexec. Type mpiexec --help for information on the command line options.

Example Serial Job

This particular example uses OpenMP.

  #PBS -l walltime=1:00:00
  #PBS -l nodes=1:ppn=20
  #PBS -N my_job
  #PBS -j oe

  cd $TMPDIR
  cp $HOME/science/my_program.f .
  ifort -O2 -openmp my_program.f
  export OMP_NUM_THREADS=20
  ./a.out > my_results
  cp my_results $HOME/science

Please remember that jobs on Ruby must use a complete node.

Example Parallel Job

    #PBS -l walltime=1:00:00
    #PBS -l nodes=4:ppn=20
    #PBS -N my_job
    #PBS -j oe

    cd $HOME/science
    mpif90 -O3 mpiprogram.f
    cp a.out $TMPDIR
    cd $TMPDIR
    mpiexec ./a.out > my_results
    cp my_results $HOME/science

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.


Queues and Reservations

Here are the queues available on Ruby. Please note that you will be routed to the appropriate queue based on your walltime and job size request.

Serial
  Nodes available: Available minus reservations
  Max walltime: 168 hours
  Max job size: 1 node

Parallel
  Nodes available: Available minus reservations
  Max walltime: 96 hours
  Max job size: 40 nodes

Hugemem
  Nodes available: 1
  Max walltime: 48 hours
  Max job size: 1 node
  Notes: 32 cores with 1 TB RAM. Use "-l nodes=1:ppn=32" to request it.

Debug
  Nodes available: 2 non-GPU nodes and 2 GPU nodes (each with 2 GPUs)
  Max walltime: 1 hour
  Max job size: 2 nodes
  Notes: 16 cores with 128GB RAM. For small interactive and test jobs. Use "-q debug" to request it.

"Available minus reservations" means all nodes in the cluster currently operational (this will fluctuate slightly), less the reservations. To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.

Occasionally, reservations will be created for specific projects.

Approximately half of the Ruby nodes are part of client condo reservations. Only jobs of short duration are eligible to run on these nodes, and only when they are not in use by the condo clients. As a result, your job(s) may have to wait for eligible resources to become available even while much of the cluster appears idle.

Batch Limit Rules

Full Node Charging Policy

On Ruby, we always allocate whole nodes to jobs and charge for the whole node. If a job requests less than a full node (nodes=1:ppn<20), the job execution environment is limited to what was requested (the job only has access to the number of cores specified by the ppn request) with 64GB of RAM; however, the job will still be allocated a whole node and charged for the whole node. A job that requests nodes>1 will be assigned entire nodes with 64GB/node and charged for those entire nodes regardless of the ppn request.  A job that requests the huge-memory node (nodes=1:ppn=32) will be allocated the entire huge-memory node with 1TB of RAM and charged for the whole node (32 cores worth of RU).

To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.

Queue Default

Please keep in mind that if you submit a job with no node specification, the default is nodes=1:ppn=20, while if you submit a job with no ppn specified, the default is nodes=N:ppn=1.

Debug Node

Ruby has 4 debug nodes which are specifically configured for short (< 1 hour) debugging type work. These nodes have a walltime limit of 1 hour. These nodes, consisting of 2 non-GPU nodes and 2 GPU nodes (with 2 GPUs per node), are equipped with E5-2670 V1 CPUs with 16 cores per node. Users are allowed to request a partial node with debug nodes. 

  • To schedule a 1-core non-GPU debug nodes: nodes=1:ppn=1 -q debug
  • To schedule a non-GPU debug node: nodes=1:ppn=16 -q debug
  • To schedule two non-GPU debug nodes: nodes=2:ppn=16 -q debug
  • To schedule a GPU debug node: nodes=1:ppn=16:gpus=2 -q debug
  • To schedule two GPU debug nodes: nodes=2:ppn=16:gpus=2 -q debug

GPU Node

On Ruby, 20 nodes are equipped with NVIDIA Tesla K40 GPUs (one GPU with each node).  These nodes can be requested by adding gpus=1 to your nodes request (nodes=1:ppn=20:gpus=1). 

Walltime Limit

Here are the queues available on Ruby:

NAME      MAX WALLTIME   MAX JOB SIZE                      NOTES
Serial    168 hours      1 node
Parallel  96 hours       40 nodes
Hugemem   48 hours       1 node                            32 cores with 1 TB RAM
Debug     1 hour         2 nodes (either GPU or non-GPU)   16 cores with 128GB RAM

Job/Core Limits

                 Soft Max Running Job Limit   Hard Max Running Job Limit   Soft Max Core Limit   Hard Max Core Limit
Individual User  40                           40                           800                   800
Project/Group    80                           160                          1600                  3200

The soft and hard max limits above apply depending on different system resource availability. If resources are scarce, then the soft max limit is used to increase the fairness of allocating resources. Otherwise, if there are idle resources, then the hard max limit is used to increase system utilization.

An individual user can have up to the max concurrently running jobs and/or up to the max processors/cores in use. 

However, among all the users in a particular group/project, they can have up to the max concurrently running jobs and/or up to the max processors/cores in use.

The debug queue is limited to one job at a time per user. Condo users, please contact OSC Help for more instructions.
A user may have no more than 1000 jobs submitted to both the parallel and serial job queue separately.

Citation

For more information about citations of OSC, visit https://www.osc.edu/citation.

To cite Ruby, please use the following Archival Resource Key:

ark:/19495/hpc93fc8

Please adjust this citation to fit the citation style guidelines required.

Ohio Supercomputer Center. 2015. Ruby Supercomputer. Columbus, OH: Ohio Supercomputer Center. http://osc.edu/ark:/19495/hpc93fc8

Here is the citation in BibTeX format:

@misc{Ruby2015,
ark = {ark:/19495/hpc93fc8},
url = {http://osc.edu/ark:/19495/hpc93fc8},
year  = {2015},
author = {Ohio Supercomputer Center},
title = {Ruby Supercomputer}
}

And in EndNote format:

%0 Generic
%T Ruby Supercomputer
%A Ohio Supercomputer Center
%R ark:/19495/hpc93fc8
%U http://osc.edu/ark:/19495/hpc93fc8
%D 2015

Here is an .ris file to better suit your needs. Please change the import option to .ris.


Request Access

Users who would like to use the Ruby cluster will need to request access.  This is because of the particulars of the Ruby environment, which includes its size, GPUs, and scheduling policies.  

Motivation

Access to Ruby is done on a case by case basis because:

  • It is a smaller machine than Oakley, and thus has limited space for users
    • Oakley has 694 nodes, while Ruby only has 240 nodes.
  • Its CPUs are less general-purpose, and therefore more consideration is required to get optimal performance
  • Scheduling is done on a per-node basis, and therefore jobs must scale to this level at a bare minimum 
  • Additional consideration is required to get full performance out of its GPUs

Good Ruby Workload Characteristics

Those interested in using Ruby should check that their work is well suited for it by using the following list.  Ideal workloads will exhibit one or more of the following characteristics:

  • Work scales well to large core counts
    • No single core jobs
    • Scales well past 2 nodes on Oakley
  • Needs access to Ruby specific hardware (GPUs)
  • Memory bound work
  • Software:
    • Supports GPUs
    • Takes advantage of:
      • Long vector length
      • Higher core count
      • Improved Memory Bandwidth

Applying for Access

Those who would like to be considered for Ruby access should send the following in an email to OSC Help:

  • Name
  • Username
  • Plan for using Ruby
  • Evidence of workload being well suited for Ruby

Ruby SSH key fingerprints

These are the public key fingerprints for Ruby:
ruby: ssh_host_key.pub = 01:21:16:c4:cd:43:d3:87:6d:fe:da:d1:ab:20:ba:4a
ruby: ssh_host_rsa_key.pub = eb:83:d9:ca:88:ba:e1:70:c9:a2:12:4b:61:ce:02:72
ruby: ssh_host_dsa_key.pub = ef:4c:f6:cd:83:88:d1:ad:13:50:f2:af:90:33:e9:70


These are the SHA256 hashes:​
ruby: ssh_host_key.pub = SHA256:685FBToLX5PCXfUoCkDrxosNg7w6L08lDTVsjLiyLQU
ruby: ssh_host_rsa_key.pub = SHA256:D7HjrL4rsYDGagmihFRqy284kAcscqhthYdzT4w0aUo
ruby: ssh_host_dsa_key.pub = SHA256:XplFCsSu7+RDFC6V/1DGt+XXfBjDLk78DNP0crf341U


Owens

TIP: Remember to check the menu to the right of the page for related pages with more information about Owens' specifics.

OSC's Owens cluster, installed in 2016, is a Dell-built, Intel® Xeon® processor-based supercomputer.


Hardware


Detailed system specifications:

  • 824 Dell Nodes
  • Dense Compute
    • 648 compute nodes (Dell PowerEdge C6320 two-socket servers with Intel Xeon E5-2680 v4 (Broadwell, 14 cores, 2.40GHz) processors, 128GB memory)

  • GPU Compute

    • 160 ‘GPU ready’ compute nodes -- Dell PowerEdge R730 two-socket servers with Intel Xeon E5-2680 v4 (Broadwell, 14 cores, 2.40GHz) processors, 128GB memory

    • NVIDIA Tesla P100 (Pascal) GPUs -- 5.3TF peak (double precision), 16GB memory

  • Analytics

    • 16 huge memory nodes (Dell PowerEdge R930 four-socket server with Intel Xeon E5-4830 v3 (Haswell 12 core, 2.10GHz) processors, 1,536GB memory, 12 x 2TB drives)

  • 23,392 total cores
    • 28 cores/node  & 128GB of memory/node
  • Mellanox EDR (100Gbps) Infiniband networking
  • Theoretical system peak performance
    • ~750 teraflops (CPU only)
  • 4 login nodes:
    • Intel Xeon E5-2680 (Broadwell) CPUs
    • 28 cores/node and 256GB of memory/node

How to Connect

  • SSH Method

To login to Owens at OSC, ssh to the following hostname:

owens.osc.edu 

You can either use an ssh client application or execute ssh on the command line in a terminal window as follows:

ssh <username>@owens.osc.edu

You may see a warning message including an SSH key fingerprint. Verify that the fingerprint in the message matches one of the SSH key fingerprints listed here, then type yes.

From there, you are connected to an Owens login node and have access to the compilers and other software development tools. You can run programs interactively or through batch requests. We use control groups on login nodes to keep the login nodes stable. Please use batch jobs for any compute-intensive or memory-intensive work. See the following sections for details.

  • OnDemand Method

You can also log in to Owens at OSC with our OnDemand tool. The first step is to log in to OnDemand. Once logged in, you can access Owens by clicking on "Clusters" and then selecting ">_Owens Shell Access".

Instructions on how to connect to OnDemand can be found at the OnDemand documentation page.

File Systems

Owens accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Oakley and Ruby clusters. Full details of the storage environment are available in our storage environment guide.

Home directories should be accessed through either the $HOME environment variable or the tilde notation ( ~username ). Project directories are located at /fs/project . Scratch storage is located at /fs/scratch .

Owens will not have symlinks allowing use of the old file system paths. This is in contrast to Oakley and Ruby, which will have the symlinks.

Software Environment

The module system on Owens is the same as on the Oakley and Ruby systems. Use  module load <package>  to add a software package to your environment. Use  module list  to see what modules are currently loaded and  module avail  to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use  module spider . By default, you will have the batch scheduling software modules, the Intel compiler and an appropriate version of mvapich2 loaded.

You can keep up to date on the software packages that have been made available on Owens by viewing the Software by System page and selecting the Owens system.

Compiling Code to Use Advanced Vector Extensions (AVX2)

The Haswell and Broadwell processors that make up Owens support the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. AVX2 has the potential to speed up your code by a factor of 4 or more, depending on the compiler and options you would otherwise use.

In our experience, the Intel and PGI compilers do a much better job than the gnu compilers at optimizing HPC code.

With the Intel compilers, use -xHost and -O2 or higher. With the gnu compilers, use -march=native and -O3 . The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Owens. The executables will not be portable.  Of course, any highly optimized builds, such as those employing the options above, should be thoroughly validated for correctness.
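
As a brief sketch, a C source file (here the illustrative  mycode.c ) could be compiled for Owens as follows:

  # Intel compiler
  icc -O2 -xHost mycode.c -o mycode
  # GNU compiler (alternative)
  # gcc -O3 -march=native mycode.c -o mycode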

See the Owens Programming Environment page for details.

Batch Specifics

Refer to the documentation for our batch environment to understand how to use PBS on OSC hardware. Some specifics you will need to know to create well-formed batch scripts:

  • The qsub syntax for node requests is the same on Owens as on Ruby and Oakley
  • Most compute nodes on Owens have 28 cores/processors per node (ppn).  Huge-memory (analytics) nodes have 48 cores/processors per node.
  • Jobs on Owens may request partial nodes.  This is in contrast to Ruby but similar to Oakley.
  • Owens has 6 debug nodes which are specifically configured for short (< 1 hour) debugging type work.  These nodes have a walltime limit of 1 hour.
    • To schedule a debug node:
      #PBS -l nodes=1:ppn=28 -q debug
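
Putting these specifics together, a minimal full-node batch script for Owens might look like the following sketch (the source file  my_program.c  is illustrative):

  #PBS -l walltime=1:00:00
  #PBS -l nodes=1:ppn=28
  #PBS -N my_owens_job
  #PBS -j oe

  cd $TMPDIR
  cp $HOME/science/my_program.c .
  icc -O2 -xHost my_program.c -o my_program
  ./my_program > my_results
  cp my_results $HOME/science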

Using OSC Resources

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.


Technical Specifications

The following are technical specifications for Owens.  

  Owens SYSTEM (2016)
NUMBER OF NODES 824 nodes
NUMBER OF CPU SOCKETS 1648 (2 sockets/node)
NUMBER OF CPU CORES 23,392 (28 cores/node)
CORES PER NODE 28 cores/node (48 cores/node for Huge Mem Nodes)
LOCAL DISK SPACE PER NODE

~1500GB in /tmp

COMPUTE CPU SPECIFICATIONS

Intel Xeon E5-2680 v4 (Broadwell) for compute

  • 2.4 GHz 
  • 14 cores per processor
COMPUTER SERVER SPECIFICATIONS

648 Dell PowerEdge C6320

160 Dell PowerEdge R730 (for accelerator nodes)

ACCELERATOR SPECIFICATIONS

NVIDIA P100 "Pascal" GPUs 16GB memory

NUMBER OF ACCELERATOR NODES

160 total

TOTAL MEMORY ~ 127 TB
MEMORY PER NODE

128 GB (1.5 TB for Huge Mem Nodes)

MEMORY PER CORE 4.5 GB (31 GB for Huge Mem)
INTERCONNECT  Mellanox EDR Infiniband Networking (100Gbps)
LOGIN SPECIFICATIONS

4 Intel Xeon E5-2680 (Broadwell) CPUs

  • 28 cores/node and 256GB of memory/node
SPECIAL NODES

16 Huge Memory Nodes

  • Dell PowerEdge R930 
  • 4 Intel Xeon E5-4830 v3 (Haswell)
    • 12 Cores
    • 2.1 GHz
  • 48 cores (12 cores/CPU)
  • 1.5 TB Memory
  • 12 x 2 TB Drive (20TB usable)

 


Owens Programming Environment

Compilers

C, C++ and Fortran are supported on the Owens cluster. Intel, PGI and GNU compiler suites are available. The Intel development tool chain is loaded by default. Compiler commands and recommended options for serial programs are listed in the table below. See also our compilation guide.

The Haswell and Broadwell processors that make up Owens support the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. AVX2 has the potential to speed up your code by a factor of 4 or more, depending on the compiler and options you would otherwise use.

In our experience, the Intel and PGI compilers do a much better job than the GNU compilers at optimizing HPC code.

With the Intel compilers, use -xHost and -O2 or higher. With the GNU compilers, use -march=native and -O3. The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Owens. The executables will not be portable.  Of course, any highly optimized builds, such as those employing the options above, should be thoroughly validated for correctness.

LANGUAGE INTEL EXAMPLE PGI EXAMPLE GNU EXAMPLE
C icc -O2 -xHost hello.c pgcc -fast hello.c gcc -O3 -march=native hello.c
Fortran 90 ifort -O2 -xHost hello.f90 pgf90 -fast hello.f90 gfortran -O3 -march=native hello.f90
C++ icpc -O2 -xHost hello.cpp pgc++ -fast hello.cpp g++ -O3 -march=native hello.cpp

Parallel Programming

MPI

OSC systems use the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect. MPI is a standard library for performing parallel processing using a distributed-memory model. For more information on building your MPI codes, please visit the MPI Library documentation.

Parallel programs are started with the mpiexec command. For example,

mpiexec ./myprog

The program to be run must either be in your path or have its path specified.

The mpiexec command will normally spawn one MPI process per CPU core requested in a batch job. Use the -n and/or -ppn option to change that behavior.

The table below shows some commonly used options. Use mpiexec -help for more information.

MPIEXEC OPTION      COMMENT
-ppn 1              One process per node
-ppn procs          procs processes per node
-n/-np totalprocs   At most totalprocs processes per node
-prepend-rank       Prepend rank to output
-help               Get a list of available options

 

Caution: There are many variations on mpiexec and mpiexec.hydra. Information found on non-OSC websites may not be applicable to our installation.
The information above applies to the MVAPICH2 and IntelMPI installations at OSC. See the OpenMPI software page for mpiexec usage with OpenMPI.

OpenMP

The Intel, PGI and GNU compilers understand the OpenMP set of directives, which support multithreaded programming. For more information on building OpenMP codes on OSC systems, please visit the OpenMP documentation.

GPU Programming

160 Nvidia P100 GPUs are available on Owens.  Please visit our GPU documentation.


Queues and Reservations

Here are the queues available on Owens. Please note that you will be routed to the appropriate queue based on your walltime and job size request.

Serial
  Nodes available: Available minus reservations
  Max walltime: 168 hours
  Max job size: 1 node

Parallel
  Nodes available: Available minus reservations
  Max walltime: 96 hours
  Max job size: 8 nodes

Largeparallel
  Nodes available: Available minus reservations
  Max walltime: 96 hours
  Max job size: 81 nodes

Hugemem
  Nodes available: 16
  Max walltime: 96 hours
  Max job size: 1 node

Parhugemem
  Nodes available: 16
  Max walltime: 96 hours
  Max job size: 16 nodes
  Notes: Restricted access. Use "-q parhugemem" to request it.

Debug
  Nodes available: 6 regular nodes and 4 GPU nodes
  Max walltime: 1 hour
  Max job size: 2 nodes
  Notes: For small interactive and test jobs during 8AM-6PM, Monday - Friday. Use "-q debug" to request it.

"Available minus reservations" means all nodes in the cluster currently operational (this will fluctuate slightly), less the reservations listed below. To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if the performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.

 

Occasionally, reservations will be created for specific projects that will not be reflected in these tables.


Citation

For more information about citations of OSC, visit https://www.osc.edu/citation.

To cite Owens, please use the following Archival Resource Key:

ark:/19495/hpc6h5b1

Please adjust this citation to fit the citation style guidelines required.

Ohio Supercomputer Center. 2016. Owens Supercomputer. Columbus, OH: Ohio Supercomputer Center. http://osc.edu/ark:/19495/hpc6h5b1

Here is the citation in BibTeX format:

@misc{Owens2016,
ark = {ark:/19495/hpc6h5b1},
url = {http://osc.edu/ark:/19495/hpc6h5b1},
year  = {2016},
author = {Ohio Supercomputer Center},
title = {Owens Supercomputer}
}

And in EndNote format:

%0 Generic
%T Owens Supercomputer
%A Ohio Supercomputer Center
%R ark:/19495/hpc6h5b1
%U http://osc.edu/ark:/19495/hpc6h5b1
%D 2016

Here is an .ris file to better suit your needs. Please change the import option to .ris.


Owens SSH key fingerprints

These are the public key fingerprints for Owens:
owens: ssh_host_rsa_key.pub = 18:68:d4:b0:44:a8:e2:74:59:cc:c8:e3:3a:fa:a5:3f
owens: ssh_host_ed25519_key.pub = 1c:3d:f9:99:79:06:ac:6e:3a:4b:26:81:69:1a:ce:83
owens: ssh_host_ecdsa_key.pub = d6:92:d1:b0:eb:bc:18:86:0c:df:c5:48:29:71:24:af


These are the SHA256 hashes:​
owens: ssh_host_rsa_key.pub = SHA256:vYIOstM2e8xp7WDy5Dua1pt/FxmMJEsHtubqEowOaxo
owens: ssh_host_ed25519_key.pub = SHA256:FSb9ZxUoj5biXhAX85tcJ/+OmTnyFenaSy5ynkRIgV8
owens: ssh_host_ecdsa_key.pub = SHA256:+fqAIqaMW/DUJDB0v/FTxMT9rkbvi/qVdMKVROHmAP4


Migrating jobs from Oakley or Ruby to Owens

This page includes a summary of differences to keep in mind when migrating jobs from Oakley or Ruby to Owens.

Guidance for Oakley Users

Hardware Specifications

Most compute node
  Owens: 28 cores and 125GB of RAM
  Oakley: 12 cores and 48GB of RAM

Large memory node
  Owens: (none)
  Oakley: 12 cores and 192GB of RAM (8 nodes in this class)

Huge memory node
  Owens: 48 cores and 1.5 TB of RAM, 12 x 2TB drives (16 nodes in this class)
  Oakley: 32 cores and 1TB of RAM (1 node in this class)

File Systems

Owens accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Oakley cluster.

Home directories
  Both: Accessed through either the $HOME environment variable or the tilde notation ( ~username )
  Owens: Does NOT have symlinks allowing use of the old file system paths. Please modify your scripts with the new paths before you submit jobs to the Owens cluster.
  Oakley: Has the symlinks allowing use of the old file system paths. No action is required on your part to continue using your existing job scripts on the Oakley cluster.

Project directories
  Both: Located at /fs/project

Scratch storage
  Both: Located at /fs/scratch

See the 2016 Storage Service Upgrades page for details. 

Software Environment

Owens uses the same module system as Oakley.

Use  module load <package>  to add a software package to your environment. Use  module list  to see what modules are currently loaded and  module avail  to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use  module spider .

You can keep up to date on the software packages that have been made available on Owens by viewing the Software by System page and selecting the Owens system.

Programming Environment

Like Oakley, Owens supports three compilers: Intel, PGI, and GNU. The default is Intel. To switch to a different compiler, use  module swap intel gnu  or  module swap intel pgi .

Owens also uses the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect.

In addition, Owens supports the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. In our experience, the Intel and PGI compilers do a much better job than the GNU compilers at optimizing HPC code.

See the Owens Programming Environment page for details.

PBS Batch-Related Command

The qpeek command is not needed on Owens. 

On Oakley, a job’s stdout and stderr data streams, which normally show up on the screen, are written to log files. These log files are stored on a server until the job ends, so you can’t look at them directly. The  qpeek  command allows you to peek at their contents. If you used the PBS header line to join the stdout and stderr streams ( #PBS -j oe ), the two streams are combined in the output log.

On Owens, a job’s stdout and stderr data streams are written to log files stored in the current working directory, i.e., $PBS_O_WORKDIR. You will see the log files immediately after your job starts. 

Accounting

The Owens cluster will be charged at a rate of 1 RU per 10 core-hours.

The Oakley cluster will be charged at a rate of 1 RU per 20 core-hours.

Like Oakley, Owens will accept partial-node jobs and charge you for the number of cores proportional to the amount of memory your job requests.

Below is a comparison of job limits between Owens and Oakley:

Per User
  Owens: Up to 256 concurrently running jobs and/or up to 3080 processors/cores in use
  Oakley: Up to 128 concurrently running jobs and/or up to 1500 processors/cores in use

Per Group
  Owens: Up to 384 concurrently running jobs and/or up to 3080 processors/cores in use
  Oakley: Up to 192 concurrently running jobs and/or up to 1500 processors/cores in use

 

Please see Queues and Reservations for Owens for more details.

Guidance for Ruby Users

Hardware Specifications

Most compute node
  Owens: 28 cores and 125GB of RAM
  Ruby: 20 cores and 64GB of RAM

Huge memory node
  Owens: 48 cores and 1.5 TB of RAM, 12 x 2TB drives (16 nodes in this class)
  Ruby: 32 cores and 1TB of RAM (1 node in this class)

File Systems

Owens accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Ruby cluster.

Home directories
  Both: Accessed through either the $HOME environment variable or the tilde notation ( ~username )
  Owens: Does NOT have symlinks allowing use of the old file system paths. Please modify your scripts with the new paths before you submit jobs to the Owens cluster.
  Ruby: Has the symlinks allowing use of the old file system paths. No action is required on your part to continue using your existing job scripts on the Ruby cluster.

Project directories
  Both: Located at /fs/project

Scratch storage
  Both: Located at /fs/scratch

See the 2016 Storage Service Upgrades page for details. 

Software Environment

Owens uses the same module system as Ruby.

Use  module load <package>  to add a software package to your environment. Use  module list  to see what modules are currently loaded and  module avail  to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use  module spider .

You can keep up to date on the software packages that have been made available on Owens by viewing the Software by System page and selecting the Owens system.

Programming Environment

Like Ruby, Owens supports three compilers: Intel, PGI, and GNU. The default is Intel. To switch to a different compiler, use  module swap intel gnu  or  module swap intel pgi .

Owens also uses the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect.

In addition, Owens supports the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. In our experience, the Intel and PGI compilers do a much better job than the GNU compilers at optimizing HPC code.

See the Owens Programming Environment page for details.

PBS Batch-Related Command

The qpeek command is not needed on Owens. 

On Ruby, a job’s stdout and stderr data streams, which normally show up on the screen, are written to log files. These log files are stored on a server until the job ends, so you can’t look at them directly. The   qpeek  command allows you to peek at their contents. If you used the PBS header line to join the stdout and stderr streams ( #PBS -j oe ), the two streams are combined in the output log.

On Owens, a job’s stdout and stderr data streams are written to log files stored in the current working directory, i.e., $PBS_O_WORKDIR. You will see the log files immediately after your job starts. 

Accounting

The Owens cluster will be charged at a rate of 1 RU per 10 core-hours.

The Ruby cluster will be charged at a rate of 1 RU per 20 core-hours.

However, Owens will accept partial-node jobs and charge you for the number of cores proportional to the amount of memory your job requests. By contrast, Ruby only accepts full-node jobs and charges for the whole node. 

Below is a comparison of job limits between Owens and Ruby:

Per User
  Owens: Up to 256 concurrently running jobs and/or up to 3080 processors/cores in use
  Ruby: Up to 40 concurrently running jobs and/or up to 800 processors/cores in use

Per Group
  Owens: Up to 384 concurrently running jobs and/or up to 3080 processors/cores in use
  Ruby: Up to 80 concurrently running jobs and/or up to 1600 processors/cores in use

 

Please see Queues and Reservations for Owens for more details.

 


Batch Limit Rules

Memory Limit:

It is strongly suggested that users consider their memory use relative to the available per-core memory when requesting OSC resources for their jobs. On Owens, this equates to 4GB/core or 124GB/node.

If your job requests less than a full node ( ppn< 28 ), it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4GB/core).  For example, without any memory request ( mem=XX ), a job that requests  nodes=1:ppn=1  will be assigned one core and should use no more than 4GB of RAM, a job that requests  nodes=1:ppn=3  will be assigned 3 cores and should use no more than 12GB of RAM, and a job that requests  nodes=1:ppn=28  will be assigned the whole node (28 cores) with 124GB of RAM.  

Please be careful if you include memory request (mem=XX ) in your job. A job that requests  nodes=1:ppn=1,mem=12GB  will be assigned one core and have access to 12GB of RAM, and charged for 3 cores worth of Resource Units (RU).  However, a job that requests  nodes=1:ppn=5,mem=12GB  will be assigned 5 cores but have access to only 12GB of RAM, and charged for 5 cores worth of Resource Units (RU).  See Charging for memory use for more details

A multi-node job ( nodes>1 ) will be assigned whole nodes with 124 GB/node and charged for those entire nodes regardless of the ppn request. For example, a job that requests  nodes=10:ppn=1  will be charged for 10 whole nodes (28 cores/node * 10 nodes, which is 280 cores worth of RU).  

A job that requests the huge-memory node ( nodes=1:ppn=48 ) will be allocated the entire huge-memory node with 1.5 TB of RAM and charged for the whole node (48 cores worth of RU).
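
As a brief sketch, the resource requests discussed above might appear in a batch script header as follows (the walltimes are illustrative):

  # Partial node: 3 cores, entitled to about 12GB of RAM, charged for 3 cores
  #PBS -l nodes=1:ppn=3
  #PBS -l walltime=2:00:00

  # Whole huge-memory node: 48 cores and 1.5 TB of RAM, charged for 48 cores
  #PBS -l nodes=1:ppn=48
  #PBS -l walltime=24:00:00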

To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.

Walltime Limit

Here are the queues available on Owens:

NAME              MAX WALLTIME   MAX JOB SIZE   NOTES
Serial            168 hours      1 node
Longserial        336 hours      1 node         Restricted access (contact OSC Help if you need access)
Parallel          96 hours       8 nodes        Jobs are scheduled to run within a single IB leaf switch
Largeparallel     96 hours       81 nodes       Jobs are scheduled across multiple switches
Hugemem           168 hours      1 node         16 nodes in this class
Parallel hugemem  96 hours       16 nodes       Restricted access (contact OSC Help if you need access); use "-q parhugemem" to access it
Debug             1 hour         2 nodes        6 nodes in this class; use "-q debug" to request it

GPU Jobs

There is only one GPU per GPU node on Owens.

For serial jobs, we will allow node sharing on GPU nodes so a job may request any number of cores (up to 28)

(nodes=1:ppn=XX:gpus=1)

For parallel jobs (n>1), we will not allow node sharing.
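
For example, a serial GPU job that shares a node might request half the cores along with the node's single GPU (a sketch; the core count and walltime are illustrative):

  #PBS -l nodes=1:ppn=14:gpus=1
  #PBS -l walltime=1:00:00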

Job/Core Limits

                 Max Running Job Limit   Soft Max Core/Processor Limit   Hard Max Core/Processor Limit
Individual User  256                     3080                            3080
Project/Group    384                     3080                            4620

The soft and hard max limits above apply depending on different system resource availability. If resources are scarce, then the soft max limit is used to increase the fairness of allocating resources. Otherwise, if there are idle resources, then the hard max limit is used to increase system utilization.

An individual user can have up to the max concurrently running jobs and/or up to the max processors/cores in use.

However, among all the users in a particular group/project, they can have up to the max concurrently running jobs and/or up to the max processors/cores in use.

A user may have no more than 1000 jobs submitted to both the parallel and serial job queue separately.

Pitzer

TIP: Remember to check the menu to the right of the page for related pages with more information about Pitzer's specifics.

OSC's Pitzer cluster, installed in late 2018, is a Dell-built, Intel® Xeon® processor-based supercomputer.

Hardware


Detailed system specifications:

  • 260 Dell Nodes
  • Dense Compute
    • 224 compute nodes (Dell PowerEdge C6420 two-socket servers with Intel Xeon 6148 (Skylake, 20 cores, 2.40GHz) processors, 192GB memory)

  • GPU Compute

    • 32 GPU compute nodes -- Dell PowerEdge R740 two-socket servers with Intel Xeon 6148 (Skylake, 20 cores, 2.40GHz) processors, 384GB memory

    • 2 NVIDIA Volta V100 GPUs -- 16GB memory

  • Analytics

    • 4 huge memory nodes (Dell PowerEdge R940 four-socket server with Intel Xeon 6148 (Skylake 20 core, 2.40GHz) processors, 3TB memory, 2 x 1TB drives mirrored - 1TB usable)

  • 10,560 total cores
    • 40 cores/node & 192GB of memory/node
  • Mellanox EDR (100Gbps) Infiniband networking
  • Theoretical system peak performance
    • 720 TFLOPS (CPU only)
  • 4 login nodes:
    • Intel Xeon 6148 (Skylake) CPUs
    • 40 cores/node and 384GB of memory/node

How to Connect

  • SSH Method

To login to Pitzer at OSC, ssh to the following hostname:

pitzer.osc.edu 

You can either use an ssh client application or execute ssh on the command line in a terminal window as follows:

ssh <username>@pitzer.osc.edu

You may see a warning message including an SSH key fingerprint. Verify that the fingerprint in the message matches one of the SSH key fingerprints listed here, then type yes.

From there, you are connected to a Pitzer login node and have access to the compilers and other software development tools. You can run programs interactively or through batch requests. We use control groups on login nodes to keep the login nodes stable. Please use batch jobs for any compute-intensive or memory-intensive work. See the following sections for details.

  • OnDemand Method

You can also log in to Pitzer at OSC with our OnDemand tool. The first step is to log in to OnDemand. Once logged in, you can access Pitzer by clicking on "Clusters" and then selecting ">_Pitzer Shell Access".

Instructions on how to connect to OnDemand can be found at the OnDemand documentation page.

File Systems

Pitzer accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Owens and Ruby clusters. Full details of the storage environment are available in our storage environment guide.

Home directories should be accessed through either the $HOME environment variable or the tilde notation ( ~username ). Project directories are located at /fs/project . Scratch storage is located at /fs/scratch .

Software Environment

The module system on Pitzer is the same as on the Owens and Ruby systems. Use  module load <package>  to add a software package to your environment. Use  module list  to see what modules are currently loaded and  module avail  to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use  module spider . By default, you will have the batch scheduling software modules, the Intel compiler and an appropriate version of mvapich2 loaded.

You can keep up to date on the software packages that have been made available on Pitzer by viewing the Software by System page and selecting the Pitzer system.

Compiling Code to Use Advanced Vector Extensions (AVX2)

The Skylake processors that make up Pitzer support the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. AVX2 has the potential to speed up your code by a factor of 4 or more, depending on the compiler and options you would otherwise use.

In our experience, the Intel and PGI compilers do a much better job than the gnu compilers at optimizing HPC code.

With the Intel compilers, use -xHost and -O2 or higher. With the gnu compilers, use -march=native and -O3 . The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Pitzer. The executables will not be portable.  Of course, any highly optimized builds, such as those employing the options above, should be thoroughly validated for correctness.

See the Pitzer Programming Environment page for details.

Batch Specifics

Refer to the documentation for our batch environment to understand how to use PBS on OSC hardware. Some specifics you will need to know to create well-formed batch scripts:

  • The qsub syntax for node requests is the same on Pitzer as on Owens and Oakley
  • Most compute nodes on Pitzer have 40 cores/processors per node (ppn).  Huge-memory (analytics) nodes have 80 cores/processors per node.
Due to the ambiguity of requesting a node with 80 cores, one must also request 3TB of memory for the huge-memory node job to be accepted by the scheduler, e.g., #PBS -l nodes=1:ppn=80,mem=3000GB
  • Jobs on Pitzer may request partial nodes.  This is in contrast to Ruby but similar to Owens.
  • Pitzer has 6 debug nodes which are specifically configured for short (< 1 hour) debugging type work.  These nodes have a walltime limit of 1 hour.
    • To schedule a debug node:
      #PBS -l nodes=1:ppn=40 -q debug

Using OSC Resources

For more information about how to use OSC resources, please see our guide on batch processing at OSC. For specific information about modules and file storage, please see the Batch Execution Environment page.

Technical Specifications

The following are technical specifications for Pitzer.  

  pitzer SYSTEM (2018)
NUMBER OF NODES 260 nodes
NUMBER OF CPU SOCKETS 528 (2 sockets/node for standard node)
NUMBER OF CPU CORES 10,560 (40 cores/node for standard node)
CORES PER NODE 40 cores/node (80 cores/node for Huge Mem Nodes)
LOCAL DISK SPACE PER NODE

850 GB in /tmp

COMPUTE CPU SPECIFICATIONS

Intel Xeon Gold 6148 (Skylake) for compute

  • 2.4 GHz 
  • 20 cores per processor
COMPUTER SERVER SPECIFICATIONS

224 Dell PowerEdge C6420

32 Dell PowerEdge R740 (for accelerator nodes)

4 Dell PowerEdge R940

ACCELERATOR SPECIFICATIONS

NVIDIA V100 "Volta" GPUs 16GB memory

NUMBER OF ACCELERATOR NODES

32 total (2 GPUs per node)

TOTAL MEMORY ~ 67 TB
MEMORY PER NODE

192 GB for standard nodes

384 GB for accelerator nodes

3 TB for Huge Mem Nodes

MEMORY PER CORE 4.8 GB (76.8 GB for Huge Mem)
INTERCONNECT  Mellanox EDR Infiniband Networking (100Gbps)
LOGIN SPECIFICATIONS

4 Intel Xeon Gold 6148 (Skylake) CPUs

  • 40 cores/node and 384 GB of memory/node
SPECIAL NODES

4 Huge Memory Nodes

  • Dell PowerEdge R940 
  • 4 Intel Xeon Gold 6148 (Skylake)
    • 20 Cores
    • 2.4 GHz
  • 80 cores (20 cores/CPU)
  • 3 TB Memory
  • 2x Mirror 1 TB Drive (1 TB usable)

Pitzer Programming Environment

Compilers

C, C++ and Fortran are supported on the Pitzer cluster. Intel, PGI and GNU compiler suites are available. The Intel development tool chain is loaded by default. Compiler commands and recommended options for serial programs are listed in the table below. See also our compilation guide.

The Skylake processors that make up Pitzer support the Advanced Vector Extensions (AVX512) instruction set, but you must set the correct compiler flags to take advantage of it. AVX512 has the potential to speed up your code by a factor of 8 or more, depending on the compiler and options you would otherwise use. However, bear in mind that clock speeds decrease as the level of the instruction set increases, so if your code does not benefit from vectorization it may be beneficial to use a lower instruction set.

In our experience, the Intel and PGI compilers do a much better job than the GNU compilers at optimizing HPC code.

With the Intel compilers, use -xHost and -O2 or higher. With the GNU compilers, use -march=native and -O3. The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Pitzer. The executables will not be portable.  Of course, any highly optimized builds, such as those employing the options above, should be thoroughly validated for correctness.

LANGUAGE INTEL EXAMPLE PGI EXAMPLE GNU EXAMPLE
C icc -O2 -xHost hello.c pgcc -fast hello.c gcc -O3 -march=native hello.c
Fortran 90 ifort -O2 -xHost hello.f90 pgf90 -fast hello.f90 gfortran -O3 -march=native hello.f90
C++ icpc -O2 -xHost hello.cpp pgc++ -fast hello.cpp g++ -O3 -march=native hello.cpp

Parallel Programming

MPI

OSC systems use the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect. MPI is a standard library for performing parallel processing using a distributed-memory model. For more information on building your MPI codes, please visit the MPI Library documentation.

Parallel programs are started with the mpiexec command. For example,

mpiexec ./myprog

The program to be run must either be in your path or have its path specified.

The mpiexec command will normally spawn one MPI process per CPU core requested in a batch job. Use the -n and/or -ppn option to change that behavior.

The table below shows some commonly used options. Use mpiexec -help for more information.

MPIEXEC OPTION      COMMENT
-ppn 1              One process per node
-ppn procs          procs processes per node
-n/-np totalprocs   At most totalprocs processes per node
-prepend-rank       Prepend rank to output
-help               Get a list of available options

 

Caution: There are many variations on mpiexec and mpiexec.hydra. Information found on non-OSC websites may not be applicable to our installation.
The information above applies to the MVAPICH2 and IntelMPI installations at OSC. See the OpenMPI software page for mpiexec usage with OpenMPI.

OpenMP

The Intel, PGI and GNU compilers understand the OpenMP set of directives, which support multithreaded programming. For more information on building OpenMP codes on OSC systems, please visit the OpenMP documentation.

 

Process/Thread placement

Processes and threads are placed differently depending on the compiler and MPI implementation used to compile your code. This section summarizes the default behavior and how to modify placement.

For all three compilers (Intel, GNU, PGI), purely threaded codes do not bind to particular cores by default.

For MPI-only codes, Intel MPI first binds the first half of the processes to one socket and then the second half to the second socket, so that consecutive tasks are located near each other. MVAPICH2 first binds as many processes as possible on one socket, then allocates the remaining processes on the second socket so that consecutive tasks are near each other. OpenMPI alternately binds processes to socket 1, socket 2, socket 1, socket 2, and so on, with no particular order for the core id.

For hybrid codes, Intel MPI first binds the first half of the processes to one socket and then the second half to the second socket, so that consecutive tasks are located near each other. Each process is allocated ${OMP_NUM_THREADS} cores and the threads of each process are bound to those cores. MVAPICH2 allocates ${OMP_NUM_THREADS} cores for each process and each thread of a process is placed on a separate core. By default, OpenMPI behaves the same for hybrid codes as it does for MPI-only codes, allocating a single core for each process and all threads of that process.

The following tables describe how to modify the default placements for each type of code.

OpenMP options:

OPTION INTEL GNU PGI DESCRIPTION
Scatter KMP_AFFINITY=scatter OMP_PLACES=cores OMP_PROC_BIND=close/spread MP_BIND=yes Distribute threads as evenly as possible across the system
Compact KMP_AFFINITY=compact OMP_PLACES=sockets MP_BIND=yes MP_BLIST="0,2,4,6,8,10,1,3,5,7,9" Place threads as closely as possible on the system

 

MPI options:

OPTION INTEL MVAPICH2 OPENMPI DESCRIPTION
Scatter I_MPI_PIN_DOMAIN=core I_MPI_PIN_ORDER=scatter MV2_CPU_BINDING_POLICY=scatter -map-by core --rank-by socket:span Distribute processes as evenly as possible across the system
Compact I_MPI_PIN_DOMAIN=core I_MPI_PIN_ORDER=compact MV2_CPU_BINDING_POLICY=bunch -map-by core Distribute processes as closely as possible on the system

 

Hybrid MPI+OpenMP options (combine with options from OpenMP table for thread affinity within cores allocated to each process):

OPTION INTEL MVAPICH2 OPENMPI DESCRIPTION
Scatter I_MPI_PIN_DOMAIN=omp I_MPI_PIN_ORDER=scatter MV2_CPU_BINDING_POLICY=hybrid MV2_HYBRID_BINDING_POLICY=linear -map-by node:PE=$OMP_NUM_THREADS --bind-to core --rank-by socket:span Distribute processes as evenly as possible across system ($OMP_NUM_THREADS cores per process)
Compact I_MPI_PIN_DOMAIN=omp I_MPI_PIN_ORDER=compact MV2_CPU_BINDING_POLICY=hybrid MV2_HYBRID_BINDING_POLICY=spread -map-by node:PE=$OMP_NUM_THREADS --bind-to core Distribute processes as closely as possible on system ($OMP_NUM_THREADS cores per process)
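
For example, here is a hedged sketch of a hybrid MPI+OpenMP job on Pitzer using the MVAPICH2 settings from the table above; the node count, thread count, and executable name (mycode) are illustrative placeholders:

#PBS -l nodes=2:ppn=40
#PBS -l walltime=1:00:00

cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=10
# MVAPICH2 hybrid binding: each MPI process is bound to $OMP_NUM_THREADS cores
export MV2_CPU_BINDING_POLICY=hybrid
export MV2_HYBRID_BINDING_POLICY=spread
# 4 processes per node x 10 threads per process = 40 cores per node
mpiexec -ppn 4 ./mycode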

 

 

The above tables list the most commonly used settings for process/thread placement. Some compilers and Intel libraries may have additional options for process and thread placement beyond those mentioned on this page. For more information on a specific compiler/library, check the more detailed documentation for that library.

GPU Programming

64 Nvidia V100 GPUs are available on Pitzer.  Please visit our GPU documentation.

 
 

Queues and Reservations

Here are the queues available on Pitzer. Please note that you will be routed to the appropriate queue based on your walltime and job size request.

NAME MAX WALLTIME NODES AVAILABLE MIN JOB SIZE MAX JOB SIZE NOTES
Serial 168 hours Available minus reservations 1 core 1 node  
Longserial 336 hours Available minus reservations 1 core 1 node Restricted access
Parallel 96 hours Available minus reservations 2 nodes 40 nodes   
Longparallel TBD Available minus reservations 2 nodes TBD Restricted access
Hugemem 168 hours 4 nodes 1 node 1 node  
Parallel hugemem TBD 4 nodes 2 nodes 4 nodes Not currently supported
Debug-regular 1 hour 6 nodes 1 core 2 nodes -q debug
Debug-GPU 1 hour 2 nodes 1 core 2 nodes -q debug

"Available minus reservations" means all nodes in the cluster currently operational (this will fluctuate slightly), less the reservations listed below. To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if the performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.

Occasionally, reservations will be created for specific projects that will not be reflected in these tables.


Batch Limit Rules

Memory Limit:

It is strongly suggested that you weigh your job's memory use against the available per-core memory when requesting OSC resources. See Charging for memory use for more details.

Regular Compute Node

For a regular compute node, the physical memory equates to 4.8 GB/core or 192 GB/node, while the usable memory equates to 4761 MB/core or 183 GB/node. See Changes of Default Memory Limits for more discussion.

If your job requests less than a full node ( ppn < 40  ), it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4761 MB/core).  For example, without any memory request ( mem=XX ), a job that requests nodes=1:ppn=1 will be assigned one core and should use no more than 4761 MB of RAM, a job that requests nodes=1:ppn=3 will be assigned 3 cores and should use no more than 14283 MB of RAM, and a job that requests  nodes=1:ppn=40 will be assigned the whole node (40 cores).  

Please be careful if you include a memory request ( mem=XX ) in your job. A job that requests nodes=1:ppn=1,mem=14283mb will be assigned one core and have access to 14283 MB of RAM, and will be charged for 3 cores' worth of Resource Units (RU). However, a job that requests nodes=1:ppn=5,mem=14283mb will be assigned 5 cores but have access to only 14283 MB of RAM, and will be charged for 5 cores' worth of RU.
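
For illustration, here is a hedged sketch of a partial-node request on Pitzer; the job name, walltime, and executable are placeholders:

#PBS -N my_serial_job
#PBS -l nodes=1:ppn=3
#PBS -l walltime=2:00:00
# With ppn=3 and no explicit mem request, the job may use up to 3 x 4761 MB = 14283 MB of RAM

cd $PBS_O_WORKDIR
./mycode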

A multi-node job ( nodes > 1 ) will be assigned entire nodes and charged for entire nodes regardless of the ppn request. For example, a job that requests nodes=10:ppn=1 will be charged for 10 whole nodes (40 cores/node * 10 nodes, which is 400 cores' worth of RU).

GPU Node

For a GPU node, the physical memory equates to 9.6 GB/core or 384 GB/node, while the usable memory (used by the submit filter) equates to 4761 MB/core or 374 GB/node.

Any job that requests more than 183 GB/node but no more than 374 GB/node will be scheduled on a GPU node (the 'largemem' queue).

Huge Memory Node

Node sharing is not allowed for huge memory nodes. A job that requests a huge memory node ( nodes=1:ppn=80,mem=3000GB ) will be allocated the entire huge memory node with 3019 GB of RAM and charged for the whole node (80 cores' worth of RU).

Due to the ambiguity of requesting a node with 80 cores, one must also request 3TB of memory for the huge memory node job to be accepted by the scheduler.
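
A hedged sketch of a huge memory job request on Pitzer (the walltime and executable are placeholders):

#PBS -l nodes=1:ppn=80,mem=3000GB
#PBS -l walltime=24:00:00

cd $PBS_O_WORKDIR
./mycode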

Summary

In summary, for serial jobs, we will allocate resources considering both the ppn and memory request if a regular compute or GPU node is requested. For parallel jobs (nodes>1) or huge memory jobs, we will allocate entire nodes with the whole memory regardless of the ppn request. Below is a summary of the physical and usable memory of the different types of nodes on Pitzer. To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.

 
Type of node   Physical Memory Usable Memory
Regular compute Per core 4.8 GB 4761 MB
  Per node 192 GB (40 cores) 183 GB
GPU Per core 9.6 GB 4761 MB
  Per node 384 GB (40 cores) 374 GB
Huge memory Per core 37.5 GB n/a
  Per node 3 TB (80 cores) 3019 GB

GPU Jobs

There are 2 GPUs per node on Pitzer.

For serial jobs, we allow node sharing on GPU nodes, so a job may request any number of cores (up to 40) and either 1 or 2 GPUs ( nodes=1:ppn=XX:gpus=1 or gpus=2 ).

For parallel jobs (n>1), we will not allow node sharing. A job may request 1 or 2 GPUs ( gpus=1 or gpus=2 ) but both GPUs will be allocated to the job.
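
For example, here is a hedged sketch of a serial job sharing a GPU node; the core count, walltime, and executable are placeholders:

# Request 20 of the 40 cores and one of the two GPUs on a Pitzer GPU node
#PBS -l nodes=1:ppn=20:gpus=1
#PBS -l walltime=1:00:00

module load cuda
cd $PBS_O_WORKDIR
./my_gpu_code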

Walltime Limit

Here are the queues available on Pitzer:

Name Max walltime nodes available min job size max job size notes
Serial 168 hours Available minus reservations 1 core 1 node  
Longserial 336 hours Available minus reservations 1 core 1 node Restricted access
Parallel 96 hours Available minus reservations 2 nodes 40 nodes   
Longparallel TBD Available minus reservations 2 nodes TBD Restricted access
Hugemem 168 hours 4 nodes 1 node 1 node  
Parallel hugemem TBD 4 nodes 2 nodes 4 nodes Not currently supported
Debug-regular 1 hour 6 nodes 1 core 2 nodes -q debug
Debug-GPU 1 hour 2 nodes 1 core 2 nodes -q debug

 

Job/Core Limits

  Soft Max Running Job limit Hard Max Running Job Limit Max Core Limit
Individual User 128 256 2040
Project/Group 192 384 2040

The soft and hard max limits above apply depending on different system resource availability. If resources are scarce, then the soft max limit is used to increase the fairness of allocating resources. Otherwise, if there are idle resources, then the hard max limit is used to increase system utilization.

An individual user can have up to the max number of concurrently running jobs and/or up to the max number of processors/cores in use.

Similarly, all the users in a particular group/project combined can have up to the group's max number of concurrently running jobs and/or up to the group's max number of processors/cores in use.

2040 cores equates to 51 nodes, or ~22% of the whole system.
A user may have no more than 1000 jobs submitted to each of the parallel and serial job queues.

Citation

For more information about citations of OSC, visit https://www.osc.edu/citation.

To cite Pitzer, please use the following Archival Resource Key:

ark:/19495/hpc56htp

Please adjust this citation to fit the citation style guidelines required.

Ohio Supercomputer Center. 2018. Pitzer Supercomputer. Columbus, OH: Ohio Supercomputer Center. http://osc.edu/ark:/19495/hpc56htp

Here is the citation in BibTeX format:

@misc{Pitzer2018,
ark = {ark:/19495/hpc56htp},
url = {http://osc.edu/ark:/19495/hpc56htp},
year  = {2018},
author = {Ohio Supercomputer Center},
title = {Pitzer Supercomputer}
}

And in EndNote format:

%0 Generic
%T Pitzer Supercomputer
%A Ohio Supercomputer Center
%R ark:/19495/hpc56htp
%U http://osc.edu/ark:/19495/hpc56htp
%D 2018

Here is an .ris file to better suit your needs. Please change the import option to .ris.


Pitzer SSH key fingerprints

These are the public key fingerprints for Pitzer:
pitzer: ssh_host_rsa_key.pub = 8c:8a:1f:67:a0:e8:77:d5:4e:3b:79:5e:e8:43:49:0e 
pitzer: ssh_host_ed25519_key.pub = 6d:19:73:8e:b4:61:09:a9:e6:0f:e5:0d:e5:cb:59:0b 
pitzer: ssh_host_ecdsa_key.pub = 6f:c7:d0:f9:08:78:97:b8:23:2e:0d:e2:63:e7:ac:93 


These are the SHA256 hashes:
pitzer: ssh_host_rsa_key.pub = SHA256:oWBf+YmIzwIp+DsyuvB4loGrpi2ecow9fnZKNZgEVHc 
pitzer: ssh_host_ed25519_key.pub = SHA256:zUgn1K3+FK+25JtG6oFI9hVZjVxty1xEqw/K7DEwZdc 
pitzer: ssh_host_ecdsa_key.pub = SHA256:8XAn/GbQ0nbGONUmlNQJenMuY5r3x7ynjnzLt+k+W1M 


Migrating jobs from other clusters

This page includes a summary of differences to keep in mind when migrating jobs from other clusters to Pitzer. 

Guidance for Oakley Users

Hardware Specifications

  PITZER (PER NODE) OAKLEY (PER NODE)
Most compute node 40 cores and 192GB of RAM 12 cores and 48GB of RAM
Large memory node n/a 12 cores and 192GB of RAM (8 nodes in this class)
Huge memory node 80 cores and 3.0TB of RAM (4 nodes in this class) 32 cores and 1.0TB of RAM (1 node in this class)

File Systems

Pitzer accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Oakley cluster.

  PITZER OAKLEY
Home directories Accessed through either the $HOME environment variable or the tilde notation ( ~username ) on both clusters. Pitzer home directories do NOT have symlinks allowing use of the old file system paths, so please modify your scripts with the new paths before you submit jobs to the Pitzer cluster. Oakley home directories have the symlinks allowing use of the old file system paths, so no action is required on your part to continue using your existing job scripts on the Oakley cluster.
Project directories Located at /fs/project
Scratch storage Located at /fs/scratch

See the 2016 Storage Service Upgrades page for details. 

Software Environment

Pitzer uses the same module system as Oakley.

Use module load <package> to add a software package to your environment. Use module list to see what modules are currently loaded and module avail to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use module spider.

You can keep up to date on the software packages that have been made available on Pitzer by viewing the Software by System page and selecting the Pitzer system.

Programming Environment

Like Oakley, Pitzer supports three compilers: Intel, PGI, and GNU. The default is Intel. To switch to a different compiler, use module swap intel gnu or module swap intel pgi.

Pitzer also uses the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect. In addition, Pitzer supports the Advanced Vector Extensions (AVX512) instruction set, but you must set the correct compiler flags to take advantage of it.

See the Pitzer Programming Environment page for details.

PBS Batch-Related Command

The qpeek command is not needed on Pitzer.

On Oakley, a job’s stdout and stderr data streams, which normally show up on the screen, are written to log files. These log files are stored on a server until the job ends, so you can’t look at them directly. The  qpeek  command allows you to peek at their contents. If you used the PBS header line to join the stdout and stderr streams ( #PBS -j oe ), the two streams are combined in the output log.

On Pitzer, a job's stdout and stderr data streams are written to log files stored in the current working directory, i.e., $PBS_O_WORKDIR. You will see the log files shortly after your job starts.

In addition, preemptible jobs and hyper-threading jobs are supported on Pitzer. See this page for more information.

Accounting

The Pitzer cluster will be charged at a rate of 1 RU per 10 core-hours.

The Oakley cluster will be charged at a rate of 1 RU per 20 core-hours.

Like Oakley, Pitzer will accept partial-node jobs and charge you for the number of cores proportional to the amount of memory your job requests.

Below is a comparison of job limits between Pitzer and Oakley:

  Pitzer oakley
Per User Up to 128 concurrently running jobs and/or up to 2040 processors/cores in use  Up to 256 concurrently running jobs and/or up to 2040 processors/cores in use
Per group Up to 192 concurrently running jobs and/or up to 2040 processors/cores in use Up to 384 concurrently running jobs and/or up to 2040 processors/cores in use

Please see Queues and Reservations for Pitzer and Batch Limit Rules for more details.

Guidance for Owens Users

Hardware Specifications

  PITZER (PER NODE) OWENS (PER NODE)
Most compute node 40 cores and 192GB of RAM 28 cores and 125GB of RAM
Huge memory node 80 cores and 3.0TB of RAM (4 nodes in this class) 48 cores and 1.5TB of RAM, 12 x 2TB drives (16 nodes in this class)

File Systems

Pitzer accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory, project space, and scratch space as on the Owens cluster.

Software Environment

Pitzer uses the same module system as Owens.

Use module load <package> to add a software package to your environment. Use module list to see what modules are currently loaded and module avail to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use module spider.

You can keep up to date on the software packages that have been made available on Pitzer by viewing the Software by System page and selecting the Pitzer system.

Programming Environment

Like Owens, Pitzer supports three compilers: Intel, PGI, and GNU. The default is Intel. To switch to a different compiler, use module swap intel gnu or module swap intel pgi.

Pitzer also uses the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect, and supports the Advanced Vector Extensions (AVX512) instruction set.

See the Pitzer Programming Environment page for details.

PBS Batch-Related Command

Like on Owens, a job's stdout and stderr data streams on Pitzer are written to log files stored in the current working directory, i.e., $PBS_O_WORKDIR. You will see the log files shortly after your job starts.

In addition, preemptible jobs and hyper-threading jobs are supported on Pitzer. See this page for more information.

Accounting

As on Owens, the Pitzer cluster will be charged at a rate of 1 RU per 10 core-hours. Below is a comparison of job limits between Pitzer and Owens:

  PITZER OWENS
Per User Up to 128 concurrently running jobs and/or up to 2040 processors/cores in use  Up to 256 concurrently running jobs and/or up to 3080 processors/cores in use
Per group Up to 192 concurrently running jobs and/or up to 2040 processors/cores in use Up to 384 concurrently running jobs and/or up to 4620 processors/cores in use

Please see Queues and Reservations for Pitzer and Batch Limit Rules for more details.

 

Guidance for Ruby Users

Hardware Specifications

  PITZER (PER NODE) RUBY (PER NODE)
Most compute node 40 cores and 192GB of RAM 20 cores and 64GB of RAM
Huge memory node 80 cores and 3.0TB of RAM (4 nodes in this class) 32 cores and 1TB of RAM (1 node in this class)

File Systems

Pitzer accesses the same OSC mass storage environment as our other clusters. Therefore, users have the same home directory as on the Ruby cluster.

  PITZER RUBY
Home directories Accessed through either the $HOME environment variable or the tilde notation ( ~username ) on both clusters. Pitzer home directories do NOT have symlinks allowing use of the old file system paths, so please modify your scripts with the new paths before you submit jobs to the Pitzer cluster. Ruby home directories have the symlinks allowing use of the old file system paths, so no action is required on your part to continue using your existing job scripts on the Ruby cluster.
Project directories Located at /fs/project
Scratch storage Located at /fs/scratch

See the 2016 Storage Service Upgrades page for details. 

Software Environment

Pitzer uses the same module system as Ruby.

Use   module load <package>   to add a software package to your environment. Use   module list   to see what modules are currently loaded and  module avail   to see the modules that are available to load. To search for modules that may not be visible due to dependencies or conflicts, use   module spider  

You can keep up to date on the software packages that have been made available on Pitzer by viewing the Software by System page and selecting the Pitzer system.

Programming Environment

Like Ruby, Pitzer supports three compilers: Intel, PGI, and GNU. The default is Intel. To switch to a different compiler, use module swap intel gnu or module swap intel pgi.

Pitzer also uses the MVAPICH2 implementation of the Message Passing Interface (MPI), optimized for the high-speed Infiniband interconnect. In addition, Pitzer supports the Advanced Vector Extensions (AVX512) instruction set, but you must set the correct compiler flags to take advantage of it.

See the Pitzer Programming Environment page for details.

PBS Batch-Related Command

The qpeek command is not needed on Pitzer.

On Ruby, a job’s stdout and stderr data streams, which normally show up on the screen, are written to log files. These log files are stored on a server until the job ends, so you can’t look at them directly. The   qpeek  command allows you to peek at their contents. If you used the PBS header line to join the stdout and stderr streams ( #PBS -j oe ), the two streams are combined in the output log.

On Pitzer, a job's stdout and stderr data streams are written to log files stored in the current working directory, i.e., $PBS_O_WORKDIR. You will see the log files shortly after your job starts.

In addition, preemptible jobs and hyper-threading jobs are supported on Pitzer. See this page for more information.

Accounting

The Pitzer cluster will be charged at a rate of 1 RU per 10 core-hours.

The Ruby cluster will be charged at a rate of 1 RU per 20 core-hours.

However, Pitzer will accept partial-node jobs and charge you for the number of cores proportional to the amount of memory your job requests. By contrast, Ruby only accepts full-node jobs and charges for the whole node.

Below is a comparison of job limits between Pitzer and Ruby:

  Pitzer RUBY
Per User Up to 128 concurrently running jobs and/or up to 2040 processors/cores in use  Up to 40 concurrently running jobs and/or up to 800 processors/cores in use
Per group Up to 192 concurrently running jobs and/or up to 2040 processors/cores in use Up to 80 concurrently running jobs and/or up to 1600 processors/cores in use

Please see Queues and Reservations for Pitzer and Batch Limit Rules for more details.

 

Special Schedulings

Preemption Job

Preemptible jobs are available in general using a new QOS: preemptible.

Jobs that request this QOS will be eligible to run on reserved condo nodes. However, if other jobs with the appropriate QOS are waiting on those same resources, the preemptible jobs are killed to allow the higher-QOS jobs to start. Preemption occurs only after a preemptible job has run for a minimum of 15 minutes. Preemptible jobs are charged at the same rate.

A preemptible job is treated as a regular job with respect to the 'Job Size Policy' (# of jobs and # of cores).

 


GPU Computing

OSC offers GPU computing on all its systems. While GPUs can provide a significant boost in performance for some applications, the computing model is very different from that of the CPU. This page discusses some of the ways you can use GPU computing at OSC.

Accessing GPU Resources

To request nodes with a GPU, add the gpus=# attribute to the PBS nodes directive in your batch script. For example, on Owens:

#PBS -l nodes=2:ppn=28:gpus=1

In most cases you'll need to load the cuda module (module load cuda) to make the necessary Nvidia libraries available. 

There is no additional RU charge for GPUs.

Setting the GPU compute mode (optional)

The GPUs on Owens and Pitzer can be set to different compute modes as listed here.   They can be set by adding the following to the GPU specification: 

-l nodes=1:ppn=28:gpus=1:default
-l nodes=1:ppn=28:gpus=1:exclusive_process

The compute mode exclusive_process is the default on GPU nodes if a compute mode is not specified. With this mode, multiple CUDA processes are not allowed on a GPU, e.g., CUDA processes via MPI. If you need to run an MPI-CUDA job, please set the compute mode to default.
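
For example, here is a hedged sketch of an MPI-CUDA job on Owens that sets the compute mode to default; the walltime and executable name (my_mpi_cuda_app) are placeholders:

#PBS -l nodes=2:ppn=28:gpus=1:default
#PBS -l walltime=1:00:00

module load cuda
cd $PBS_O_WORKDIR
mpiexec ./my_mpi_cuda_app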

Using GPU-enabled Applications

We have several supported applications that can use GPUs.

Please see the software pages for each application.  They have different levels of support for multi-node jobs, cpu/gpu work sharing, and environment set-up.

Libraries with GPU Support

There are a few libraries that provide GPU implementations of commonly used routines. While they mostly hide the details of using a GPU, there are still some GPU specifics you'll need to be aware of, e.g., device initialization, threading, and memory allocation.

MAGMA

MAGMA is an implementation of BLAS and LAPACK with multi-core (SMP) and GPU support. There are some differences in the API of standard BLAS and LAPACK.

cuBLAS and cuSPARSE

cuBLAS is a highly optimized BLAS from NVIDIA. There are a few versions of this library, from very GPU-specific to nearly transparent. cuSPARSE is a BLAS-like library for sparse matrices.

The MAGMA library is built on cuBLAS.

cuFFT

cuFFT is NVIDIA's Fourier transform library with an API similar to FFTW.

cuDNN

cuDNN is NVIDIA's Deep Neural Network machine learning library. Many ML applications are built on cuDNN.

Direct GPU Programming

GPUs present a different programming model from CPUs so there is a significant time investment in going this route.

OpenACC

OpenACC is a directives-based model similar to OpenMP. Currently this is only supported by the Portland Group C/C++ and Fortran compilers.

OpenCL

OpenCL is a set of libraries and C/C++ compiler extensions supporting GPUs (NVIDIA and AMD) and other hardware accelerators. The CUDA module provides an OpenCL library.

CUDA

CUDA is the standard NVIDIA development environment. In this model explicit GPU code is written in the CUDA C/C++ dialect, compiled with the CUDA compiler NVCC, and linked with a native driver program.

About OSC GPU Hardware

Our GPUs span several generations with different capabilities and ease-of-use. Many of the differences won't be visible when using applications or libraries, but some features and applications may not be supported on the older models.

 

Ruby K40

The K40 "Tesla" has a compute capability of 3.5, which is supported by most applications.

Each K40 has 12GB of memory and there is one GPU per GPU node.

Owens P100

The P100 "Pascal" is an NVIDIA GPU with a compute capability of 6.0. The 6.0 capability includes unified shared CPU/GPU memory -- the GPU now has its own virtual memory capability and can map CPU memory into its address space.

Each P100 has 16GB of on-board memory and there is one GPU per GPU node.

Pitzer V100

The V100 "Volta" is NVIDIA's flagship GPU with a compute capability of 7.0.

Each V100 has 16GB of memory and there are two GPUs per GPU node.

Examples

There are example jobs and code at GitHub.

Tutorials & Training

Training is an important part of our services. We are working to expand our portfolio; we currently provide the following:

  • Training classes. OSC provides training classes, at our facility, on-site and remotely.
  • HOWTOs. Step-by-step guides to accomplish certain tasks on our systems.
  • Tutorials. Online content designed for self-paced learning.

Other good sources for information:

  • Knowledge Base.  Useful information that does not fit our existing documentation.
  • FAQ.  List of commonly asked questions.

Knowledge Base

This knowledge base is a collection of important, useful information about OSC systems that does not fit into a guide or tutorial, and is too long to be answered in a simple FAQ.

Account Consolidation Guide

Account consolidation will take place during the July 17th downtime. Jobs submitted by non-preferred accounts that do not run prior to the downtime will be deleted, as the jobs will fail after the downtime. Please be aware of this, and be prepared to resubmit your held jobs from your consolidated account with the correct charge code.
Please contact OSC Help if you need further information. 

Single Account / Multiple Projects

If you work with several research groups, you have a separate account for each group. This means multiple home directories, multiple passwords, etc. Over the years there have been requests for a single login system. We're now putting that in place.

How will this affect you?

If you have only one account, you'll see no changes. But if you work with multiple groups in the future, you'll need to be aware of how this works.

  • All users with multiple accounts have been granted a preferred username that they will use to log into our systems. These were communicated to impacted clients on July 11.
  • It will be very important to use the correct project code for batch job charging.
  • Managing the sharing of files between your projects (groups) will be a little more complicated.
  • In most cases, you will only need to fill out software license agreements once.

The single username 

We requested those with multiple accounts to choose a preferred username. If one was not selected by the user, we selected one for them. 

The preferred username will be your only active account; you will not be able to log in or submit jobs with the other accounts. 

Checking the groups of a username

To check all groups of a username (USERID), use the command:

id -Gn USERID

The first one from the output is your primary group, which is the project code (PROJECTID) this username (USERID) was created under.
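
For instance, for a hypothetical user usr1234 belonging to the made-up projects PAS1234 and PCON0003, the output might look like:

$ id -Gn usr1234
PAS1234 PCON0003

Here PAS1234 would be the primary group, i.e., the project the username was created under.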

To check all project codes your user account is under, use the command

id -Gn USERID | awk -F '[ ]' '{for (i=1;i<=NF;i++){if ($i ~/^P/) {print $i}}}'

Changing the primary group for a login session

You can change the primary group of your username (USERID) to any UNIX group (GROUP) that username (USERID) belongs to during the login session using the command:

newgrp GROUP

This change is only valid during this login session. If you log out and log back in, your primary group is changed back to the default one.

Check previous user accounts

There is no available tool to check all of your previous active accounts. We sent an email to each impacted user providing the information on your preferred username and previous accounts. Please refer to that email (sent on July 11, subject "Multiple OSC Accounts - Your Single Username").

Batch job

How to specify the charging project

If you work with an annual allocation project, you're already familiar with the '#PBS -A' option for charging against a specific project.  

It will be very important that you make sure a batch job is charged against the correct research project code, including annual allocations. While the system can ensure you don't inadvertently charge an unrelated project, we can't make sure that you're charging the correct project.

Note:
At times, PIs would ask us to check that group members were correctly charging an annual allocation. We could do that by comparing the primary group with the charged code. With single username multiple groups (project codes), we will not be able to do this kind of check.

Batch limits policy

The job limit per user remains the same. That is to say, though your jobs are charged against different project codes, the total number of jobs and cores your user account can use on each system is still restricted by the previous user-based limit. Therefore, consolidating multiple user accounts into one preferred user account may affect the work of some users.

Please check our batch limit policy on each system for more details.

Data Management

Managing multiple home directories

Data from your non-preferred accounts will remain in those home directories; the ownership of the files will be updated to your preferred username, the newly consolidated account. You can access your other home directories using the command cd /absolute/path/to/file

You will need to consolidate all files to your preferred username as soon as possible because we plan to purge the data in the future. Please contact OSC Help if you need information on your other home directories to access the files.

Previous files associated with your other usernames

  • Files associated with your non-preferred accounts will have their ownership changed to your preferred username. 
  • These files won't count against your home directory file quota. 
  • There will be no change to files and quotas on the project and scratch file systems.

Change group of a file

Log in with your preferred username (P_USERID) and create a new file; its owner and group will be your preferred username (P_USERID) and primary project code (P_PROJECTID). Then change the group of the newly created file (FILE) using the command:

chgrp PROJECTID FILE

Managing file sharing in a batch job

In the Linux file system, every file has an owner and a group. By default, the group (project code) assigned to a file is the primary group of the user who creates it. This means that even if you change the charged account for a batch job, any files created will still be associated with your primary group.

To change the group for new files you will need to include:

-W group_list=[group-id]

at the beginning of your batch job. If you are a member of research groups PQR1234 (primary) and PQR5678 then batch jobs run for PQR5678 will typically have:

#PBS -A PQR5678
#PBS -W group_list=PQR5678

It is important to remember that groups are used in two different ways: for resource use charging and file permissions. In the simplest case, if you are a member of only one research group/project, you won't need either option above. If you are in multiple research groups and/or have annual allocations, you may need something like:

#PBS -A PAA1111
#PBS -W group_list=PQR5678

OnDemand users

If you use the OnDemand Files app to upload files to the OSC filesystem, the group ownership of uploaded files will be your primary group.

Software licenses

  • We will merge all your current agreements if you have multiple accounts.  
  • In many cases, you will only need to fill out software license agreements once.
  • Some vendors may require you to sign an updated agreement.  
  • Some vendors may also require the PI of each of your research groups/project codes to sign an agreement.

Changes of Default Memory Limits

Problem Description

Our current GPFS file system is a distributed system with significant interactions between the clients. Because the compute nodes are GPFS file system clients, a certain amount of memory on each node needs to be reserved for these interactions. As a result, the maximum physical memory of each node allowed to be used by users' jobs is reduced, in order to keep the file system performing healthily. In addition, using swap memory is no longer allowed.

The table below summarizes the maximum physical memory allowed for each type of nodes on our systems:

Ruby Cluster

NODE TYPE PHYSICAL MEMORY per node MAXIMUM MEMORY ALLOWED per node
Regular node 64GB 61GB
Debug node 128GB 124GB
Huge memory node 1024GB (1TB) 1008GB

Owens Cluster

NODE TYPE PHYSICAL MEMORY per node MAXIMUM MEMORY ALLOWED per node
Regular node 128GB 124GB
Huge memory node 1536GB 1510GB

Pitzer Cluster

Node type physical memory per node Maximum memory allowed per Node 
Regular node 192GB 183GB
GPU node 384GB 374GB
Huge memory node 3TB 3019GB

Solutions When You Need Regular Nodes

Starting from October 27, 2016, we'll implement a new scheduling policy on all of our clusters, reflecting the reduced default memory limits. 

If you do not request memory explicitly in your job (no -l mem request)

Your job can be submitted and scheduled as before, and resources will be allocated according to your requests of cores/nodes ( nodes=XX:ppn=XX ). If you request a partial node, the memory allocated to your job is proportional to the number of cores requested (4GB/core on Owens, 4761MB/core on Pitzer); if you request the whole node, the memory allocated to your job is decreased, following the information summarized in the tables above. Some examples are provided below.

A request of partial node:

On Ruby, we always allocate whole nodes to jobs and charge for the whole node, with 61GB memory allocated to your job.  

On Owens, a request of   nodes=1:ppn=1    will be allocated with 4GB memory, and charged for 1 core. A request of   nodes=1:ppn=4   will be allocated with 16GB memory, and charged for 4 cores.

On Pitzer, a request of  nodes=1:ppn=1   will be allocated with 4761MB memory, and charged for 1 core.  A request of   nodes=1:ppn=4   will be allocated with 19044MB memory, and charged for 4 cores.  

 

A request of the whole node:

A request of the whole regular node will be allocated with maximum memory allowed per node and charged for the whole node, as summarized below:

  Request memory allocated charged for
Ruby nodes=1:ppn=20  61GB 20 cores
Owens nodes=1:ppn=28 124GB 28 cores
Pitzer nodes=1:ppn=40 183GB 40 cores

A request of multiple nodes:

If you have a multi-node job (  nodes>1  ), your job will be assigned the entire nodes with maximum memory allowed per node (61GB for Ruby, 124GB for Owens, and 183GB for Pitzer) and charged for the entire nodes regardless of ppn request.

If you do request memory explicitly in your job (with an -l mem request)

If you request memory explicitly in your script, please revisit your script according to the following information.

A request of partial node:

On Owens, a request of  nodes=1:ppn=1, mem=4gb   will be allocated with 4GB memory, and charged for 1 core.

On Ruby, we always allocate whole nodes to jobs and charge for the whole node, with 61GB memory allocated to your job.

On Pitzer, a job that requests nodes=1:ppn=1,mem=14283mb will be assigned one core and have access to 14283 MB of RAM, and will be charged for 3 cores' worth of Resource Units (RU). However, a job that requests nodes=1:ppn=5,mem=14283mb will be assigned 5 cores but have access to only 14283 MB of RAM, and will be charged for 5 cores' worth of RU.

 A request of the whole node:

On Ruby, the maximum value you can use for  -l mem  is  61gb , i.e.  -l mem=61gb . A request of   nodes=1:ppn=20,mem=61gb  will be allocated with 61GB memory, and charged for the whole node. If you need more than 61GB memory for the job, please submit your job to huge memory nodes on Ruby, or switch to Owens cluster. Any request requesting  mem>61gb  will not be scheduled. 

On Owens, the maximum value you can use for -l mem is 125gb, i.e. -l mem=125gb. A request of nodes=1:ppn=28,mem=124gb will be allocated 124GB of memory and charged for the whole node. If you need more than 124GB of memory for the job, please submit your job to huge memory nodes, or switch to the Pitzer cluster. Any request for mem>=126gb will not be scheduled.

On Pitzer, the maximum value you can use for -l mem is 183gb, i.e. -l mem=183gb. A request of nodes=1:ppn=40,mem=183gb will be allocated 183GB of memory and charged for the whole node. If you need more than 183GB of memory for the job, please submit your job to huge memory nodes on Owens or Pitzer. Any request for mem>183gb may be rescheduled on a huge memory node on Pitzer, or will not be scheduled, depending on what you put in the request.

A request of multiple nodes:

If you have a multi-node job (   nodes>1 ), your job will be assigned the entire nodes with maximum memory allowed per node (61GB for Ruby, 124GB for Owens, and 183GB for Pitzer) and charged for the entire nodes.

Solutions When You Need Special Nodes

It is highly recommended that you do not include a memory request and instead follow the syntax below if you need any special resources.

Ruby Cluster:

NODE TYPE HOW TO REQUEST MEMORY ALLOCATED CHARGED FOR
Debug node nodes=1:ppn=16 -q debug 124GB 16 cores
Huge memory node nodes=1:ppn=32 1008GB 32 cores

Owens Cluster:

NODE TYPE HOW TO REQUEST MEMORY ALLOCATED CHARGED FOR
Huge memory node nodes=1:ppn=48 1510GB 48 cores

 Pitzer Cluster:

node type how to request MEMORY ALLOCATED CHARGED FOR
Huge memory node nodes=1:ppn=80,mem=3000gb 3019GB 80 cores

 


Compilation Guide

As a general recommendation, we suggest selecting the newest compilers available for a new project. For repeatability, you may not want to change compilers in the middle of an experiment.

Pitzer Compilers

The Skylake processors that make up Pitzer support the AVX512 instruction set, but you must set the correct compiler flags to take advantage of it. AVX512 has the potential to speed up your code by a factor of 8 or more, depending on the compiler and options you would otherwise use.

With the Intel compilers, use -xHost and -O2 or higher. With the gnu compilers, use -march=native and -O3. The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Pitzer. The executables will not be portable.  Of course, any highly optimized builds, such as those employing the options above, should be thoroughly validated for correctness.

Intel (recommended)

  NON-MPI MPI
FORTRAN 90 ifort mpif90
C icc mpicc
C++ icpc mpicxx

Recommended Optimization Options

The -O2 -xHost options are recommended with the Intel compilers. (For more options, see the "man" pages for the compilers.)

OpenMP

Add this flag to any of the above:  -qopenmp  
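
As a hedged example (mycode.c is a placeholder source file), typical Intel compile lines on Pitzer might look like:

# Serial or OpenMP code with recommended optimization flags
icc -O2 -xHost -qopenmp mycode.c -o mycode

# MPI code using the compiler wrapper
mpicc -O2 -xHost mycode.c -o mycode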

PGI

  NON-MPI MPI
FORTRAN 90 pgfortran   or   pgf90 mpif90
C pgcc mpicc
C++ pgc++ mpicxx

Recommended Optimization Options

The   -fast  option is appropriate with all PGI compilers. (For more options, see the "man" pages for the compilers)

Note: The PGI compilers can generate code for accelerators such as GPUs. Description of these capabilities is beyond the scope of this guide.

OpenMP

Add this flag to any of the above:  -mp

GNU

  NON-MPI MPI
FORTRAN 90 gfortran mpif90
C gcc mpicc
C++ g++ mpicxx

Recommended Optimization Options

The  -O2 -march=native  options are recommended with the GNU compilers. (For more options, see the "man" pages for the compilers)

OpenMP

Add this flag to any of the above:  -fopenmp

 

Owens Compilers

The Haswell and Broadwell processors that make up Owens support the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. AVX2 has the potential to speed up your code by a factor of 4 or more, depending on the compiler and options you would otherwise use.

With the Intel compilers, use -xHost and -O2 or higher. With the gnu compilers, use -march=native and -O3. The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Owens. The executables will not be portable. Of course, any highly optimized builds, such as those employing the options above, should be thoroughly validated for correctness.

Intel (recommended)

  NON-MPI MPI
FORTRAN 90 ifort mpif90
C icc mpicc
C++ icpc mpicxx

Recommended Optimization Options

The -O2 -xHost options are recommended with the Intel compilers. (For more options, see the "man" pages for the compilers.)

OpenMP

Add this flag to any of the above:  -qopenmp  or  -openmp

PGI

  NON-MPI MPI
FORTRAN 90 pgfortran   or   pgf90 mpif90
C pgcc mpicc
C++ pgc++ mpicxx

Recommended Optimization Options

The   -fast  option is appropriate with all PGI compilers. (For more options, see the "man" pages for the compilers)

Note: The PGI compilers can generate code for accelerators such as GPUs. Description of these capabilities is beyond the scope of this guide.

OpenMP

Add this flag to any of the above:  -mp

GNU

  NON-MPI MPI
FORTRAN 90 gfortran mpif90
C gcc mpicc
C++ g++ mpicxx

Recommended Optimization Options

The  -O2 -march=native  options are recommended with the GNU compilers. (For more options, see the "man" pages for the compilers)

OpenMP

Add this flag to any of the above:  -fopenmp

 

Ruby Compilers

Intel (recommended)

  NON-MPI MPI
FORTRAN 90 ifort mpif90
C icc mpicc
C++ icpc mpicxx

Recommended Optimization Options

The -O2 -xHost options are recommended with the Intel compilers. (For more options, see the "man" pages for the compilers.)

OpenMP

Add this flag to any of the above: -qopenmp or -openmp

PGI

  NON-MPI MPI
FORTRAN 90 pgfortran  or  pgf90 mpif90
C pgcc mpicc
C++ pgc++ mpicxx
NOTE: The C++ compiler used to be pgCC, but newer versions of PGI do not support this name.

Recommended Optimization Options

The  -fast  option is appropriate with all PGI compilers. (For more options, see the "man" pages for the compilers)

Note: The PGI compilers can generate code for accelerators such as GPUs. Description of these capabilities is beyond the scope of this guide.

OpenMP

Add this flag to any of the above: -mp

GNU

  NON-MPI MPI
FORTRAN 90 gfortran mpif90
C gcc mpicc
C++ g++ mpicxx

Recommended Optimization Options

The -O2 -march=native  options are recommended with the GNU compilers. (For more options, see the "man" pages for the compilers)

OpenMP

Add this flag to any of the above: -fopenmp

 

Further Reading:

Intel Compiler Page

PGI Compiler Page

GNU Compiler Page


Firewall and Proxy Settings

Connections to OSC

In order for users to access OSC resources through the web, your firewall rules should allow connections to the following publicly-facing IP ranges. Otherwise, users may be blocked or denied access to our services.

  • 192.148.248.0/24
  • 192.148.247.0/24
  • 192.157.5.0/25

The following TCP ports should be opened:

  • 80 (HTTP)
  • 443 (HTTPS)
  • 22 (SSH)

The following domain should be allowed:

  • *.osc.edu

Users may follow the instructions under "Test your configuration" below to ensure that their systems are not blocked from accessing our services. Users who are still unsure whether their network is blocking these hosts or ports should contact their local IT administrator.

Test your configuration

[Windows] Test your connection using PuTTY

  1. Open the PuTTY application.
  2. Enter IP address listed in "Connections to OSC" in the "Host Name" field.
  3. Enter 22 in the "Port" field.
  4. Click the 'Telnet' radio button under "Connection Type".
  5. Click "Open" to test the connection.
  6. Confirm the response. If the connection is successful, you will see a message that says "SSH-2.0-OpenSSH_5.3", as shown below. If you receive a PuTTY error, consult your system administrator for network access troubleshooting.


[OSX/Linux] Test your configuration using telnet

  1. Open a terminal.
  2. Type telnet IPaddress 22 (Here, IPaddress is IP address listed in "Connections to OSC").
  3. Confirm the connection. 

Connections from OSC

All outbound network traffic from all of OSC's compute nodes is routed through a network address translation host (NAT) or one of two backup servers:

  • nat.osc.edu (192.157.5.13)
  • 192.148.248.35
  • 192.148.248.186

IT and Network Administrators

Please use the above information to assist users in accessing our resources.

Occasionally new services may be stood up using hosts and ports not described here. If you believe our list needs correcting, please let us know at oschelp@osc.edu.


Messages from qsub

We have been adding some output from qsub that should aid you in creating better job scripts. We've documented the various messages here.

NOTE

A "NOTE" message is informational; your job has been submitted, but qsub made some assumptions about your job that you may not have intended.

No account/project specified

Your job did not specify a project to charge against, but qsub was able to select one for you. Typically, this will be because your username can only charge against one project, but it may be because you specified a preference by setting the OSC_DEFAULT_ACCOUNT environment variable. The output should indicate which project was assumed to be the correct one; if it was not correct, you should delete the job and resubmit after setting the correct project in the job script using the -A flag. For example:

#PBS -A PZS0530

Replace PZS0530 with the correct project code. Explicitly setting the -A flag will cause this informational message to not appear.

No memory limit set

Your job did not specify an explicit memory limit. Since we limit access to memory based on the number of cores set, qsub set this limit on your behalf, and will have mentioned in the message what the memory limit was set to.

You can suppress this informational message by explicitly setting the memory limit. For example:

#PBS -l mem=4gb

Please remember that the memory to core ratios are different on each cluster we operate. Please review the main documentation page for the cluster you are using for more information.

ERROR

A "ERROR" message indicates that your job was not submitted to the queue. Typically, this is because qsub is unsure of how to resolve an ambiguous setting in your job parameters. You will need to fix the problem in your job script, and resubmit.

You have not specified an account and have more than one available

Your username has the ability to charge jobs to more than one project, and qsub is unable to determine which one this job should be charged against. You can fix this by specifying the project using the -A flag. For example, you should add this line to your job script:

#PBS -A PZS0530

If you get this error, qsub will inform you of which projects you can charge against. Select the appropriate project, and replace "PZS0530" in the example above with the correct code.

You have the ability to tell qsub which project should be charged if no charge code is specified in the job script by setting the OSC_DEFAULT_ACCOUNT environment variable. For example, if you use the "bash" shell, you could put the line export OSC_DEFAULT_ACCOUNT=PZS0530, again replacing PZS0530 with the correct project code.

You have whitespace in a job name

Whitespace is not supported in TORQUE job names. Your job will be rejected with an error message if you submit a job with a space in the job name:

[xwang@owens-login02 torque_test]$ qsub job.txt 
ERROR:  Your job has not been submitted for the following reason:

        Script file "new" does not exist, requested job name has whitespace in it,
        or qsub command line is malformed.
qsub: Your job has been administratively rejected by the queueing system.
qsub: There may be a more detailed explanation prior to this notice.

You can fix this by removing whitespace in the job name.


Out-of-Memory (OOM) or Excessive Memory Usage

Problem description

A common problem on our systems is for a user job to run a node out of memory or to use more than its allocated share of memory if the node is shared with other jobs.

If a job exhausts both the physical memory and the swap space on a node, it causes the node to crash. With a parallel job, there may be many nodes that crash. When a node crashes, the systems staff has to manually reboot and clean up the node. If other jobs were running on the same node, the users have to be notified that their jobs failed.

If your job requests less than a full node, for example, -l nodes=1:ppn=1, it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested. For example, if a system has 4GB per core and you request one core, it is your responsibility to make sure your job uses no more than 4GB. Otherwise your job will interfere with the execution of other jobs.

The memory limit you set in PBS does not work the way one might expect it to. The only thing the -l mem=xxx flag is good for is requesting a large-memory node. It does not cause your job to be allocated the requested amount of memory, nor does it limit your job’s memory usage.
Note that even if your job isn’t causing problems, swapping is extremely inefficient. Your job will run orders of magnitude slower than it would with effective memory management.

Background

Each node has a fixed amount of physical memory and a fixed amount of disk space designated as swap space. If your program and data don’t fit in physical memory, the virtual memory system writes pages from physical memory to disk as necessary and reads in the pages it needs. This is called swapping. If you use up all the memory and all the swap space, the node crashes with an out-of-memory error.

This explanation really applies to the total memory usage of all programs running on the system. If someone else’s program is using too much memory, it may be pages from your program that get swapped out, and vice versa. This is the reason we aggressively terminate programs using more than their share of memory when there are other jobs on the node.

In the world of high performance computing, swapping is almost always undesirable. If your program does a lot of swapping, it will spend most of its time doing disk I/O and won’t get much computation done. You should consider the suggestions below.

You can find the amount of memory on our systems by following the links on our Supercomputers page. You can see the memory and swap values for a node by running the Linux command free on the node. As shown below, a standard node on Pitzer has 187GB physical memory and 47GB swap space.

[p0123]$ free -mh
              total        used        free      shared  buff/cache   available
Mem:           187G        8.9G        176G        626M        2.4G        176G
Swap:           47G          0B         47G​

Suggested solutions

Here are some suggestions for fixing jobs that use too much memory. Feel free to contact OSC Help for assistance with any of these options.

Some of these remedies involve requesting more processors (cores) for your job. As a general rule, we require you to request a number of processors proportional to the amount of memory you require. You need to think in terms of using some fraction of a node rather than treating processors and memory separately. If some of the processors remain idle, that’s not a problem. Memory is just as valuable a resource as processors.

Request whole node or more processors

Jobs requesting less than a whole node are those that have nodes=1 with ppn<40 on Pitzer, for example nodes=1:ppn=1. These jobs can be problematic for two reasons. First, they are entitled to use an amount of memory proportional to the ppn value requested; if they use more they interfere with other jobs. Second, if they cause a node to crash, it typically affects multiple jobs and multiple users.

If you’re sure about your memory usage, it’s fine to request just the number of processors you need, as long as it’s enough to cover the amount of memory you need. If you’re not sure, play it safe and request all the processors on the node.

Standard Pitzer nodes have 4761MB per core.

Reduce memory usage

Consider whether your job’s memory usage is reasonable in light of the work it’s doing. The code itself typically doesn’t require much memory, so you need to look mostly at the data size.

If you’re developing the code yourself, look for memory leaks. In MATLAB look for large arrays that can be cleared.

An out-of-core algorithm will typically use disk more efficiently than an in-memory algorithm that relies on swapping. Some third-party software gives you a choice of algorithms or allows you to set a limit on the memory the algorithm will use.

Use more nodes for a parallel job

If you have a parallel job you can get more total memory by requesting more nodes. Depending on the characteristics of your code you may also need to run fewer processes per node.

Here’s an example. Suppose your job on Pitzer includes the following lines:

#PBS -l nodes=2:ppn=40
…
mpiexec mycode

This job uses 2 nodes, so it has 2*183=366GB total memory available to it. The mpiexec command by default runs one process per core, which in this case is 2*40=80 copies of mycode.

If this job uses too much memory you can spread those 80 processes over more nodes. The following lines request 4 nodes, giving you a total of 4*183=732GB total memory. The -ppn 20 option on the mpiexec command says to run 20 processes per node instead of 40, for a total of 80 as before.

#PBS -l nodes=4:ppn=40
…
mpiexec -ppn 20 mycode

Since parallel jobs are always assigned whole nodes, the following lines will also run 20 processes per node on 4 nodes.

#PBS -l nodes=4:ppn=20
…
mpiexec mycode

Request large-memory nodes

Pitzer has four huge memory nodes with 3TB of memory and with 80 cores. Owens has sixteen huge memory nodes with 1.5 TB of memory and with 48 cores.

Since there are so few of these nodes, compared to hundreds of standard nodes, jobs requesting them will often have a long wait in the queue. The wait will be worthwhile, though, if these nodes solve your memory problem.

To use the huge memory nodes on Pitzer, request the whole node (ppn=80). 

Example:

#PBS -l nodes=1:ppn=80
…

To use the huge-memory node on Owens you request the whole node (ppn=48) as well.

#PBS -l nodes=1:ppn=48
…

Put a virtual memory limit on your job

The sections above are intended to help you get your job running correctly. This section is about forcing your job to fail gracefully if it consumes too much memory. If your memory usage is unpredictable, it is preferable to terminate the job when it exceeds a memory usage limit rather than allow it to crowd other jobs or crash a node.

The memory limit enforced by PBS is ineffective because it only limits physical memory usage (resident set size or RSS). When your job reaches its memory limit it simply starts using virtual memory, or swap. PBS allows you to put a limit on virtual memory, but that has problems also.

We will use Linux terminology. Each process has several virtual memory values associated with it. VmSize is virtual memory size; VmRSS is resident set size, or physical memory used; VmSwap is swap space used. The number we care about is the total memory used by the process, which is VmRSS + VmSwap. What PBS allows a job to limit is VmRSS (using -l mem=xxx) or VmSize (using -l vmem=xxx).

The relationship among VmSize, VmRSS, and VmSwap is:  VmSize >= VmRSS+VmSwap. For many programs this bound is fairly tight; for others VmSize can be much larger than the memory actually used.

If the bound is reasonably tight, -l vmem=4gb provides an effective mechanism for limiting memory usage to 4gb (for example). If the bound is not tight, VmSize may prevent the program from starting even if VmRSS+VmSwap would have been perfectly reasonable. Java and some FORTRAN 77 programs in particular have this problem.

The vmem limit in PBS is for the entire job, not just one node, so it isn’t useful with parallel (multinode) jobs. PBS also has a per-process virtual memory limit, pvmem. This limit is trickier to use, but it can be useful in some cases.

Here are suggestions for some specific cases.

Serial (single-node) job using program written in C/C++

This case applies to programs written in any language if VmSize is not much larger than VmRSS+VmSwap. If your program doesn’t use any swap space, this means that vmem as reported by qstat -f or the ja command (see below) is not much larger than mem as reported by the same tools.

Set the vmem limit equal to, or slightly larger than, the number of processors requested (ppn) times the memory available per processor. Example for Pitzer:

#PBS -l nodes=1:ppn=1
#PBS -l vmem=4761mb

Parallel (multinode) job using program written in C/C++

This suggestion applies if your processes use approximately equal amounts of memory. See also the comments about other languages under the previous case.

Set the pvmem limit equal to, or slightly larger than, the amount of physical memory on the node divided by the number of processes per node. Example for Pitzer, running 40 processes per node:

#PBS -l nodes=2:ppn=40
#PBS -l pvmem=4761mb
…
mpiexec mycode

Serial (single-node) job using program written in Java

We have only lightly tested this suggestion so far, so please provide feedback to oschelp@osc.edu.

Start Java with a virtual memory limit equal to, or slightly larger than, the number of processors requested (ppn) times the memory available per processor. Example for Pitzer:

#PBS -l nodes=1:ppn=1
#PBS -l vmem=4761mb
…
java -Xms4761m -Xmx4761m MyJavaCode

Other situations

If you have other situations that aren’t covered here, please share them. Contact oschelp@osc.edu.

How to monitor your memory usage

qstat -f

While your job is running the command qstat -f jobid will tell you the peak physical and virtual memory usage of the job so far. For a parallel job, these numbers are the aggregate usage across all nodes of the job. The values reported by qstat may lag the true values by a couple of minutes.
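
For example, you can pull just the memory fields out of the qstat output; a minimal sketch (the job ID is hypothetical):

qstat -f 123456 | grep -E 'resources_used.(mem|vmem)'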

free

For parallel (multinode) jobs you can check your per-node memory usage while your job is running by using pdsh -j jobid free -m.
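
A minimal sketch, run from a login node while the job is active (the job ID is hypothetical):

pdsh -j 123456 free -m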

ja

You can put the command ja (job accounting) at the end of your batch script to capture the resource usage reported by qstat -f. The information will be written to your job output log, job_name.o123456.
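
A minimal sketch of a batch script that ends with ja (the program name mycode is just a placeholder):

#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
cd $PBS_O_WORKDIR
./mycode
# ja writes the job's resource usage into the output log when the script ends
ja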

OnDemand

You can also view node status graphically in the OSC OnDemand Portal (ondemand.osc.edu). Under "Jobs" select "Active Jobs". Click on "Job Status" and scroll down to see memory usage. This shows the total memory usage for the node; if your job is not the only one running there, it may be hard to interpret.

Below is a typical graph for jobs using too much memory. It shows two jobs that ran back-to-back on the same node. The first peak is a job that used all the available physical memory (blue) and a large amount of swap (purple). It completed successfully without crashing the node. The second job followed the same pattern but actually crashed the node.

Notes

If it appears that your job is close to crashing a node, we may preemptively delete the job.

If your job is interfering with other jobs by using more memory than it should be, we may delete the job.

In extreme cases OSC staff may restrict your ability to submit jobs. If you crash a large number of nodes or continue to submit problem jobs after we have notified you of the situation, this may be the only way to protect the system and our other users. If this happens, we will restore your privileges as soon as you demonstrate that you have resolved the problem.

For details on retrieving files from unexpectedly terminated jobs see this FAQ.

For assistance

OSC has staff available to help you resolve your memory issues. See our Support Services page for contact information.

System Email

Occasionally, jobs that experience problems may generate emails from staff or automated systems at the center with some information about the nature of the problem. These pages provide additional information about the various emails sent, and steps that can be taken to address the problem.

Batch job aborted

Purpose

Notify you when your job terminates abnormally.

Sample subject line

PBS JOB 944666.pitzer-batch.osc.edu

Apparent sender

  • root <adm@pitzer-batch.osc.edu> (Pitzer)
  • root <adm@owens-batch.osc.edu> (Owens)

Sample contents

PBS Job Id: 93561.pitzer-batch.osc.edu
Job Name:   mailtest.job
Exec host:  p0178/1
Aborted by PBS Server
Job exceeded some resource limit (walltime, mem, etc.). The job was aborted See Administrator for help

Sent under these circumstances

These are fully automated emails sent by the batch system.

Some reasons a job might terminate abnormally:

  • The job exceeded its allotted walltime, memory, virtual memory, or other limited resources. More information is available in your job log file, e.g., jobname.o123456.
  • An unexpected system problem caused your job to fail.

To turn off the emails

There is no way to turn them off at this time.

To prevent these problems

For advice on monitoring and controlling resource usage, see Monitoring and Managing Your Job.

There’s not much you can do about system failures, which fortunately are rare.

Notes

Under some circumstances, you can retrieve your job output log if your job aborts due to a system failure. Contact oschelp@osc.edu for assistance.

For assistance

Contact OSC Help. See our Support Services page for more contact information.


Batch job begin or end

Purpose

Notify you when your job begins or ends.

Sample subject line

PBS JOB 944666.pitzer-batch.osc.edu

Apparent sender

  • root <adm@pitzer-batch.osc.edu> (Pitzer)
  • root <adm@owens-batch.osc.edu> (Owens)

Sample contents

PBS Job Id: 944666.pitzer-batch.osc.edu
Job Name:   mailtest.job
Exec host:  p0178/1
Begun execution
 
PBS Job Id: 944666.pitzer-batch.osc.edu
Job Name:   mailtest.job
Exec host:  n0587/1
Execution terminated
Exit_status=0
resources_used.cput=00:00:00
resources_used.mem=2228kb
resources_used.vmem=211324kb
resources_used.walltime=00:01:00

Sent under these circumstances

These are fully automated emails sent by the batch system. You control them through the headers in your job script. The following line requests emails at the beginning, ending, and abnormal termination of your job.

#PBS -m abe
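
If you also want these messages delivered to a particular address, PBS accepts a -M directive as well; the address below is only a placeholder:

#PBS -m abe
#PBS -M your_email@example.edu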

To turn off the emails

Remove the -m option from your script and/or command line or use -m n. See PBS Directives Summary.

Notes

You can add the following command at the end of your script to have resource information written to your job output log:

ja

For more information

See PBS Directives Summary.

For assistance

Contact OSC Help. See our Support Services page for more contact information.


Batch job deleted by an administrator

Purpose

Notify you when your job is deleted by an administrator.

Sample subject line

PBS JOB 4717533.owens-batch.ten.osc.edu

Apparent sender

  • root adm@pitzer-batch.osc.edu (Pitzer)
  • root adm@owens-batch.osc.edu (Owens)

Sample contents

PBS Job Id: 4717533.owens-batch.ten.osc.edu
Job Name:   mailtest.job
job deleted
Job deleted at request of staff@owens-login01.hpc.osc.edu Job using too much memory. Contact oschelp@osc.edu.

Sent under these circumstances

These emails are sent automatically, but the administrator can add a note with the reason.

Some reasons a running job might be deleted:

  • The job is using so much memory that it threatens to crash the node it is running on.
  • The job is using more resources than it requested and is interfering with other jobs running on the same node.
  • The job is causing excessive load on some part of the system, typically a network file server.
  • The job is still running at the start of a scheduled downtime.

Some reasons a queued job might be deleted:

  • The job requests non-existent resources.
  • The job was intended for one system but was submitted on another.
  • The job can never run because it requests combinations of resources that are disallowed by policy.
  • The user’s credentials are blocked on the system the job was submitted on.

To turn off the emails

There is no way to turn them off at this time.

To prevent these problems

See the Supercomputing FAQ for suggestions on dealing with specific problems.

For assistance

We will work with you to get your jobs running within the constraints of the system. Contact OSC Help for assistance. See our Support Services page for more contact information.

Emails exceeded the expected volume

Purpose

Notify you that we have placed a hold on emails sent to you from the HPC system.

Sample subject line

Emails sent to email address student@buckeyemail.osu.edu in the last hour exceeded the expected volume

Apparent sender

OSC Help <OSCHelp@osc.edu>

Explanation

When a job fails or is deleted by an administrator, the system sends you an email. If this happens with a large number of jobs, it generates a volume of email that may be viewed as spam by your email provider. To avoid having OSC blacklisted, and to avoid overloading your email account, we hold your emails from OSC.

Please note that these held emails will eventually be deleted if you do not contact us.

Sent under these circumstances

These emails are sent automatically when email from OSC to your address is being deferred.

To turn off the emails

Turn off emails related to your batch jobs to reduce your overall email volume from OSC. See the -m option on the PBS Directives Summary page.
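
For example, the following directive in your job script turns off the begin/end/abort emails for that job:

#PBS -m n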

Notes

To re-enable email you must contact OSC Help.

For assistance

Contact OSC Help. See our Support Services page for more contact information.


Job failure due to a system hardware problem

Purpose

Notify you that one or more of your jobs was running on a compute node that crashed due to a hardware problem.

Sample subject line

Failure of job(s) 919137 due to a hardware problem at OSC

Apparent sender

OSC Help <OSCHelp@osc.edu>

Explanation

Your job failed through no fault of your own. You should resubmit the job.

Sent under these circumstances

These emails are sent by a systems administrator after a node crashes.

To turn off the emails

We don’t have a mechanism to turn off these emails. If they really bother you, contact OSC Help and we’ll try to accommodate you.

To prevent these problems

Hardware crashes are quite rare and in most cases there’s nothing you can do to prevent them.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Job failure due to a system software problem

Purpose

Notify you that one or more of your jobs was running on a compute node that crashed due to a system software problem.

Sample subject line

Failure of job(s) 919137 due to a system software problem at OSC

Apparent sender

OSC Help <OSCHelp@osc.edu>

Explanation

Your job failed through no fault of your own. You should resubmit the job. Usually the problems are caused by another job running on the node.

Sent under these circumstances

These emails are sent by a systems administrator as part of the node cleanup process.

To turn off the emails

We don’t have a mechanism to turn off these emails. If they really bother you, contact OSC Help and we’ll try to accommodate you.

To prevent these problems

If you request a whole node your jobs will be less susceptible to problems caused by other jobs. Other than that, be assured that we work hard to keep jobs from interfering with each other.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Job failure due to exhaustion of physical memory

Purpose

Notify you that one or more of your jobs caused compute nodes to crash with an out-of-memory error.

Sample subject line

Failure of job(s) 933014,933174 at OSC due to exhaustion of physical memory

Apparent sender

OSC Help <oschelp@osc.edu>

Explanation

Your job(s) exhausted both physical memory and swap space during job execution. This failure caused the compute node(s) used by the job(s) to crash, requiring a reboot.

Sent under these circumstances

These emails are sent by a systems administrator as part of the node cleanup process.

To turn off the emails

You cannot turn off these emails. Please don’t ignore them because they report a problem that you must correct.

To prevent these problems

See the Knowledge Base article "Out-of-Memory (OOM) or Excessive Memory Usage" for suggestions on dealing with out-of-memory problems.

For information on the memory available on the various systems, see our Supercomputing page.

Notes

If you continue to submit jobs that cause these problems, your HPC account may be blocked.

For assistance

We will work with you to get your jobs running within the constraints of the system. Contact OSC Help for assistance. See our Support Services page for more contact information.

XDMoD Tool

XDMoD Overview

XDMoD, which stands for XD Metrics on Demand, is an NSF-funded open source tool that provides a wide range of metrics pertaining to resource utilization and performance of high-performance computing (HPC) resources, and the impact these resources have in terms of scholarship and research.

How to log in

Visit OSC's XDMoD (xdmod.osc.edu) and click 'Sign In' in the upper left corner of the page.


A login window will appear. Click the 'Login here.' button under 'Sign in with Ohio SuperComputer Center:'.
 
This redirects to a login page where one can use their OSC credentials to sign in.

XDMoD Tabs

When you first log in you will be directed to the Summary tab. The different XDMoD tabs are located near the top of the page. You can change tabs simply by clicking on the one you would like to view. By default, you will see the data from the previous month, but you can change the start and end dates and then click 'Refresh' to update the timeframe being reported.


Summary:

The Summary tab consists of a duration selector toolbar, a summary information bar, and a selected set of charts representative of usage. It provides a dashboard that presents summary statistics and selected charts useful to the role of the current user. More information can be found at the XDMoD User Manual

Usage:

The Usage tab consists of a chart selection tree on the left and a chart viewer on the right side of the page. It provides a convenient way to browse all the realms present in XDMoD. More information can be found at the XDMoD User Manual

Metric Explorer:

The Metric Explorer allows you to create complex plots containing multiple metrics. It has many point-and-click features that allow you to easily add, filter, and modify the data and the format in which it is presented. More information can be found at the XDMoD User Manual

App Kernels:

The Application Kernels tab consists of three sub-tabs, each designed to make viewing application kernels simple and intuitive: the Application Kernels Viewer, the Application Kernels Explorer, and Reports. More information can be found at the XDMoD User Manual

Report Generator:

This tab will allow you to manage reports. The left region provides a listing of any reports you have created. The right region displays any charts you have chosen to make available for building a report. More information can be found at the XDMoD User Manual

Job Viewer:

The Job Viewer tab displays information about individual HPC jobs and includes a search interface that allows jobs to be selected based on a wide range of filters. This tab also contains the SUPReMM module. More information on the SUPReMM module can be found below in this documentation. More information can be found at the XDMoD User Manual

About:

This tab will display information about XDMoD.

Different Roles

XDMoD uses roles to restrict access to data and to elements of the user interface such as tabs. OSC clients hold the 'User' role by default after logging into OSC XDMoD with their OSC credentials. With the 'User' role, users can view all data related to their personal utilization, view information regarding their allocations, view quality-of-service data via the Application Kernel Explorer, and generate custom reports. We also support the 'Principal Investigator' role, which has access to all data available to a user as well as detailed information for any users included on their allocations or projects.


Job Viewer

The Job Viewer Tab displays information about individual HPC jobs and includes a search interface that allows jobs to be selected based on a wide range of filters:

1. Click on the Job Viewer tab near the top of the page.

2. Click Search in the top left-hand corner of the page.

3. If you know the Resource and Job Number, use the quick search lookup form described in step 4a. If you would like more options, use the advanced search described in step 4b.

4a. For a quick job lookup, select the resource, enter the job number, and click 'Search'.

4b. Within the Advanced Search form, select a timeframe and add one or more filters. Click to run the search on the server.

5. Select one or more jobs. Provide a 'Search Name' and click 'Save Results' at the bottom of the window to view data about the selected jobs.

6. To view data in more detail for a selected job, go to the Search History, click on the tree, and select a job.

7. More information can be found in the 'Job Viewer' section of the XDMoD User Manual.


XDMoD - Checking Job Efficiency

Intro

XDMoD can be used to look at the performance of past jobs. This tutorial explains how to retrieve this job performance data and how to use it to best utilize OSC resources.

First, log into XDMoD.

See the XDMoD Tool webpage for details about XDMoD and how to log in.

You will be sent to the Summary tab in XDMoD.

Click on the Metric Explorer tab, then navigate to the Metric Catalog and click SUPREMM to show the various metric options; then click the "Avg CPU %: User: weighted by core hour" metric.

A drop-down menu will appear for grouping the data to be viewed. Group by "CPU User Value":

This will produce a time-series chart showing the average 'CPU User % weighted by core hours, over all jobs that were executing', separated into groups of 10 by 'CPU User value'.


One can change the time period by adjusting the preset duration value or by entering dates in the "start" and "end" boxes, either by selecting them from the calendar or by typing them in the format 'yyyy-mm-dd'. Once the desired time period is entered, the "Refresh" button will be highlighted yellow; click it to reload the chart with data for that time period.


Once the data is loaded, click on one of the data points, then navigate to "Drilldown" and select "Job Wall Time". This will group the job data by the amount of wall time used.


Generally, the lower the CPU User Value, the less efficient the job was. This chart can now be used to drill into detailed information on specific jobs. Click one of the points again and select "Show raw data".


This will bring up a list of jobs included in that data point. Click one of the jobs shown.


After loading, this brings up the "Job Viewer" tab, which shows details about the selected job.


It is worth explaining some of the values that are immediately visible, such as "CPU User", "CPU User Balance", and "Memory Headroom".

The "CPU User" section gives a ratio of the CPU time used by the job to the time the job was executing; think of it as how much "work" the CPUs were doing during execution.


The "CPU User Balance" section measures how evenly the "work" was spread across all the CPUs allocated to this job while it was executing. (Work here means how well the CPUs were utilized; it is preferred that the CPUs be close to fully utilized during job execution.)


Finally, "Memory Headroom" gives a measure of the amount of memory used by the job. It can be difficult to judge what a good value is here. Generally, it is recommended not to request a specific amount of memory unless the job requires it. When making memory requests, it can be beneficial to investigate the amount of memory actually used by the job and plan accordingly. A value closer to 0 means the job used most of the memory allocated to it, and a value closer to 1 means the job used less memory than it was allocated.


This information is useful for better utilizing OSC resources by providing better estimates of the resources that jobs require.