Knowledge Base

This knowledge base is a collection of important, useful information about OSC systems that does not fit into a guide or tutorial, and is too long to be answered in a simple FAQ.

Account Consolidation Guide

Account consolidation will take place during the July 17th downtime. Jobs submitted by non-preferred accounts that do not run prior to the downtime will be deleted, as the jobs will fail after the downtime. Please be aware of this, and be prepared to resubmit your held jobs from your consolidated account with the correct charge code.
Please contact OSC Help if you need further information. 

Single Account / Multiple Projects

If you work with several research groups, you have a separate account for each group. This means multiple home directories, multiple passwords, etc. Over the years there have been requests for a single login system. We're now putting that in place.

How will this affect you?

If you have only one account, you'll see no changes. But if you work with multiple groups in the future, you'll need to be aware of how this works.

  • All users with multiple accounts have been granted a preferred username that they will use to log into our systems. These were communicated to impacted clients on July 11.
  • It will be very important to use the correct project code for batch job charging.
  • Managing the sharing of files between your projects (groups) will be a little more complicated.
  • In most cases, you will only need to fill out software license agreements once.

The single username 

We requested those with multiple accounts to choose a preferred username. If one was not selected by the user, we selected one for them. 

The preferred username will be your only active account; you will not be able to log in or submit jobs with the other accounts. 

Checking the groups of a username

To check all groups of a username (USERID), use the command:

id -Gn USERID

The first one from the output is your primary group, which is the project code (PROJECTID) this username (USERID) was created under.

To check all project codes your user account is under, use the command

id -Gn USERID | awk -F '[ ]' '{for (i=1;i<=NF;i++){if ($i ~/^P/) {print $i}}}'
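
For example, with a hypothetical username (usr1234), the project codes used elsewhere on this page, and clusterusers standing in for any non-project UNIX group, the output might look like:

$ id -Gn usr1234
PQR1234 clusterusers PQR5678 PAA1111
$ id -Gn usr1234 | awk -F '[ ]' '{for (i=1;i<=NF;i++){if ($i ~/^P/) {print $i}}}'
PQR1234
PQR5678
PAA1111

Here PQR1234 is the primary group (the project the account was created under), and the awk filter lists only the project codes.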

Changing the primary group for a login session

You can change the primary group of your username (USERID) to any UNIX group (GROUP) that username (USERID) belongs to during the login session using the command:

newgrp GROUP

This change is only valid during this login session. If you log out and log back in, your primary group is changed back to the default one.
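
For example, a hypothetical session switching the primary group to project PQR5678 for the remainder of the login session:

$ id -gn
PQR1234
$ newgrp PQR5678
$ id -gn
PQR5678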

Check previous user accounts

There is no available tool to check all of your previous active accounts. We sent an email to each impacted user providing the information on your preferred username and previous accounts. Please refer to that email (sent on July 11, subject "Multiple OSC Accounts - Your Single Username").

Batch job

How to specify the charging project

If you work with an annual allocation project, you're already familiar with the '#PBS -A' option for charging against a specific project.  

It will be very important that you make sure each batch job is charged against the correct research project code, including annual allocations. While the system can ensure you don't inadvertently charge a project you don't belong to, we can't make sure that you're charging the correct one of your projects.

Note:
At times, PIs would ask us to check that group members were correctly charging an annual allocation. We could do that by comparing the primary group with the charged code. With a single username belonging to multiple groups (project codes), we will no longer be able to do this kind of check.

Batch limits policy

The job limit per user remains the same. That is to say, though your jobs are charged against different project codes, the total number of jobs and cores your user account can use on each system is still restricted by the previous user-based limit. Therefore, consolidating multiple user accounts into one preferred user account may affect the work of some users.

Please check our batch limit policy on each system for more details.

Data Management

Managing multiple home directories

Data from your non-preferred accounts will remain in those home directories; the ownership of the files will be updated to your preferred username, the newly consolidated account. You can access your other home directories using the command cd /absolute/path/to/directory

You will need to consolidate all files under your preferred username as soon as possible because we plan to purge that data in the future. Please contact OSC Help if you need information about your other home directories in order to access the files.
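
For example, a minimal sketch of copying data from one of your old home directories into your current one (both the path and the directory name below are placeholders; contact OSC Help for the actual locations):

cd /absolute/path/to/old_home_directory
cp -a project_data ~/project_data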

Previous files associated with your other usernames

  • Files associated with your non-preferred accounts will have their ownership changed to your preferred username. 
  • These files won't count against your home directory file quota. 
  • There will be no change to files and quotas on the project and scratch file systems.

Change group of a file

Log in with your preferred username (P_USERID) and create a new file; its owner and group will be your preferred username (P_USERID) and primary project code (P_PROJECTID). Then change the group of the newly created file (FILE) using the command:

chgrp PROJECTID FILE
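
For example, using the hypothetical username and project codes from elsewhere on this page:

$ touch results.dat
$ ls -l results.dat
-rw-r--r-- 1 usr1234 PQR1234 0 Jul 17 10:00 results.dat
$ chgrp PQR5678 results.dat
$ ls -l results.dat
-rw-r--r-- 1 usr1234 PQR5678 0 Jul 17 10:00 results.dat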

Managing file sharing in a batch job

In the Linux file system, every file has an owner and a group. By default, the group (project code) assigned to a file is the primary group of the user who creates it. This means that even if you change the charged account for a batch job, any files created will still be associated with your primary group.

To change the group for new files you will need to include:

-W group_list=[group-id]

at the beginning of your batch job. If you are a member of research groups PQR1234 (primary) and PQR5678 then batch jobs run for PQR5678 will typically have:

#PBS -A PQR5678
#PBS -W group_list=PQR5678

It is important to remember that groups are used in two different ways: for resource use charging and file permissions. In the simplest case, if you are a member of only one research group/project, you won't need either option above. If you are in multiple research groups and/or have annual allocations, you may need something like:

#PBS -A PAA1111
#PBS -W group_list=PQR5678
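
Putting the pieces together, a minimal sketch of a job-script header for someone whose primary group is PQR1234, who wants output files owned by group PQR5678, and who charges the annual allocation PAA1111 (all codes are the hypothetical examples used above; the job name, node count, and walltime are illustrative values):

#PBS -N example_job
#PBS -l nodes=1:ppn=12
#PBS -l walltime=1:00:00
#PBS -A PAA1111
#PBS -W group_list=PQR5678

cd $PBS_O_WORKDIR
./mycode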

OnDemand users

If you use the OnDemand Files app to upload files to the OSC filesystem, the group ownership of uploaded files will be your primary group.

Software licenses

  • We will merge all your current agreements if you have multiple accounts.  
  • In many cases, you will only need to fill out software license agreements once.
  • Some vendors may require you to sign an updated agreement.  
  • Some vendors may also require the PI of each of your research groups/project codes to sign an agreement.

Changes of Default Memory Limits

Problem Description

Our current GPFS file system is a distributed system with significant interaction between clients. Because the compute nodes are GPFS file system clients, a certain amount of memory on each node must be reserved for these interactions. As a result, the maximum physical memory on each node that users' jobs may use has been reduced in order to keep the file system performing well. In addition, using swap memory is no longer allowed.

The tables below summarize the maximum physical memory allowed for each type of node on our systems:

Oakley Cluster

Node type          Physical memory per node   Maximum memory allowed per node
Regular node       48GB                       45GB
Big memory node    192GB                      187GB
Huge memory node   1024GB (1TB)               1008GB

Ruby Cluster

Node type          Physical memory per node   Maximum memory allowed per node
Regular node       64GB                       61GB
Debug node         128GB                      124GB
Huge memory node   1024GB (1TB)               1008GB

Owens Cluster

Node type          Physical memory per node   Maximum memory allowed per node
Regular node       128GB                      124GB
Huge memory node   1536GB                     1510GB

Solutions When You Need Regular Nodes

Starting from October 27, 2016, we'll implement a new scheduling policy on all of our clusters, reflecting the reduced default memory limits. 

If you do not request memory explicitly in your job (no -l mem option)

Your job can be submitted and scheduled as before, and resources will be allocated according to your request of cores/nodes (nodes=XX:ppn=XX). If you request a partial node, the memory allocated to your job is proportional to the number of cores requested (4GB/core on Oakley and Owens); if you request a whole node, the memory allocated to your job is reduced to the maximum summarized in the tables above. Some examples are provided below.

A request of partial node:

On Oakley, a request of nodes=1:ppn=1  will be allocated with 4GB memory, and charged for 1 core.  A request of  nodes=1:ppn=4  will be allocated with 16GB memory, and charged for 4 cores. A request of  nodes=1:ppn=11  will be allocated with 44GB memory, and charged for 11 cores. 

On Ruby, we always allocate whole nodes to jobs and charge for the whole node, with 61GB memory allocated to your job.  

On Owens, a request of  nodes=1:ppn=1   will be allocated with 4GB memory, and charged for 1 core. A request of  nodes=1:ppn=4  will be allocated with 16GB memory, and charged for 4 cores.

A request of the whole node:

A request of the whole regular node will be allocated with maximum memory allowed per node and charged for the whole node, as summarized below:

         Request          Memory allocated   Charged for
Oakley   nodes=1:ppn=12   45GB               12 cores
Ruby     nodes=1:ppn=20   61GB               20 cores
Owens    nodes=1:ppn=28   124GB              28 cores

A request of multiple nodes:

If you have a multi-node job (nodes>1), your job will be assigned whole nodes with the maximum memory allowed per node (45GB on Oakley, 61GB on Ruby, and 124GB on Owens) and charged for the whole nodes regardless of the ppn request.

If you do request memory explicitly in your job (with the -l mem option)

If you request memory explicitly in your script, please revisit your script in light of the following information.

A request of partial node:

On Oakley, a request of  nodes=1:ppn=1,mem=4gb   will be allocated with 4GB memory, and charged for 1 core; a request of  nodes=1:ppn=2,mem=8gb   will be allocated with 8GB memory, and charged for 2 cores; a request of  nodes=1:ppn=1,mem=40gb   will be allocated with 40GB memory, and charged for 10 cores.

On Owens, a request of  nodes=1:ppn=1,mem=4gb  will be allocated with 4GB memory, and charged for 1 core.

On Ruby, we always allocate whole nodes to jobs and charge for the whole node, with 61GB memory allocated to your job. 

 A request of the whole node:

On Oakley, the maximum value you can use for -l mem is 45gb, i.e. -l mem=45gb. A request of  nodes=1:ppn=12,mem=45gb  will be allocated with 45GB memory, and charged for the whole node. If you need more than 45GB memory for the job, please submit your job to the big/huge memory nodes on Oakley, or switch to the Owens cluster. Any request with mem>45gb may be rescheduled on a big memory node on Oakley or may not be scheduled at all, depending on the rest of the request.

On Ruby, the maximum value you can use for -l mem is 61gb, i.e. -l mem=61gb. A request of  nodes=1:ppn=20,mem=61gb  will be allocated with 61GB memory, and charged for the whole node. If you need more than 61GB memory for the job, please submit your job to the huge memory nodes on Ruby, or switch to the Owens cluster. Any request with mem>61gb will not be scheduled.

On Owens, the maximum value you can use for -l mem is 125gb, i.e. -l mem=125gb. A request of  nodes=1:ppn=28,mem=124gb  will be allocated with 124GB memory, and charged for the whole node. If you need more than 124GB memory for the job, please submit your job to the huge memory nodes. Any request with mem>=126gb will not be scheduled.

A request of multiple nodes:

If you have a multi-node job (nodes>1), your job will be assigned whole nodes with the maximum memory allowed per node (45GB on Oakley, 61GB on Ruby, and 124GB on Owens) and charged for the whole nodes.

Solutions When You Need Special Nodes

If you need any special resources, we highly recommend that you omit the memory request and follow the syntax below.

 Oakley Cluster:

Node type          How to request           Memory allocated   Charged for
Big memory node    nodes=XX:ppn=12:bigmem   187GB              12 cores
                   (XX can be 1-8)
Huge memory node   nodes=1:ppn=32           1008GB             32 cores
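
For example, a minimal sketch of the job-script directives requesting one Oakley big memory node (the walltime value is only an illustration):

#PBS -l nodes=1:ppn=12:bigmem
#PBS -l walltime=1:00:00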

Ruby Cluster:

Node type          How to request            Memory allocated   Charged for
Debug node         nodes=1:ppn=16 -q debug   124GB              16 cores
Huge memory node   nodes=1:ppn=32            1008GB             32 cores

Owens Cluster:

Node type          How to request   Memory allocated   Charged for
Huge memory node   nodes=1:ppn=48   1510GB             48 cores

 


Compilation Guide

As a general recommendation, we suggest selecting the newest compilers available for a new project. For repeatability, you may not want to change compilers in the middle of an experiment.

Pitzer Compilers

The Skylake processors that make up Pitzer support the AVX512 instruction set, but you must set the correct compiler flags to take advantage of it. AVX512 has the potential to speed up your code by a factor of 8 or more, depending on the compiler and options you would otherwise use.

With the Intel compilers, use -xHost and -O2 or higher. With the gnu compilers, use -march=native and -O3. The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Pitzer. The executables will not be portable.
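
For example, a hypothetical serial build on Pitzer using the recommended flags (mycode.c stands for your own source file):

# Intel
icc -O2 -xHost -o mycode mycode.c

# GNU
gcc -O3 -march=native -o mycode mycode.c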

Intel (recommended)

  NON-MPI MPI
FORTRAN 90 ifort mpif90
C icc mpicc
C++ icpc mpicxx

Recommended Optimization Options

The  -O2 -xHost  options are recommended with the Intel compilers. (For more options, see the "man" pages for the compilers.)

OpenMP

Add this flag to any of the above:  -qopenmp  
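
For example, a hypothetical OpenMP build and run with the Intel compiler (file name and thread count are illustrative):

icc -O2 -xHost -qopenmp -o mycode_omp mycode.c
export OMP_NUM_THREADS=4
./mycode_omp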

PGI

  NON-MPI MPI
FORTRAN 90 pgfortran   or   pgf90 mpif90
C pgcc mpicc
C++ pgc++ mpicxx

Recommended Optimization Options

The   -fast  option is appropriate with all PGI compilers. (For more options, see the "man" pages for the compilers)

Note: The PGI compilers can generate code for accelerators such as GPUs. Description of these capabilities is beyond the scope of this guide.

OpenMP

Add this flag to any of the above:  -mp

GNU

  NON-MPI MPI
FORTRAN 90 gfortran mpif90
C gcc mpicc
C++ g++ mpicxx

Recommended Optimization Options

The  -O2 -march=native  options are recommended with the GNU compilers. (For more options, see the "man" pages for the compilers)

OpenMP

Add this flag to any of the above:  -fopenmp

 

Owens Compilers

The Haswell and Broadwell processors that make up Owens support the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. AVX2 has the potential to speed up your code by a factor of 4 or more, depending on the compiler and options you would otherwise use.

With the Intel compilers, use -xHost and -O2 or higher. With the gnu compilers, use -march=native and -O3. The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Owens. The executables will not be portable.
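
For example, a hypothetical MPI build on Owens using the compiler wrappers with the recommended Intel flags (mympi.c and mympi.f90 are placeholder source files):

mpicc -O2 -xHost -o mympi mympi.c
mpif90 -O2 -xHost -o mympi_f90 mympi.f90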

Intel (recommended)

  NON-MPI MPI
FORTRAN 90 ifort mpif90
C icc mpicc
C++ icpc mpicxx

Recommended Optimization Options

The  -O2 -xHost  options are recommended with the Intel compilers. (For more options, see the "man" pages for the compilers.)

OpenMP

Add this flag to any of the above:  -qopenmp  or  -openmp

PGI

  NON-MPI MPI
FORTRAN 90 pgfortran   or   pgf90 mpif90
C pgcc mpicc
C++ pgc++ mpicxx

Recommended Optimization Options

The   -fast  option is appropriate with all PGI compilers. (For more options, see the "man" pages for the compilers)

Note: The PGI compilers can generate code for accelerators such as GPUs. Description of these capabilities is beyond the scope of this guide.

OpenMP

Add this flag to any of the above:  -mp

GNU

  NON-MPI MPI
FORTRAN 90 gfortran mpif90
C gcc mpicc
C++ g++ mpicxx

Recommended Optimization Options

The  -O2 -march=native  options are recommended with the GNU compilers. (For more options, see the "man" pages for the compilers)

OpenMP

Add this flag to any of the above:  -fopenmp

 

Ruby Compilers

Intel (recommended)

  NON-MPI MPI
FORTRAN 90 ifort mpif90
C icc mpicc
C++ icpc mpicxx

Recommended Optimization Options

The  -O2 -xHost  options are recommended with the Intel compilers. (For more options, see the "man" pages for the compilers.)

OpenMP

Add this flag to any of the above: -qopenmp or -openmp

PGI

  NON-MPI MPI
FORTRAN 90 pgfortran  or  pgf90 mpif90
C pgcc mpicc
C++ pgc++ mpicxx
NOTE: The C++ compiler used to be pgCC, but newer versions of PGI do not support this name.

Recommended Optimization Options

The  -fast  option is appropriate with all PGI compilers. (For more options, see the "man" pages for the compilers)

Note: The PGI compilers can generate code for accelerators such as GPUs. Description of these capabilities is beyond the scope of this guide.

OpenMP

Add this flag to any of the above: -mp

GNU

  NON-MPI MPI
FORTRAN 90 gfortran mpif90
C gcc mpicc
C++ g++ mpicxx

Recommended Optimization Options

The -O2 -march=native  options are recommended with the GNU compilers. (For more options, see the "man" pages for the compilers)

OpenMP

Add this flag to any of the above: -fopenmp

 

Further Reading:

Intel Compiler Page

PGI Compiler Page

GNU Compiler Page


Firewall and Proxy Settings

Connections to OSC

In order for users to access OSC resources through the web, your firewall rules should allow connections to the following publicly-facing IP ranges. Otherwise, users may be blocked or denied access to our services.

  • 192.148.248.0/24
  • 192.148.247.0/24
  • 192.157.5.0/25

The following TCP ports should be opened:

  • 80 (HTTP)
  • 443 (HTTPS)
  • 22 (SSH)

The following domain should be allowed:

  • *.osc.edu

Users may follow the instructions under "Test your configuration" below to ensure that their systems are not blocked from accessing our services. Users who are still unsure whether their network is blocking these hosts or ports should contact their local IT administrator.

Test your configuration

[Windows] Test your connection using PuTTY

  1. Open the PuTTY application.
  2. Enter an IP address from the ranges listed in "Connections to OSC" in the "Host Name" field.
  3. Enter 22 in the "Port" field.
  4. Click the 'Telnet' radio button under "Connection Type".
  5. Click "Open" to test the connection.
  6. Confirm the response. If the connection is successful, you will see a message that says "SSH-2.0-OpenSSH_5.3", as shown below. If you receive a PuTTY error, consult your system administrator for network access troubleshooting.

[Screenshot: PuTTY window showing the SSH-2.0-OpenSSH_5.3 response]

[OSX/Linux] Test your configuration using telnet

  1. Open a terminal.
  2. Type telnet IPaddress 22 (here, IPaddress is an address from the ranges listed in "Connections to OSC").
  3. Confirm the connection. A successful test looks like the sample session below.
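
The following is a hypothetical session (IPaddress stands for an address in one of the ranges listed under "Connections to OSC"; the exact SSH banner version may differ):

$ telnet IPaddress 22
Trying IPaddress...
Connected to IPaddress.
Escape character is '^]'.
SSH-2.0-OpenSSH_5.3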

Connections from OSC

All outbound network traffic from OSC's compute nodes is routed through a network address translation (NAT) host or one of two backup servers:

  • nat.osc.edu (192.157.5.13)
  • 192.148.248.35
  • 192.148.248.186

IT and Network Administrators

Please use the above information to assist users in accessing our resources.

Occasionally, new services may be stood up using hosts and ports not described here. If you believe our list needs correcting, please let us know at oschelp@osc.edu.


Messages from qsub

We have been adding some output from qsub that should aid you in creating better job scripts. We've documented the various messages here.

NOTE

A "NOTE" message is informational; your job has been submitted, but qsub made some assumptions about your job that you may not have intended.

No account/project specified

Your job did not specify a project to charge against, but qsub was able to select one for you. Typically, this will be because your username can only charge against one project, but it may be because you specified a preference by setting the OSC_DEFAULT_ACCOUNT environment variable. The output should indicate which project was assumed to be the correct one; if it was not correct, you should delete the job and resubmit after setting the correct project in the job script using the -A flag. For example:

#PBS -A PZS0530

Replace PZS0530 with the correct project code. Explicitly setting the -A flag will cause this informational message to not appear.

No memory limit set

Your job did not specify an explicit memory limit. Since we limit access to memory based on the number of cores requested, qsub set this limit on your behalf and mentioned in the message what the memory limit was set to.

You can suppress this informational message by explicitly setting the memory limit. For example:

#PBS -l mem=4gb

Please remember that the memory to core ratios are different on each cluster we operate. Please review the main documentation page for the cluster you are using for more information.

ERROR

A "ERROR" message indicates that your job was not submitted to the queue. Typically, this is because qsub is unsure of how to resolve an ambiguous setting in your job parameters. You will need to fix the problem in your job script, and resubmit.

You have not specified an account and have more than one available

Your username has the ability to charge jobs to more than one project, and qsub is unable to determine which one this job should be charged against. You can fix this by specifying the project using the -A flag. For example, you should add this line to your job script:

#PBS -A PZS0530

If you get this error, qsub will inform you of which projects you can charge against. Select the appropriate project, and replace "PZS0530" in the example above with the correct code.

You have the ability to tell qsub which project should be charged if no charge code is specified in the job script by setting the OSC_DEFAULT_ACCOUNT environment variable. For example, if you use the "bash" shell, you could put the line export OSC_DEFAULT_ACCOUNT=PZS0530, again replacing PZS0530 with the correct project code.
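
For example, a bash user might add the following line (replace PZS0530 with the appropriate project code):

export OSC_DEFAULT_ACCOUNT=PZS0530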

You have whitespace in a job name

Whitespace is not supported in TORQUE job names. Your job will be rejected with an error message if you submit a job with a space in the job name:

[xwang@owens-login02 torque_test]$ qsub job.txt 
ERROR:  Your job has not been submitted for the following reason:

        Script file "new" does not exist, requested job name has whitespace in it,
        or qsub command line is malformed.
qsub: Your job has been administratively rejected by the queueing system.
qsub: There may be a more detailed explanation prior to this notice.

You can fix this by removing whitespace in the job name.


Out-of-Memory (OOM) or Excessive Memory Usage

Problem description

A common problem on our systems is for a user job to run a node out of memory or to use more than its allocated share of memory if the node is shared with other jobs.

If a job exhausts both the physical memory and the swap space on a node, it causes the node to crash. With a parallel job, there may be many nodes that crash. When a node crashes, the systems staff has to manually reboot and clean up the node. If other jobs were running on the same node, the users have to be notified that their jobs failed.

If your job requests less than a full node, for example, -l nodes=1:ppn=1, it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested. For example, if a system has 4GB per core and you request one core, it is your responsibility to make sure your job uses no more than 4GB. Otherwise your job will interfere with the execution of other jobs.

The memory limit you set in PBS does not work the way one might expect it to. The only thing the -l mem=xxx flag is good for is requesting a large-memory node. It does not cause your job to be allocated the requested amount of memory, nor does it limit your job’s memory usage.
Note that even if your job isn’t causing problems, swapping is extremely inefficient. Your job will run orders of magnitude slower than it would with effective memory management.

Background

Each node has a fixed amount of physical memory and a fixed amount of disk space designated as swap space. If your program and data don’t fit in physical memory, the virtual memory system writes pages from physical memory to disk as necessary and reads in the pages it needs. This is called swapping. If you use up all the memory and all the swap space, the node crashes with an out-of-memory error.

This explanation really applies to the total memory usage of all programs running on the system. If someone else’s program is using too much memory, it may be pages from your program that get swapped out, and vice versa. This is the reason we aggressively terminate programs using more than their share of memory when there are other jobs on the node.

In the world of high performance computing, swapping is almost always undesirable. If your program does a lot of swapping, it will spend most of its time doing disk I/O and won’t get much computation done. You should consider the suggestions below.

You can find the amount of memory on our systems by following the links on our Supercomputers page. You can see the memory and swap values for a node by running the Linux command free on the node. As shown below, a standard node on Oakley has 48GB physical memory and 46GB swap space.

[n0123]$ free -mo
             total       used       free     shared    buffers     cached
Mem:         48386       2782      45603          0        161       1395
Swap:        46874          0      46874

Suggested solutions

Here are some suggestions for fixing jobs that use too much memory. Feel free to contact OSC Help for assistance with any of these options.

Some of these remedies involve requesting more processors (cores) for your job. As a general rule we require you to request a number of processors proportional to the amount of memory you require. You need to think in terms of using some fraction of a node rather than treating processors and memory separately. If some of the processors remain idle, that’s not a problem. Memory is just as valuable a resource as processors.

Request whole node or more processors

Jobs requesting less than a whole node are those that have nodes=1 with ppn<12 on Oakley, for example nodes=1:ppn=1. These jobs can be problematic for two reasons. First, they are entitled to use an amount of memory proportional to the ppn value requested; if they use more they interfere with other jobs. Second, if they cause a node to crash, it typically affects multiple jobs and multiple users.

If you’re sure about your memory usage, it’s fine to request just the number of processors you need, as long as it’s enough to cover the amount of memory you need. If you’re not sure, play it safe and request all the processors on the node.

Standard Oakley nodes have 4GB per core.

Reduce memory usage

Consider whether your job’s memory usage is reasonable in light of the work it’s doing. The code itself typically doesn’t require much memory, so you need to look mostly at the data size.

If you’re developing the code yourself, look for memory leaks. In MATLAB look for large arrays that can be cleared.

An out-of-core algorithm will typically use disk more efficiently than an in-memory algorithm that relies on swapping. Some third-party software gives you a choice of algorithms or allows you to set a limit on the memory the algorithm will use.

Use more nodes for a parallel job

If you have a parallel job you can get more total memory by requesting more nodes. Depending on the characteristics of your code you may also need to run fewer processes per node.

Here’s an example. Suppose your job on Oakley includes the following lines:

#PBS -l nodes=5:ppn=12
…
mpiexec mycode

This job uses 5 nodes, so it has 5*48=240GB total memory available to it. The mpiexec command by default runs one process per core, which in this case is 5*12=60 copies of mycode.

If this job uses too much memory you can spread those 60 processes over more nodes. The following lines request 10 nodes, giving you a total of 10*48=480GB total memory. The -ppn 6 option on the mpiexec command says to run 6 processes per node instead of 12, for a total of 60 as before.

#PBS -l nodes=10:ppn=12
…
mpiexec -ppn 6 mycode

Since parallel jobs are always assigned whole nodes, the following lines will also run 6 processes per node on 10 nodes.

#PBS -l nodes=10:ppn=6
…
mpiexec mycode

Request large-memory nodes

Oakley has eight nodes with 192GB each, four times the memory of a standard node. Oakley also has one huge-memory node with 1TB of memory; it has 32 cores.

Since there are so few of these nodes, compared to hundreds of standard nodes, jobs requesting them will often have a long wait in the queue. The wait will be worthwhile, though, if these nodes solve your memory problem.

To use the large-memory nodes on Oakley, request between 48gb and 192gb memory and 1 to 12 processors per node. Remember to request a number of processors per node proportional to your memory requirements. In most cases you’ll want to request the whole node (ppn=12). You can request up to 8 nodes but the more you request the longer your queue wait is likely to be.

Example:

#PBS -l nodes=1:ppn=12
#PBS -l mem=192gb
…

To use the huge-memory node on Oakley you must request the whole node (ppn=32). Let the memory default.

#PBS -l nodes=1:ppn=32
…

Put a virtual memory limit on your job

The sections above are intended to help you get your job running correctly. This section is about forcing your job to fail gracefully if it consumes too much memory. If your memory usage is unpredictable, it is preferable to terminate the job when it exceeds a memory usage limit rather than allow it to crowd other jobs or crash a node.

The memory limit enforced by PBS is ineffective because it only limits physical memory usage (resident set size or RSS). When your job reaches its memory limit it simply starts using virtual memory, or swap. PBS allows you to put a limit on virtual memory, but that has problems also.

We will use Linux terminology. Each process has several virtual memory values associated with it. VmSize is virtual memory size; VmRSS is resident set size, or physical memory used; VmSwap is swap space used. The number we care about is the total memory used by the process, which is VmRSS + VmSwap. What PBS allows a job to limit is VmRSS (using -l mem=xxx) or VmSize (using -l vmem=xxx).

The relationship among VmSize, VmRSS, and VmSwap is:  VmSize >= VmRSS+VmSwap. For many programs this bound is fairly tight; for others VmSize can be much larger than the memory actually used.

If the bound is reasonably tight, -l vmem=4gb provides an effective mechanism for limiting memory usage to 4gb (for example). If the bound is not tight, VmSize may prevent the program from starting even if VmRSS+VmSwap would have been perfectly reasonable. Java and some FORTRAN 77 programs in particular have this problem.

The vmem limit in PBS is for the entire job, not just one node, so it isn’t useful with parallel (multinode) jobs. PBS also has a per-process virtual memory limit, pvmem. This limit is trickier to use, but it can be useful in some cases.

Here are suggestions for some specific cases.

Serial (single-node) job using program written in C/C++

This case applies to programs written in any language if VmSize is not much larger than VmRSS+VmSwap. If your program doesn’t use any swap space, this means that vmem as reported by qstat -f or the ja command (see below) is not much larger than mem as reported by the same tools.

Set the vmem limit equal to, or slightly larger than, the number of processors requested (ppn) times the memory available per processor. Example for Oakley:

#PBS -l nodes=1:ppn=1
#PBS -l vmem=4gb

Parallel (multinode) job using program written in C/C++

This suggestion applies if your processes use approximately equal amounts of memory. See also the comments about other languages under the previous case.

Set the pvmem limit equal to, or slightly larger than, the amount of physical memory on the node divided by the number of processes per node. Example for Oakley, running 12 processes per node:

#PBS -l nodes=5:ppn=12
#PBS -l pvmem=4gb
…
mpiexec mycode

Serial (single-node) job using program written in Java

I’ve only slightly tested this suggestion so far, so please provide feedback to judithg@osc.edu.

Start Java with a virtual memory limit equal to, or slightly larger than, the number of processors requested (ppn) times the memory available per processor. Example for Oakley:

#PBS -l nodes=1:ppn=1
#PBS -l vmem=4gb
…
java -Xms4096m -Xmx4096m MyJavaCode

Other situations

If you have other situations that aren’t covered here, please share them. Contact judithg@osc.edu.

How to monitor your memory usage

qstat -f

While your job is running the command qstat -f jobid will tell you the peak physical and virtual memory usage of the job so far. For a parallel job, these numbers are the aggregate usage across all nodes of the job. The values reported by qstat may lag the true values by a couple of minutes.
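
For example, for a hypothetical job ID 123456:

qstat -f 123456 | grep resources_used

This prints the cput, mem, vmem, and walltime used so far (the same fields shown in the job-end emails described elsewhere on this page).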

free

For parallel (multinode) jobs you can check your per-node memory usage while your job is running by using pdsh -j jobid free -mo on Oakley.

ja

You can put the command ja (job accounting) at the end of your batch script to capture the resource usage reported by qstat -f. The information will be written to your job output log, job_name.o123456.

OnDemand

You can also view node status graphically in the OSC OnDemand Portal (ondemand.osc.edu). Under "Jobs" select "Active Jobs". Click on "Job Status" and scroll down to see memory usage. This shows the total memory usage for the node; if your job is not the only one running there, it may be hard to interpret.

Below is a typical graph for jobs using too much memory. It shows two jobs that ran back-to-back on the same node. The first peak is a job that used all the available physical memory (blue) and a large amount of swap (purple). It completed successfully without crashing the node. The second job followed the same pattern but actually crashed the node.

Notes

If it appears that your job is close to crashing a node, we may preemptively delete the job.

If your job is interfering with other jobs by using more memory than it should be, we may delete the job.

In extreme cases OSC staff may restrict your ability to submit jobs. If you crash a large number of nodes or continue to submit problem jobs after we have notified you of the situation, this may be the only way to protect the system and our other users. If this happens, we will restore your privileges as soon as you demonstrate that you have resolved the problem.

For details on retrieving files from unexpectedly terminated jobs see this FAQ.

For assistance

OSC has staff available to help you resolve your memory issues. See our Support Services page for contact information.

System Email

Occasionally, jobs that experience problems may generate emails from staff or automated systems at the center with some information about the nature of the problem. These pages provide additional information about the various emails sent, and steps that can be taken to address the problem.

Batch job aborted

Purpose

Notify you when your job terminates abnormally.

Sample subject line

PBS JOB 944666.oak-batch.osc.edu

Apparent sender

  • root <adm@oak-batch.osc.edu> (Oakley)
  • root <pbs-opt@hpc.osc.edu> (Glenn)

Sample contents

PBS Job Id: 935619.oak-batch.osc.edu
Job Name:   mailtest.job
Exec host:  n0587/5
Aborted by PBS Server
Job exceeded some resource limit (walltime, mem, etc.). Job was aborted See Administrator for help

Sent under these circumstances

These are fully automated emails sent by the batch system.

Some reasons a job might terminate abnormally:

  • The job exceeded its allotted walltime, memory, virtual memory, or other limited resource. More information is available in your job log file, e.g., jobname.o123456.
  • An unexpected system problem caused your job to fail.

To turn off the emails

There is no way to turn them off at this time.

To prevent these problems

For advice on monitoring and controlling resource usage, see Monitoring and Managing Your Job.

There’s not much you can do about system failures, which fortunately are rare.

Notes

Under some circumstances you can retrieve your job output log if your job aborts due to a system failure. Contact oschelp@osc.edu for assistance.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Batch job begin or end

Purpose

Notify you when your job begins or ends.

Sample subject line

PBS JOB 944666.oak-batch.osc.edu

Apparent sender

  • root <adm@oak-batch.osc.edu> (Oakley)
  • root <pbs-opt@hpc.osc.edu> (Glenn)

Sample contents

PBS Job Id: 944666.oak-batch.osc.edu
Job Name:   mailtest.job
Exec host:  n0587/1
Begun execution
 
PBS Job Id: 944666.oak-batch.osc.edu
Job Name:   mailtest.job
Exec host:  n0587/1
Execution terminated
Exit_status=0
resources_used.cput=00:00:00
resources_used.mem=2228kb
resources_used.vmem=211324kb
resources_used.walltime=00:01:00

Sent under these circumstances

These are fully automated emails sent by the batch system. You control them through the headers in your job script. The following line requests emails at the beginning, ending, and abnormal termination of your job.

#PBS -m abe

To turn off the emails

Remove the -m option from your script and/or command line or use -m n. See PBS Directives Summary.

Notes

You can add the following command at the end of your script to have resource information written to your job output log:

ja

For more information

See PBS Directives Summary.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Batch job deleted by an administrator

Purpose

Notify you when your job is deleted by an administrator.

Sample subject line

PBS JOB 9657213.opt-batch.osc.edu

Apparent sender

  • root adm@oak-batch.osc.edu (Oakley)
  • root pbs-opt@hpc.osc.edu (Glenn)

Sample contents

PBS Job Id: 9657213.opt-batch.osc.edu
Job Name:   mailtest.job
job deleted
Job deleted at request of staff@opt-login04.osc.edu Job using too much memory. Contact oschelp@osc.edu.

Sent under these circumstances

These emails are sent automatically, but the administrator can add a note with the reason.

Some reasons a running job might be deleted:

  • The job is using so much memory that it threatens to crash the node it is running on.
  • The job is using more resources than it requested and is interfering with other jobs running on the same node.
  • The job is causing excessive load on some part of the system, typically a network file server.
  • The job is still running at the start of a scheduled downtime.

Some reasons a queued job might be deleted:

  • The job requests non-existent resources.
  • A job apparently intended for Oakley (ppn=12) was submitted on Glenn.
  • The job can never run because it requests combinations of resources that are disallowed by policy.
  • The user’s credentials are blocked on the system the job was submitted on.

To turn off the emails

There is no way to turn them off at this time.

To prevent these problems

See the Supercomputing FAQ for suggestions on dealing with specific problems.

For assistance

We will work with you to get your jobs running within the constraints of the system. Contact OSC Help for assistance. See our Support Services page for more contact information.

Emails exceeded the expected volume

Purpose

Notify you that we have placed a hold on emails sent to you from the HPC system.

Sample subject line

Emails sent to email address student@buckeyemail.osu.edu in the last hour exceeded the expected volume

Apparent sender

OSC Help <OSCHelp@osc.edu>

Explanation

When a job fails or is deleted by an administrator, the system sends you an email. If this happens with a large number of jobs, it generates a volume of email that may be viewed as spam by your email provider. To avoid having OSC blacklisted, and to avoid overloading your email account, we hold your emails from OSC.

Please note that these held emails will eventually be deleted if you do not contact us.

Sent under these circumstances

These emails are sent automatically when your email usage from OSC is deferred.

To turn off the emails

Turn off emails related to your batch jobs to reduce your overall email volume from OSC. See the -m option on the PBS Directives Summary page.

Notes

To re-enable email you must contact OSC Help.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

 

 

Job failure due to a system hardware problem

Purpose

Notify you that one or more of your jobs was running on a compute node that crashed due to a hardware problem.

Sample subject line

Failure of job(s) 919137 due to a hardware problem at OSC

Apparent sender

OSC Help <OSCHelp@osc.edu>

Explanation

Your job failed through no fault of your own. You should resubmit the job.

Sent under these circumstances

These emails are sent by a systems administrator after a node crashes.

To turn off the emails

We don’t have a mechanism to turn off these emails. If they really bother you, contact OSC Help and we’ll try to accommodate you.

To prevent these problems

Hardware crashes are quite rare and in most cases there’s nothing you can do to prevent them. Certain types of bus errors on Glenn correlate strongly with certain applications (suggesting that they’re not really hardware errors). If you encounter this type of error you may be advised to use Oakley rather than Glenn.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Job failure due to a system software problem

Purpose

Notify you that one or more of your jobs was running on a compute node that crashed due to a system software problem.

Sample subject line

Failure of job(s) 919137 due to a system software problem at OSC

Apparent sender

OSC Help <OSCHelp@osc.edu>

Explanation

Your job failed through no fault of your own. You should resubmit the job. Usually the problems are caused by another job running on the node.

Sent under these circumstances

These emails are sent by a systems administrator as part of the node cleanup process.

To turn off the emails

We don’t have a mechanism to turn off these emails. If they really bother you, contact OSC Help and we’ll try to accommodate you.

To prevent these problems

If you request a whole node (nodes=1:ppn=12 on Oakley or nodes=1:ppn=8 on Glenn) your jobs will be less susceptible to problems caused by other jobs. Other than that, be assured that we work hard to keep jobs from interfering with each other.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Job failure due to exhaustion of physical memory

Purpose

Notify you that one or more of your jobs caused compute nodes to crash with an out-of-memory error.

Sample subject line

Failure of job(s) 933014,933174 at OSC due to exhaustion of physical memory

Apparent sender

OSC Help <oschelp@osc.edu>

Explanation

Your job(s) exhausted both physical memory and swap space during job execution. This failure caused the compute node(s) used by the job(s) to crash, requiring a reboot.

Sent under these circumstances

These emails are sent by a systems administrator as part of the node cleanup process.

To turn off the emails

You cannot turn off these emails. Please don’t ignore them because they report a problem that you must correct.

To prevent these problems

See the Knowledge Base article "Out-of-Memory (OOM) or Excessive Memory Usage" for suggestions on dealing with out-of-memory problems.

For information on the memory available on the various systems, see our Supercomputing page.

Notes

If you continue to submit jobs that cause these problems, your HPC account may be blocked.

For assistance

We will work with you to get your jobs running within the constraints of the system. Contact OSC Help for assistance. See our Support Services page for more contact information.