Knowledge Base

This knowledge base is a collection of important, useful information about OSC systems that does not fit into a guide or tutorial, and is too long to be answered in a simple FAQ.

Changes to Default Memory Limits

Problem Description

Our current GPFS file system is a distributed service with significant interactions between its clients. Because the compute nodes are GPFS file system clients, a certain amount of memory on each node must be reserved for these interactions. As a result, the maximum physical memory on each node that users' jobs are allowed to use has been reduced in order to keep the file system performing well. In addition, using swap memory is no longer allowed.

The tables below summarize the maximum physical memory allowed for each type of node on our systems:

Oakley Cluster

Node type Physical memory per node Maximum memory allowed per node
Regular node 48GB 45GB
Big memory node 192GB 187GB
Huge memory node 1024GB (1TB) 1008GB

Ruby Cluster

Node type Physical memory per node Maximum memory allowed per node
Regular node 64GB 61GB
Debug node 128GB 124GB
Huge memory node 1024GB (1TB) 1008GB

Owens Cluster

Node type Physical memory per node Maximum memory allowed per node
Regular node 128GB 124GB
Huge memory node 1536GB 1510GB

Solutions When You Need Regular Nodes

Starting October 27, 2016, we will implement a new scheduling policy on all of our clusters, reflecting the reduced default memory limits.

If you do not request memory explicitly in your job (no -l mem in your request)

Your job can be submitted and scheduled as before, and resources will be allocated according to your request of cores/nodes (nodes=XX:ppn=XX). If you request a partial node, the memory allocated to your job is proportional to the number of cores requested (4GB/core on Oakley and Owens); if you request a whole node, the memory allocated to your job is reduced, following the information summarized in the tables above. Some examples are provided below.

A request of a partial node:

On Oakley, a request of nodes=1:ppn=1 will be allocated 4GB of memory and charged for 1 core. A request of nodes=1:ppn=4 will be allocated 16GB of memory and charged for 4 cores. A request of nodes=1:ppn=11 will be allocated 44GB of memory and charged for 11 cores.

On Ruby, we always allocate whole nodes to jobs and charge for the whole node, with 61GB memory allocated to your job.  

On Owens, a request of nodes=1:ppn=1 will be allocated 4GB of memory and charged for 1 core. A request of nodes=1:ppn=4 will be allocated 16GB of memory and charged for 4 cores.
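For example, a minimal job script header for a 4-core, single-node job on Oakley might look like the following (the job name and walltime here are placeholders):

#PBS -N myjob
#PBS -l walltime=1:00:00
#PBS -l nodes=1:ppn=4

With no -l mem line, this job is allocated 16GB of memory (4GB/core) and charged for 4 cores.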

A request of the whole node:

A request of a whole regular node will be allocated the maximum memory allowed per node and charged for the whole node, as summarized below:

Cluster Request Memory allocated Charged for
Oakley nodes=1:ppn=12  45GB 12 cores
Ruby nodes=1:ppn=20  61GB 20 cores
Owens nodes=1:ppn=28 124GB 28 cores

A request of multiple nodes:

If you have a multi-node job (nodes>1), your job will be assigned entire nodes, each with the maximum memory allowed per node (45GB on Oakley, 61GB on Ruby, and 124GB on Owens), and charged for the entire nodes regardless of the ppn request.
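For example, a two-node request on Oakley such as

#PBS -l nodes=2:ppn=12

would be assigned two entire nodes, each with 45GB of memory, and charged for 24 cores.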

If you do request memory explicitly in your job (with -l mem in your request)

If you request memory explicitly in your script, please revisit your script in light of the following information.

A request of a partial node:

On Oakley, a request of nodes=1:ppn=1,mem=4gb will be allocated 4GB of memory and charged for 1 core; a request of nodes=1:ppn=2,mem=8gb will be allocated 8GB of memory and charged for 2 cores; a request of nodes=1:ppn=1,mem=40gb will be allocated 40GB of memory and charged for 10 cores.

On Owens, a request of nodes=1:ppn=1,mem=4gb will be allocated 4GB of memory and charged for 1 core.

On Ruby, we always allocate whole nodes to jobs and charge for the whole node, with 61GB memory allocated to your job. 

 A request of the whole node:

On Oakley, the maximum value you can use for -l mem is 45gb, i.e. -l mem=45gb. A request of nodes=1:ppn=12,mem=45gb will be allocated 45GB of memory and charged for the whole node. If you need more than 45GB of memory for the job, please submit your job to the big/huge memory nodes on Oakley, or switch to the Owens cluster. Any job requesting mem>45gb will not be scheduled.

On Ruby, the maximum value you can use for -l mem is 61gb, i.e. -l mem=61gb. A request of nodes=1:ppn=20,mem=61gb will be allocated 61GB of memory and charged for the whole node. If you need more than 61GB of memory for the job, please submit your job to the huge memory nodes on Ruby, or switch to the Owens cluster. Any job requesting mem>61gb will not be scheduled.

On Owens, the maximum value you can use for -l mem is 124gb, i.e. -l mem=124gb. A request of nodes=1:ppn=28,mem=124gb will be allocated 124GB of memory and charged for the whole node. If you need more than 124GB of memory for the job, please submit your job to the huge memory nodes. Any job requesting mem>124gb will not be scheduled.
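For example, a whole-node request on Oakley with an explicit memory limit (the walltime here is a placeholder):

#PBS -l walltime=1:00:00
#PBS -l nodes=1:ppn=12,mem=45gb

This job is allocated 45GB of memory and charged for the whole node.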

A request of multiple nodes:

If you have a multi-node job (nodes>1), your job will be assigned entire nodes, each with the maximum memory allowed per node (45GB on Oakley, 61GB on Ruby, and 124GB on Owens), and charged for the entire nodes.

Solutions When You Need Special Nodes

It is highly recommended that you do not include a memory request and instead follow the syntax below if you need any special resources.

 Oakley Cluster:

Node type How to request Memory allocated Charged for
Big memory node nodes=XX:ppn=12:bigmem (XX can be 1-8) 187GB 12 cores
Huge memory node nodes=1:ppn=32 1008GB 32 cores

Ruby Cluster:

Node type How to request Memory allocated Charged for
Debug node nodes=1:ppn=16 -q debug 124GB 16 cores
Huge memory node nodes=1:ppn=32 1008GB 32 cores

Owens Cluster:

Node type How to request Memory allocated Charged for
Huge memory node nodes=1:ppn=48 1510GB 48 cores
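For example, a sketch of a request for one Oakley big memory node (the walltime here is a placeholder):

#PBS -l walltime=1:00:00
#PBS -l nodes=1:ppn=12:bigmem

This job is allocated 187GB of memory and charged for 12 cores; no -l mem line is needed.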


Compilation Guide

As a general recommendation, we suggest selecting the newest compilers available for a new project. For repeatability, you may not want to change compilers in the middle of an experiment.

Owens Compilers

The Haswell and Broadwell processors that make up Owens support the Advanced Vector Extensions (AVX2) instruction set, but you must set the correct compiler flags to take advantage of it. AVX2 has the potential to speed up your code by a factor of 4 or more, depending on the compiler and options you would otherwise use.

With the Intel compilers, use -xHost and -O2 or higher. With the gnu compilers, use -march=native and -O3. The PGI compilers by default use the highest available instruction set, so no additional flags are necessary.

This advice assumes that you are building and running your code on Owens. The executables will not be portable.

Intel (recommended)

  NON-MPI MPI
FORTRAN 90 ifort mpif90
C icc mpicc
C++ icpc mpicxx

Recommended Optimization Options

The -O2 -xHost options are recommended with the Intel compilers. (For more options, see the "man" pages for the compilers.)

OpenMP

Add this flag to any of the above:  -qopenmp  or  -openmp
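For example, typical serial, OpenMP, and MPI compile lines with the recommended options (the source file names here are placeholders):

icc -O2 -xHost -o myprog myprog.c
icc -O2 -xHost -qopenmp -o myprog_omp myprog_omp.c
mpicc -O2 -xHost -o myprog_mpi myprog_mpi.c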

PGI

  NON-MPI MPI
FORTRAN 90 pgfortran   or   pgf90 mpif90
C pgcc mpicc
C++ pgc++ mpicxx

Recommended Optimization Options

The   -fast  option is appropriate with all PGI compilers. (For more options, see the "man" pages for the compilers)

Note: The PGI compilers can generate code for accelerators such as GPUs. Description of these capabilities is beyond the scope of this guide.

OpenMP

Add this flag to any of the above:  -mp
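For example (the source file names here are placeholders):

pgcc -fast -o myprog myprog.c
pgcc -fast -mp -o myprog_omp myprog_omp.c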

GNU

  NON-MPI MPI
FORTRAN 90 gfortran mpif90
C gcc mpicc
C++ g++ mpicxx

Recommended Optimization Options

The  -O2 -march=native  options are recommended with the GNU compilers. (For more options, see the "man" pages for the compilers)

OpenMP

Add this flag to any of the above:  -fopenmp
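For example (the source file names here are placeholders):

gfortran -O2 -march=native -o myprog myprog.f90
mpif90 -O2 -march=native -fopenmp -o myprog_mpi myprog_mpi.f90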

 

Ruby Compilers

Intel (recommended)

  NON-MPI MPI
FORTRAN 90 ifort mpif90
C icc mpicc
C++ icpc mpicxx

Recommended Optimization Options

The -O2 -xHost options are recommended with the Intel compilers. (For more options, see the "man" pages for the compilers.)

OpenMP

Add this flag to any of the above: -qopenmp or -openmp

PGI

  NON-MPI MPI
FORTRAN 90 pgfortran  or  pgf90 mpif90
C pgcc mpicc
C++ pgc++ mpicxx
NOTE: The C++ compiler used to be pgCC, but newer versions of PGI do not support this name.

Recommended Optimization Options

The  -fast  option is appropriate with all PGI compilers. (For more options, see the "man" pages for the compilers)

Note: The PGI compilers can generate code for accelerators such as GPUs. Description of these capabilities is beyond the scope of this guide.

OpenMP

Add this flag to any of the above: -mp

GNU

  NON-MPI MPI
FORTRAN 90 gfortran mpif90
C gcc mpicc
C++ g++ mpicxx

Recommended Optimization Options

The -O2 -march=native  options are recommended with the GNU compilers. (For more options, see the "man" pages for the compilers)

OpenMP

Add this flag to any of the above: -fopenmp

 

Oakley Compilers

Intel (Recommended)

  non-MPI MPI
Fortran ifort mpif90
C icc mpicc
C++ icpc mpicxx

Recommended Optimization Options

Use case Recommended options
Sequential (not numerically sensitive) -fast
Sequential (numerically sensitive) -ipo -O2 -static -xHost
MPI (not numerically sensitive) -ipo -O3 -no-prec-div -xHost
MPI (numerically sensitive) -ipo -O2 -xHost
Note:  The -fast flag is equivalent to -ipo -O3 -no-prec-div -static -xHost .
Note:  Other options are available for code with extreme numerical sensitivity; their description is beyond the scope of this guide.
Note:  Intel 14.0.0.080 has a bug related to generation of portable code. Add the flag -msse3  to get around it.

OpenMP

Add this flag to any of the above: -qopenmp or -openmp
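For example, an MPI code that is not numerically sensitive might be built as follows (the source file name here is a placeholder):

mpicc -ipo -O3 -no-prec-div -xHost -o myprog_mpi myprog_mpi.c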

PGI

  non-MPI MPI
Fortran 90 or 95 pgfortran or pgf90 mpif90
Fortran 77 pgf77 mpif77
C pgcc mpicc
C++ pgc++ mpicxx

NOTE: The C++ compiler used to be pgCC, but newer versions of PGI do not support this name.

Recommended Optimization Options

The -fast  option is appropriate with all PGI compilers.  (For more options, see the "man" pages for the compilers)

Note: The PGI compilers can generate code for accelerators such as GPUs. Description of these capabilities is beyond the scope of this guide.

OpenMP

Add this flag to any of the above: -mp

GNU

  non-MPI MPI
Fortran 90 or 95 gfortran mpif90
Fortran 77 g77 mpif77
C gcc mpicc
C++ g++ mpicxx

Recommended Optimization Options

The -O3 -march=native options are recommended with the GNU compilers.  (For more options, see the "man" pages for the compilers)

OpenMP

Add this flag to any of the above (except g77 and mpif77): -fopenmp

Further Reading:

Intel Compiler Page

PGI Compiler Page

GNU Compiler Page


Firewall and Proxy Settings

Connections to OSC

In order for users to access OSC resources through the web, your firewall rules should allow connections to the following IP ranges. Otherwise, users may be blocked or denied access to our services.

  • 192.148.248.0/24
  • 192.148.247.0/24
  • 192.157.5.0/25

The following TCP ports should be opened:

  • 80 (HTTP)
  • 443 (HTTPS)
  • 22 (SSH)

The following domain should be allowed:

  • *.osc.edu

Users who are unsure whether their network is blocking these hosts or ports should contact their local IT administrator.

Connections from OSC

All outbound network traffic from OSC's compute nodes is routed through a network address translation (NAT) host or one of two backup servers:

  • nat.osc.edu (192.157.5.13)
  • 192.148.248.35
  • 192.148.248.186

IT and Network Administrators

Please use the above information to assist users in accessing our resources.

Occasionally, new services may be stood up using hosts and ports not described here. If you believe our list needs correcting, please let us know at oschelp@osc.edu.


Messages from qsub

We have been adding some output from qsub that should aid you in creating better job scripts. We've documented the various messages here.

NOTE

A "NOTE" message is informational; your job has been submitted, but qsub made some assumptions about your job that you may not have intended.

No account/project specified

Your job did not specify a project to charge against, but qsub was able to select one for you. Typically, this will be because your username can only charge against one project, but it may be because you specified a preference by setting the OSC_DEFAULT_ACCOUNT environment variable. The output should indicate which project was assumed to be the correct one; if it was not correct, you should delete the job and resubmit after setting the correct project in the job script using the -A flag. For example:

#PBS -A PZS0530

Replace PZS0530 with the correct project code. Explicitly setting the -A flag will cause this informational message to not appear.

No memory limit set

Your job did not specify an explicit memory limit. Since we limit access to memory based on the number of cores requested, qsub set this limit on your behalf and indicated in the message what the memory limit was set to.

You can suppress this informational message by explicitly setting the memory limit. For example:

#PBS -l mem=4gb

Please remember that the memory to core ratios are different on each cluster we operate. Please review the main documentation page for the cluster you are using for more information.

ERROR

An "ERROR" message indicates that your job was not submitted to the queue. Typically, this is because qsub is unsure how to resolve an ambiguous setting in your job parameters. You will need to fix the problem in your job script and resubmit.

You have not specified an account and have more than one available

Your username has the ability to charge jobs to more than one project, and qsub is unable to determine which one this job should be charged against. You can fix this by specifying the project using the -A flag. For example, you should add this line to your job script:

#PBS -A PZS0530

If you get this error, qsub will inform you of which projects you can charge against. Select the appropriate project, and replace "PZS0530" in the example above with the correct code.

You can tell qsub which project should be charged if no charge code is specified in the job script by setting the OSC_DEFAULT_ACCOUNT environment variable. For example, if you use the "bash" shell, you could set the following, again replacing PZS0530 with the correct project code:

export OSC_DEFAULT_ACCOUNT=PZS0530


Migrating jobs from Glenn to Oakley or Ruby

This page includes a summary of differences to keep in mind when migrating jobs from Glenn to one of our other clusters.

Hardware

Most Oakley nodes have 12 cores and 48GB memory. There are eight large-memory nodes with 12 cores and 192GB memory, and one huge-memory node with 32 cores and 1TB of memory. Most Ruby nodes have 20 cores and 64GB of memory. There is one huge-memory node with 32 cores and 1TB of memory. By contrast, most Glenn nodes have 8 cores and 24GB memory, with eight nodes having 16 cores and 64GB memory.

Module System

Oakley and Ruby use a different module system than Glenn. It looks very similar, but it enforces module dependencies, and thus may prevent certain module combinations from being loaded that were permitted on Glenn. For example, only one compiler may be loaded at a time.

module avail will only show modules compatible with your currently loaded modules, not all modules installed on the system. To see all modules on the cluster, use the command module spider. Both module avail and module spider can take a partial module name as a search parameter, such as module spider dyna.

Version numbers are indicated with a slash “/” rather than a dash “-” and need not be specified if you want the default version.
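For example:

module avail          # modules compatible with your currently loaded modules
module spider         # all modules installed on the cluster
module spider dyna    # search for modules whose names contain "dyna"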

Compilers

Like Glenn, Oakley and Ruby support three compilers: Intel, PGI, and gnu. Unlike Glenn, Oakley and Ruby only let you have one compiler module loaded at any one time. The default is Intel. To switch to a different compiler, use module swap intel gnu or module swap intel pgi.

Important note: The gnu compilers are part of the Linux distribution, so they’re always available. It’s important to use the gnu module, however, to link with the correct libraries for MVAPICH, MKL, etc.

MPI

MPI-2 is available on Oakley and Ruby through the MVAPICH2 modules. The MVAPICH2 libraries are linked differently than on Glenn, requiring you to have the correct compiler and MVAPICH2 modules loaded at execution time as well as at compile time. (This doesn’t apply if you’re using a software package that was installed by OSC.)

Software you build and/or install

If your software uses any libraries installed by OSC, including MVAPICH, you will have to rebuild it. If you link to certain libraries, including MVAPICH, MKL, and others, you must have the same compiler module loaded at run time that you do at build time. Please refer to the compilation guide in our Knowledge Base for guidance on optimizing your compilations for our hardware.

OSC installed software

Most of the software installed on Glenn is also installed on Oakley or Ruby, although old versions may no longer be available. We recommend migrating to a newer version of the application if at all possible. Please review the software documentation to see what versions are available, and examine sample batch scripts.

Accounting

All OSC clusters currently use the same core-hour to RU conversion factor. Oakley will charge you for the number of cores proportional to the amount of memory your job requests, while Ruby only accepts full-node jobs. Please review the system documentation for each cluster.

“all” replaced by “pdsh”

The “all” command is not available on Oakley or Ruby; “pdsh” is available on all clusters.

pdsh -j jobid command

pdsh -g feature command

pdsh -w nodelist command
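For example, to run a command on every node assigned to one of your running jobs (the job ID and command here are placeholders):

pdsh -j 123456 uptime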


System Email

Occasionally, jobs that experience problems may generate emails from staff or automated systems at the center with some information about the nature of the problem. These pages provide additional information about the various emails sent, and steps that can be taken to address the problem.

Batch job aborted

Purpose

Notify you when your job terminates abnormally.

Sample subject line

PBS JOB 944666.oak-batch.osc.edu

Apparent sender

  • root <adm@oak-batch.osc.edu> (Oakley)
  • root <pbs-opt@hpc.osc.edu> (Glenn)

Sample contents

PBS Job Id: 935619.oak-batch.osc.edu
Job Name:   mailtest.job
Exec host:  n0587/5
Aborted by PBS Server
Job exceeded some resource limit (walltime, mem, etc.). Job was aborted See Administrator for help

Sent under these circumstances

These are fully automated emails sent by the batch system.

Some reasons a job might terminate abnormally:

  • The job exceeded its allotted walltime, memory, virtual memory, or other limited resource. More information is available in your job log file, e.g., jobname.o123456.
  • An unexpected system problem caused your job to fail.

To turn off the emails

There is no way to turn them off at this time.

To prevent these problems

For advice on monitoring and controlling resource usage, see Monitoring and Managing Your Job.

There’s not much you can do about system failures, which fortunately are rare.

Notes

Under some circumstances you can retrieve your job output log if your job aborts due to a system failure. Contact oschelp@osc.edu for assistance.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Batch job begin or end

Purpose

Notify you when your job begins or ends.

Sample subject line

PBS JOB 944666.oak-batch.osc.edu

Apparent sender

  • root <adm@oak-batch.osc.edu> (Oakley)
  • root <pbs-opt@hpc.osc.edu> (Glenn)

Sample contents

PBS Job Id: 944666.oak-batch.osc.edu
Job Name:   mailtest.job
Exec host:  n0587/1
Begun execution
 
PBS Job Id: 944666.oak-batch.osc.edu
Job Name:   mailtest.job
Exec host:  n0587/1
Execution terminated
Exit_status=0
resources_used.cput=00:00:00
resources_used.mem=2228kb
resources_used.vmem=211324kb
resources_used.walltime=00:01:00

Sent under these circumstances

These are fully automated emails sent by the batch system. You control them through the headers in your job script. The following line requests emails at the beginning, ending, and abnormal termination of your job.

#PBS -m abe

To turn off the emails

Remove the -m option from your script and/or command line or use -m n. See PBS Directives Summary.
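For example, to turn off all batch emails for a job, include this line in your job script:

#PBS -m n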

Notes

You can add the following command at the end of your script to have resource information written to your job output log:

ja

For more information

See PBS Directives Summary.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Batch job deleted by an administrator

Purpose

Notify you when your job is deleted by an administrator.

Sample subject line

PBS JOB 9657213.opt-batch.osc.edu

Apparent sender

  • root adm@oak-batch.osc.edu (Oakley)
  • root pbs-opt@hpc.osc.edu (Glenn)

Sample contents

PBS Job Id: 9657213.opt-batch.osc.edu
Job Name:   mailtest.job
job deleted
Job deleted at request of staff@opt-login04.osc.edu Job using too much memory. Contact oschelp@osc.edu.

Sent under these circumstances

These emails are sent automatically, but the administrator can add a note with the reason.

Some reasons a running job might be deleted:

  • The job is using so much memory that it threatens to crash the node it is running on.
  • The job is using more resources than it requested and is interfering with other jobs running on the same node.
  • The job is causing excessive load on some part of the system, typically a network file server.
  • The job is still running at the start of a scheduled downtime.

Some reasons a queued job might be deleted:

  • The job requests non-existent resources.
  • A job apparently intended for Oakley (ppn=12) was submitted on Glenn.
  • The job can never run because it requests combinations of resources that are disallowed by policy.
  • The user’s credentials are blocked on the system the job was submitted on.

To turn off the emails

There is no way to turn them off at this time.

To prevent these problems

See the Supercomputing FAQ for suggestions on dealing with specific problems.

For assistance

We will work with you to get your jobs running within the constraints of the system. Contact OSC Help for assistance. See our Support Services page for more contact information.

Emails exceeded the expected volume

Purpose

Notify you that we have placed a hold on emails sent to you from the HPC system.

Sample subject line

Emails sent to email address student@buckeyemail.osu.edu in the last hour exceeded the expected volume

Apparent sender

OSC Help <OSCHelp@osc.edu>

Explanation

When a job fails or is deleted by an administrator, the system sends you an email. If this happens with a large number of jobs, it generates a volume of email that may be viewed as spam by your email provider. To avoid having OSC blacklisted, and to avoid overloading your email account, we hold your emails from OSC.

Please note that these held emails will eventually be deleted if you do not contact us.

Sent under these circumstances

These emails are sent automatically when your email usage from OSC is deferred.

To turn off the emails

Turn off emails related to your batch jobs to reduce your overall email volume from OSC. See the -m option on the PBS Directives Summary page.

Notes

To re-enable email you must contact OSC Help.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

 

 

File system load problem

Purpose

Notify you that one or more of your jobs caused excessive load on one of the network file system directory servers.

Sample subject line

Your jobs on Oakley are causing excessive load on fs14

Apparent sender

OSC Help <OSCHelp@osc.edu> or an individual staff member

Explanation

Your jobs are causing problems with one of the network file servers. This is usually caused by submitting a large number of jobs that start at the same time and execute in lockstep.

Sent under these circumstances

These emails are sent by a staff member when the high load is traced to your jobs. Often the jobs have to be stopped or deleted.

To turn off the emails

You cannot turn off these emails. Please don’t ignore them because they report a problem that you must correct.

To prevent these problems

See the Knowledge Base article (coming soon) for suggestions on dealing with file system load problems.

For information on the different file systems available at OSC, see Available File Systems.

Notes

If you continue to submit jobs that cause these problems, your HPC account may be blocked.

For assistance

We will work with you to get your jobs running within the constraints of the system. Contact OSC Help for assistance. See our Support Services page for more contact information.

Job failure due to a system hardware problem

Purpose

Notify you that one or more of your jobs was running on a compute node that crashed due to a hardware problem.

Sample subject line

Failure of job(s) 919137 due to a hardware problem at OSC

Apparent sender

OSC Help <OSCHelp@osc.edu>

Explanation

Your job failed and was not at fault. You should resubmit the job.

Sent under these circumstances

These emails are sent by a systems administrator after a node crashes.

To turn off the emails

We don’t have a mechanism to turn off these emails. If they really bother you, contact OSC Help and we’ll try to accommodate you.

To prevent these problems

Hardware crashes are quite rare and in most cases there’s nothing you can do to prevent them. Certain types of bus errors on Glenn correlate strongly with certain applications (suggesting that they’re not really hardware errors). If you encounter this type of error you may be advised to use Oakley rather than Glenn.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Job failure due to a system software problem

Purpose

Notify you that one or more of your jobs was running on a compute node that crashed due to a system software problem.

Sample subject line

Failure of job(s) 919137 due to a system software problem at OSC

Apparent sender

OSC Help <OSCHelp@osc.edu>

Explanation

Your job failed and was not at fault. You should resubmit the job. Usually the problems are caused by another job running on the node.

Sent under these circumstances

These emails are sent by a systems administrator as part of the node cleanup process.

To turn off the emails

We don’t have a mechanism to turn off these emails. If they really bother you, contact OSC Help and we’ll try to accommodate you.

To prevent these problems

If you request a whole node (nodes=1:ppn=12 on Oakley or nodes=1:ppn=8 on Glenn) your jobs will be less susceptible to problems caused by other jobs. Other than that, be assured that we work hard to keep jobs from interfering with each other.

For assistance

Contact OSC Help. See our Support Services page for more contact information.

Job failure due to exhaustion of physical memory

Purpose

Notify you that one or more of your jobs caused compute nodes to crash with an out-of-memory error.

Sample subject line

Failure of job(s) 933014,933174 at OSC due to exhaustion of physical memory

Apparent sender

OSC Help <oschelp@osc.edu>

Explanation

Your job(s) exhausted both physical memory and swap space during job execution. This failure caused the compute node(s) used by the job(s) to crash, requiring a reboot.

Sent under these circumstances

These emails are sent by a systems administrator as part of the node cleanup process.

To turn off the emails

You cannot turn off these emails. Please don’t ignore them because they report a problem that you must correct.

To prevent these problems

See the Knowledge Base article "Out-of-Memory (OOM) or Excessive Memory Usage" for suggestions on dealing with out-of-memory problems.

For information on the memory available on the various systems, see our Supercomputing page.

Notes

If you continue to submit jobs that cause these problems, your HPC account may be blocked.

For assistance

We will work with you to get your jobs running within the constraints of the system. Contact OSC Help for assistance. See our Support Services page for more contact information.