Batch Limit Rules

Pitzer includes two types of processors, Intel® Xeon® 'Skylake' processor and Intel® Xeon® 'Cascade Lake' processor. This document provides you information on how to request resources based on the requirements of # of cores, memory, etc despite the heterogeneous nature of the Pitzer cluster. Therefore, in some cases, your job can land on either type of processor. Please check this page on how to request resources due to the heterogeneous nature and limit your job to a certain type of processor on Pitzer.
We use Slurm syntax for all the discussions on this page. Please check this page if your script is prepared in PBS syntax. 

Memory limit

It is strongly suggested to consider the memory use to the available per-core memory when users request OSC resources for their jobs. 

Regular Compute Node

  • For the regular 'Skylake' processor-based node, it has 40 cores/node. The physical memory equates to 4.8 GB/core or 192 GB/node; while the usable memory equates to 4,556MB/core or 182,240MB/node (177.96GB/node).
  • For the regular 'Cascade Lake' processor-based node, it has 48 cores/node. The physical memory equates to 4.0 GB/core or 192 GB/node; while the usable memory equates to 3,797MB/core or 182,256MB/node (177.98GB/node). 

Jobs requesting no more than 1 node

If your job requests less than a full node, it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4,556 MB/core or 3,797MB/core depending on which node your job lands on).  For example, without any memory request ( mem=XX ),

  • A job that requests --ntasks=1  and lands on 'Skylake' node will be assigned one core and should use no more than 4556 MB of RAM; a job that requests --ntasks=1  and lands on 'Cascade Lake' node will be assigned one core and should use no more than 3797 MB of RAM
  • A job that requests --ntasks=3  and lands on 'Skylake' node will be assigned 3 cores and should use no more than 3*4556 MB of RAM; a job that requests --ntasks=3  and lands on 'Cascade Lake' node will be assigned 3 cores and should use no more than 3*3797 MB of RAM
  • A job that requests  --ntasks=40  and lands on 'Skylake' node will be assigned the whole node (40 cores) with 178 GB of RAM; a job that requests --ntasks=40  and lands on 'Cascade Lake' node will be assigned 40 cores (partial node) and should use no more than 40* 3797 MB of RAM.  
Please be careful if you include memory request ( mem=XX ) in your job.

There is a known bug when using both --ntasks-per-node and --mem in the request so we suggest using only --mem or the combination of --ntasks and --mem (see this page for more information on the bug). 

  • A job that requests --ntasks=1 --mem=16000MB  and lands on 'Skylake' node will be assigned 4 cores and have access to 16000 MB of RAM, and charged for 4 cores worth of usage; a job that requests --ntasks=1 --mem=16000MB  and lands on 'Cascade Lake' node will be assigned 5 cores and have access to 16000 MB of RAM, and charged for 5 cores worth of usage
  • A job that requests --ntasks=8 --mem=16000MB  and lands on 'Skylake' node will be assigned 8 cores but have access to only 16000 MB of RAM , and charged for 8 cores worth of usage; a job that requests --ntasks=8 --mem=16000MB  and lands on 'Cascade Lake' node will be assigned 8 cores but have access to only 16000 MB of RAM , and charged for 8 cores worth of usage

Jobs requesting more than 1 node

A multi-node job ( --nodes > 1 ) will be assigned the entire nodes and charged for the entire nodes regardless of --ntasks or --ntasks-per-node request. For example, a job that requests --nodes=10 --ntasks-per-node=1  and lands on 'Skylake' node will be charged for 10 whole nodes (40 cores/node*10 nodes, which is 400 cores worth of usage); a job that requests --nodes=10 --ntasks-per-node=1  and lands on 'Cascade Lake' node will be charged for 10 whole nodes (48 cores/node*10 nodes, which is 480 cores worth of usage). We usually suggest not including --ntasks-per-node and using --ntasks if needed.   

Large Memory Node

On Pitzer, it has 48 cores per node. The physical memory equates to 16.0 GB/core or 768 GB/node; while the usable memory equates to 15,872MB/core or 761,856MB/node (744GB/node).

For any job that requests no less than 363 GB/node but no more than 744 GB/node, the job will be scheduled on the large memory node.To request no more than a full large memory node, you need to specify the memory request between 363GB and 744GB, i.e.,  363GB <= mem <=744GB. --mem is the total memory per node allocated to the job. You can request a partial large memory node, so consider your request more carefully when you plan to use a large memory node, and specify the memory based on what you will use. 

Huge Memory Node

On Pitzer, it has 80 cores per node. The physical memory equates to 37.5 GB/core or 3 TB/node; while the usable memory equates to 38,259MB/core or  3,060,720MB/node (2988.98GB/node).

 

To request no more than a full huge memory node, you have two options:

  • The first is to specify the memory request between 744GB and 2988GB, i.e., 744GB < mem <=2988GB).
  • The other option is to use the combination of --ntasks-per-node and --partition, like --ntasks-per-node=4 --partition=hugemem . When no memory is specified for the huge memory node, your job is entitled to a memory allocation proportional to the number of cores requested (38,259MB/core). Note, --ntasks-per-node should be no less than 20 and no more than 80 

Summary

In summary, for serial jobs, we will allocate the resources considering both the # of cores and the memory request. For parallel jobs (nodes>1), we will allocate the entire nodes with the whole memory regardless of other requests. Check requesting resources on pitzer for information about the usable memory of different types of nodes on Pitzer. To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.

GPU Jobs

Dual GPU Node

  • For the dual GPU node with 'Skylake' processor, it has 40 cores/node. The physical memory equates to 9.6 GB/core or 384 GB/node; while the usable memory equates to 9292 MB/core or 363 GB/node. Each node has 2 NVIDIA Volta V100 w/ 16GB GPU memory. 
  • For the dual GPU node with 'Cascade Lake' processor, it has 48 cores/node. The physical memory equates to 8.0 GB/core or 384 GB/node; while the usable memory equates to 7744 MB/core or 363 GB/node. Each node has 2 NVIDIA Volta V100 w/32GB GPU memory.  

For serial jobs, we will allow node sharing on GPU nodes so a job may request either 1 or 2 GPUs (--ntasks=XX --gpus-per-node=1 or --ntasks=XX --gpus-per-node=2)

For parallel jobs (nodes>1), we will not allow node sharing. A job may request 1 or 2 GPUs ( gpus-per-node=1 or gpus-per-node=2 ) but both GPUs will be allocated to the job.

Quad GPU Node

For quad GPU node, it has 48 cores/node. The physical memory equates to 16.0 GB/core or 768 GB/node; while the usable memory equates to 15,872 MB/core or 744 GB/node.. Each node has 4 NVIDIA Volta V100s w/32GB GPU memory and NVLink.

For serial jobs, we will allow node sharing on GPU nodes, so a job can land on a quad GPU node if it requests 3-4 GPUs per node (--ntasks=XX --gpus-per-node=3 or --ntasks=XX --gpus-per-node=4), or requests quad GPU node explicitly with using --gpus-per-node=v100-quad:G, or gets backfilled with requesting 1-2 GPUs per node with less than 4 hours long. 

For parallel jobs (nodes>1), only up to 2 quad GPU nodes can be requested in a single job. We will not allow node sharing and all GPUs will be allocated to the job.

Walltime and node limits

Here is the walltime and node limits per job for different queues/partitions available on Pitzer:

NAME

MAX WALLTIME

MIN JOB SIZE

MAX JOB SIZE

NOTES

Serial

168 hours

1 core

1 node

 

Longserial

336 hours

1 core

1 node

  • Restricted access 
  • To request it, add --partition=longserial

Parallel

96 hours

2 nodes

40 nodes 

 

Hugemem

168 hours

1 core

1 node

 

Largemem

168 hours

1 core

1 node

 

GPU serial

168 hours

1 core

1 node

Includes dual and quad GPU nodes

GPU parallel

96 hours

2 nodes

10 nodes

  • Includes and quad GPU nodes 
  • Only up to 2 quad GPU nodes can be requested in a single job

Debug-regular

1 hour

1 core

2 nodes

To request it, add --partition=debug

Debug-GPU

1 hour

1 core

2 nodes

To request it, add --partition=gpudebug

Total available nodes shown for pitzer may fluctuate depending on the amount of currently operational nodes and nodes reserved for specific projects.

To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if the performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.

    Job/Core Limits

    Max Running Job Limit  Max Core/Processor Limit
      For all types GPU jobs Regular debug jobs GPU debug jobs For all types
    Individual User 256 140 4 4 2160
    Project/Group 384 140 n/a n/a 2160

     

    An individual user can have up to the max concurrently running jobs and/or up to the max processors/cores in use. However, among all the users in a particular group/project, they can have up to the max concurrently running jobs and/or up to the max processors/cores in use.

    A user may have no more than 1000 jobs submitted to both the parallel and serial job queue separately.
    Supercomputer: 
    Service: