Batch Limit Rules

Memory Limit:

It is strongly suggested that users consider their memory use relative to the available per-core memory when requesting OSC resources for their jobs. See Charging for memory use for more details.

Regular Compute Node

For a regular compute node, the physical memory equates to 4.8 GB/core or 192 GB/node, while the usable memory equates to 4761 MB/core or 183 GB/node. See Changes of Default Memory Limits for more discussion.

If your job requests less than a full node ( ppn < 40 ), it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4761 MB/core). For example, without any memory request ( mem=XX ), a job that requests nodes=1:ppn=1 will be assigned one core and should use no more than 4761 MB of RAM, a job that requests nodes=1:ppn=3 will be assigned 3 cores and should use no more than 14283 MB of RAM, and a job that requests nodes=1:ppn=40 will be assigned the whole node (40 cores).
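As a minimal sketch of such a request (the job name, walltime, and executable below are illustrative placeholders, not prescribed values), a script relying on the default per-core memory allocation could look like this:

    #!/bin/bash
    #PBS -N serial_example
    #PBS -l nodes=1:ppn=3
    #PBS -l walltime=1:00:00
    # No mem request: with ppn=3 the job is entitled to 3 * 4761 MB = 14283 MB of RAM.

    cd $PBS_O_WORKDIR
    ./my_program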

Please be careful if you include a memory request ( mem=XX ) in your job. A job that requests nodes=1:ppn=1,mem=14283mb will be assigned one core, have access to 14283 MB of RAM, and be charged for 3 cores' worth of Resource Units (RU). However, a job that requests nodes=1:ppn=5,mem=14283mb will be assigned 5 cores but have access to only 14283 MB of RAM, and be charged for 5 cores' worth of RU.
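As a sketch of the first case above (job name and executable are placeholders), an explicit 14283 MB memory request on a single core is charged as 3 cores' worth of RU:

    #!/bin/bash
    #PBS -N mem_example
    #PBS -l nodes=1:ppn=1
    #PBS -l mem=14283mb
    #PBS -l walltime=1:00:00
    # One core is assigned and 14283 MB of RAM is available, but the job is
    # charged for 3 cores' worth of RU (14283 MB / 4761 MB per core).

    cd $PBS_O_WORKDIR
    ./my_program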

A multi-node job ( nodes > 1 ) will be assigned whole nodes and charged for whole nodes regardless of the ppn request. For example, a job that requests nodes=10:ppn=1 will be charged for 10 whole nodes (40 cores/node * 10 nodes, i.e., 400 cores' worth of RU).
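A hedged sketch of a multi-node request follows (the MPI launcher and executable are placeholders); whether ppn=1 or ppn=40 is specified, all 10 nodes are allocated and charged:

    #!/bin/bash
    #PBS -N parallel_example
    #PBS -l nodes=10:ppn=40
    #PBS -l walltime=2:00:00
    # Even a request of nodes=10:ppn=1 would be charged for 10 whole nodes (400 cores).

    cd $PBS_O_WORKDIR
    mpiexec ./my_mpi_program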

GPU Node

For a GPU node, the physical memory equates to 9.6 GB/core or 384 GB/node, while the memory used by the submit filter equates to 4761 MB/core or 374 GB/node.

Huge Memory Node

Node sharing is not allowed for huge memory nodes. A job that requests a huge-memory node ( nodes=1:ppn=80 ) will be allocated the entire huge-memory node with 3019 GB of RAM and charged for the whole node (80 cores' worth of RU).
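A minimal sketch of a huge-memory request (walltime and executable are placeholders):

    #!/bin/bash
    #PBS -N hugemem_example
    #PBS -l nodes=1:ppn=80
    #PBS -l walltime=12:00:00
    # ppn=80 requests a huge-memory node; the whole node (3019 GB of RAM) is allocated and charged.

    cd $PBS_O_WORKDIR
    ./my_large_memory_program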

Summary

In summary, for serial jobs requesting a regular compute or GPU node, we will allocate resources considering both the ppn and the memory request. For parallel jobs ( nodes > 1 ) or huge-memory jobs, we will allocate entire nodes with all of their memory regardless of the ppn request. Below is a summary of the physical and usable memory of the different types of nodes on Pitzer. To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.

 
Type of node               Physical Memory      Usable Memory
Regular compute  Per core  4.8 GB               4761 MB
                 Per node  192 GB (40 cores)    183 GB
GPU              Per core  9.6 GB               4761 MB
                 Per node  384 GB (40 cores)    374 GB
Huge memory      Per core  37.5 GB              n/a
                 Per node  3 TB (80 cores)      3019 GB

GPU Jobs

There are 2 GPUs per node on Pitzer.

For serial jobs, we will allow node sharing on GPU nodes, so a job may request any number of cores (up to 40) and either 1 or 2 GPUs ( nodes=1:ppn=XX:gpus=1 or gpus=2 ).

For parallel jobs ( nodes > 1 ), we will not allow node sharing. A job may request 1 or 2 GPUs ( gpus=1 or gpus=2 ), but both GPUs will be allocated to the job.
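For instance, a sketch of a shared single-node GPU request (the core count, walltime, and executable are placeholders):

    #!/bin/bash
    #PBS -N gpu_example
    #PBS -l nodes=1:ppn=10:gpus=1
    #PBS -l walltime=4:00:00
    # Serial job on a GPU node: 10 cores and 1 GPU; the node may be shared with other jobs.

    cd $PBS_O_WORKDIR
    ./my_gpu_program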

Walltime Limit

Here are the queues available on Pitzer:

Name              Max walltime  Nodes available               Min job size  Max job size  Notes
Serial            168 hours     Available minus reservations  1 core        1 node
Longserial        336 hours     Available minus reservations  1 core        1 node        Restricted access
Parallel          96 hours      Available minus reservations  2 nodes       40 nodes
Longparallel      TBD           Available minus reservations  2 nodes       TBD           Restricted access
Hugemem           48 hours      4 nodes                       1 node        1 node
Parallel hugemem  TBD           4 nodes                       2 nodes       4 nodes       Not supported for now
Debug-regular     1 hour        6 nodes                       1 core        2 nodes       -q debug
Debug-GPU         1 hour        2 nodes                       1 core        2 nodes       -q debug
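For example, a short test run can be routed to a debug queue with -q debug, either on the qsub command line or as a directive in the job script (the script name below is a placeholder):

    # On the command line:
    qsub -q debug -l nodes=1:ppn=40 -l walltime=0:30:00 test_job.pbs

    # Or inside the job script:
    #PBS -q debug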

 

Job Limit

An individual user can have up to 128 concurrently running jobs and/or up to 2040 processor cores (51 nodes, ~22% of the whole system) in use. All users in a particular group/project together can have up to 192 concurrently running jobs and/or up to 2040 processor cores (51 nodes, ~22% of the whole system) in use.

A user may have no more than 1000 jobs submitted to each of the parallel and serial job queues (counted separately). Jobs submitted in excess of this limit will be rejected.
