We strongly suggest that users compare their job's expected memory use against the available per-core memory when requesting OSC resources. See Charging for memory use for more details.
Regular Compute Node
For a regular compute node, the physical memory is 4.8 GB/core or 192 GB/node, while the usable memory is 4761 MB/core or 183 GB/node. See Changes of Default Memory Limits for further discussion.
If your job requests less than a full node (ppn < 40), it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4761 MB/core). For example, without any memory request (mem=XX), a job that requests nodes=1:ppn=1 will be assigned one core and should use no more than 4761 MB of RAM, a job that requests nodes=1:ppn=3 will be assigned 3 cores and should use no more than 14283 MB of RAM, and a job that requests nodes=1:ppn=40 will be assigned the whole node (40 cores).
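The proportional allocation described above can be sketched as a small calculation. This is only an illustration of the arithmetic, not scheduler code; the function name default_memory_mb is hypothetical, and the sketch covers shared-node requests (ppn < 40) only, since a whole-node job simply gets the node.

```python
# Usable memory per core on a Pitzer regular compute node (from the text above).
USABLE_MB_PER_CORE = 4761
CORES_PER_NODE = 40


def default_memory_mb(ppn: int) -> int:
    """Memory (MB) a shared single-node job is entitled to when no mem= is given.

    Hypothetical helper: only models requests for less than a full node.
    """
    if not 1 <= ppn < CORES_PER_NODE:
        raise ValueError(f"ppn must be between 1 and {CORES_PER_NODE - 1}")
    return ppn * USABLE_MB_PER_CORE


for ppn in (1, 3):
    print(f"nodes=1:ppn={ppn} -> {default_memory_mb(ppn)} MB")
```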
Please be careful if you include a memory request (mem=XX) in your job. A job that requests nodes=1:ppn=1,mem=14283mb will be assigned one core, have access to 14283 MB of RAM, and be charged for 3 cores worth of Resource Units (RU). However, a job that requests nodes=1:ppn=5,mem=14283mb will be assigned 5 cores but have access to only 14283 MB of RAM, and be charged for 5 cores worth of Resource Units (RU).
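Both examples above are consistent with charging for whichever is larger: the cores requested, or the number of 4761 MB per-core shares the memory request occupies. The sketch below encodes that rule as an assumption inferred from these two examples; the function name charged_cores is hypothetical, and the authoritative rule is the one in Charging for memory use.

```python
USABLE_MB_PER_CORE = 4761  # usable memory per core on a regular compute node


def charged_cores(ppn: int, mem_mb: int) -> int:
    """Cores' worth of RU charged for a single-node job (assumed rule).

    Assumption: the charge is the larger of the cores requested and the
    number of per-core memory shares needed to cover the memory request.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    cores_for_memory = (mem_mb + USABLE_MB_PER_CORE - 1) // USABLE_MB_PER_CORE
    return max(ppn, cores_for_memory)


print(charged_cores(ppn=1, mem_mb=14283))  # 3
print(charged_cores(ppn=5, mem_mb=14283))  # 5
```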
A multi-node job (nodes > 1) will be assigned entire nodes and charged for entire nodes regardless of the ppn request. For example, a job that requests nodes=10:ppn=1 will be charged for 10 whole nodes (40 cores/node * 10 nodes, which is 400 cores worth of RU).
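The multi-node charge above is a single multiplication, shown here as a minimal sketch. The function name is hypothetical, and the 40 cores/node figure applies to regular compute nodes as stated in the text.

```python
CORES_PER_NODE = 40  # cores on a Pitzer regular compute node


def charged_cores_multinode(nodes: int, ppn: int) -> int:
    """Cores' worth of RU charged for a multi-node job.

    ppn is accepted only to show that it does not affect the charge.
    """
    if nodes < 2:
        raise ValueError("this sketch only covers multi-node jobs (nodes > 1)")
    return nodes * CORES_PER_NODE


print(charged_cores_multinode(nodes=10, ppn=1))  # 400
```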
For a GPU node, the physical memory is 9.6 GB/core or 384 GB/node, while the memory used by the submit filter is 4761 MB/core or 374 GB/node.
Any job that requests more than 183 GB/node but no more than 374 GB/node will be scheduled on a GPU node (the 'largemem' queue).
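The memory thresholds above can be read as a simple routing rule. The sketch below is an illustration under assumptions: the function name and return labels are hypothetical, and sending requests above 374 GB to the huge-memory node (up to the 3019 GB usable figure given elsewhere in this document) is an inference, not a stated scheduler behavior.

```python
REGULAR_MAX_GB = 183    # usable memory per regular compute node
GPU_MAX_GB = 374        # submit-filter memory per GPU node
HUGEMEM_MAX_GB = 3019   # usable memory on the huge-memory node


def node_type_for_memory(mem_gb_per_node: float) -> str:
    """Pick a node type from a per-node memory request (illustrative only)."""
    if mem_gb_per_node <= 0:
        raise ValueError("memory request must be positive")
    if mem_gb_per_node <= REGULAR_MAX_GB:
        return "regular"
    if mem_gb_per_node <= GPU_MAX_GB:
        return "gpu (largemem)"
    if mem_gb_per_node <= HUGEMEM_MAX_GB:
        return "hugemem"  # assumption: larger requests need the huge-memory node
    raise ValueError("request exceeds the memory of any Pitzer node")


print(node_type_for_memory(100))
print(node_type_for_memory(200))
```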
Huge Memory Node
Node sharing is not allowed on the huge memory nodes. A job that requests a huge-memory node (nodes=1:ppn=80,mem=3000GB) will be allocated the entire huge-memory node with 3019 GB of RAM and charged for the whole node (80 cores worth of RU).
In summary, for serial jobs requesting a regular compute or GPU node, we allocate resources based on both the ppn and the memory request. For parallel jobs (nodes>1) or huge memory jobs, we allocate entire nodes with all of their memory, regardless of the ppn request. The table below summarizes the physical and usable memory of the different types of nodes on Pitzer. To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.
|Type of node||Scope||Physical Memory||Usable Memory|
|Regular compute||Per core||4.8 GB||4761 MB|
|Regular compute||Per node||192 GB (40 cores)||183 GB|
|GPU||Per core||9.6 GB||4761 MB|
|GPU||Per node||384 GB (40 cores)||374 GB|
|Huge memory||Per core||37.5 GB||n/a|
|Huge memory||Per node||3 TB (80 cores)||3019 GB|
There are 2 GPUs per node on Pitzer.
For serial jobs, we allow node sharing on GPU nodes, so a job may request any number of cores (up to 40) and either 1 or 2 GPUs (nodes=1:ppn=XX:gpus=1 or nodes=1:ppn=XX:gpus=2).
For parallel jobs (nodes>1), we do not allow node sharing. A job may request 1 or 2 GPUs (gpus=1 or gpus=2), but both GPUs on each node will be allocated to the job.
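The difference between serial and parallel GPU jobs described above can be sketched as follows. This is a minimal illustration; the function name gpus_allocated_per_node is hypothetical.

```python
GPUS_PER_NODE = 2  # each Pitzer GPU node has 2 GPUs


def gpus_allocated_per_node(nodes: int, gpus_requested: int) -> int:
    """GPUs a job actually receives on each node it is given."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if gpus_requested not in (1, 2):
        raise ValueError("a job may request 1 or 2 GPUs")
    if nodes > 1:
        # Parallel jobs do not share nodes, so both GPUs go to the job.
        return GPUS_PER_NODE
    # Serial jobs may share the node and get exactly what they asked for.
    return gpus_requested


print(gpus_allocated_per_node(nodes=1, gpus_requested=1))  # 1
print(gpus_allocated_per_node(nodes=4, gpus_requested=1))  # 2
```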
Here are the queues available on Pitzer:
|Name||Max walltime||Nodes available||Min job size||Max job size||Notes|
|Serial||168 hours||Available minus reservations||1 core||1 node|
|Longserial||336 hours||Available minus reservations||1 core||1 node||Restricted access|
|Parallel||96 hours||Available minus reservations||2 nodes||40 nodes|
|Longparallel||TBD||Available minus reservations||2 nodes||TBD||Restricted access|
|Hugemem||168 hours||4 nodes||1 node||1 node|
|Parallel hugemem||TBD||4 nodes||2 nodes||4 nodes||Not supported for now|
|Debug-regular||1 hour||6 nodes||1 core||2 nodes|
|Debug-GPU||1 hour||2 nodes||1 core||2 nodes|
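As a rough aid for reading the queue table, the sketch below checks whether a request's node count and walltime fit within a queue's listed limits. It is an illustration only: the dictionary and function names are hypothetical, queues whose limits are listed as TBD are left out, and it does not model access restrictions or node availability.

```python
# (max walltime in hours, min nodes, max nodes), copied from the queue table.
QUEUE_LIMITS = {
    "Serial": (168, 1, 1),
    "Longserial": (336, 1, 1),
    "Parallel": (96, 2, 40),
    "Hugemem": (168, 1, 1),
    "Debug-regular": (1, 1, 2),
    "Debug-GPU": (1, 1, 2),
}


def fits_queue(queue: str, nodes: int, walltime_hours: float) -> bool:
    """Return True if the request is within the queue's size and time limits."""
    max_hours, min_nodes, max_nodes = QUEUE_LIMITS[queue]
    return min_nodes <= nodes <= max_nodes and 0 < walltime_hours <= max_hours


print(fits_queue("Parallel", nodes=10, walltime_hours=48))
print(fits_queue("Serial", nodes=1, walltime_hours=200))
```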
|Soft Max Running Job limit||Hard Max Running Job Limit||Max Core Limit|
Which of the soft and hard max limits above applies depends on system resource availability. If resources are scarce, the soft max limit is used to make resource allocation fairer. Otherwise, if there are idle resources, the hard max limit is used to increase system utilization.
An individual user can have up to the max number of concurrently running jobs and/or up to the max number of processors/cores in use.
However, all the users in a particular group/project combined can have up to the max number of concurrently running jobs and/or up to the max number of processors/cores in use.
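The choice between the soft and hard limits described above can be sketched as below. The function names and the numbers in the usage lines are hypothetical, chosen only for illustration, and treating the job limit and the core limit as conditions that must both hold is an assumption about how "and/or" is applied.

```python
def effective_job_limit(soft_max: int, hard_max: int, resources_scarce: bool) -> int:
    """Soft limit when resources are scarce, hard limit when there is idle capacity."""
    return soft_max if resources_scarce else hard_max


def may_start(running_jobs: int, cores_in_use: int, cores_requested: int,
              job_limit: int, core_limit: int) -> bool:
    """Assumed check: one more job must stay within both the job and core limits."""
    within_jobs = running_jobs + 1 <= job_limit
    within_cores = cores_in_use + cores_requested <= core_limit
    return within_jobs and within_cores


# Hypothetical numbers, not OSC's actual limits.
limit = effective_job_limit(soft_max=100, hard_max=200, resources_scarce=True)
print(may_start(running_jobs=99, cores_in_use=400, cores_requested=40,
                job_limit=limit, core_limit=1000))
```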