When requesting OSC resources for a job, we strongly suggest comparing the job's expected memory use with the memory available per core.
Regular Compute Node
On a regular compute node, the physical memory is 4.8 GB/core or 192 GB/node, while the usable memory is 4556 MB/core or 178 GB/node. See Changes of Default Memory Limits for more discussion.
If your job requests less than a full node (ppn < 40), it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4556 MB/core). For example, without any memory request (mem=XX), a job that requests nodes=1:ppn=1 will be assigned one core and should use no more than 4556 MB of RAM, a job that requests nodes=1:ppn=3 will be assigned 3 cores and should use no more than 3*4556 MB of RAM, and a job that requests nodes=1:ppn=40 will be assigned the whole node (40 cores).
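The proportional default described above is simple arithmetic. The following Python sketch is illustrative only: the function name default_memory_mb and the constants are hypothetical names chosen for this example, not part of any OSC tool.

```python
# Values taken from the text above for a regular compute node.
USABLE_MB_PER_CORE = 4556
CORES_PER_NODE = 40


def default_memory_mb(ppn: int) -> int:
    """Memory (in MB) a single-node job is entitled to without a mem= request."""
    if not 1 <= ppn <= CORES_PER_NODE:
        raise ValueError(f"ppn must be between 1 and {CORES_PER_NODE}")
    return ppn * USABLE_MB_PER_CORE


# The three requests used as examples in the text.
for ppn in (1, 3, 40):
    print(f"nodes=1:ppn={ppn} -> {default_memory_mb(ppn)} MB")
```

For ppn=40 the result is 182240 MB, which matches the 178 GB of usable memory per node when 1 GB is counted as 1024 MB.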
Please be careful if you include a memory request (mem=XX) in your job. A job that requests nodes=1:ppn=1,mem=14283mb will be assigned one core, have access to 14283 MB of RAM, and be charged for 4 cores' worth of usage, because 14283 MB is more than the share of 3 cores (3*4556 MB = 13668 MB). However, a job that requests nodes=1:ppn=5,mem=14283mb will be assigned 5 cores but have access to only 14283 MB of RAM, and will be charged for 5 cores' worth of usage.
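Both charging examples above are consistent with charging for whichever is larger: the cores requested, or the number of 4556 MB shares needed to cover the memory request. The sketch below encodes that rule as an assumption inferred from the two examples; it is not OSC's actual accounting code, and the function name charged_cores is hypothetical.

```python
import math

USABLE_MB_PER_CORE = 4556  # usable memory per core on a regular compute node


def charged_cores(ppn: int, mem_mb: int) -> int:
    """Assumed rule: charge the larger of cores requested and memory shares requested."""
    cores_for_memory = math.ceil(mem_mb / USABLE_MB_PER_CORE)
    return max(ppn, cores_for_memory)


print(charged_cores(1, 14283))  # nodes=1:ppn=1,mem=14283mb -> 4
print(charged_cores(5, 14283))  # nodes=1:ppn=5,mem=14283mb -> 5
```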
A multi-node job (nodes > 1) will be assigned entire nodes and charged for entire nodes regardless of the ppn request. For example, a job that requests nodes=10:ppn=1 will be charged for 10 whole nodes (40 cores/node * 10 nodes, which is 400 cores' worth of usage).
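The multi-node rule can be written down in a few lines. This is a minimal sketch assuming regular 40-core nodes; the function name charged_cores_multinode is hypothetical.

```python
CORES_PER_NODE = 40  # cores on a regular compute node


def charged_cores_multinode(nodes: int) -> int:
    """Cores charged for a multi-node job; the ppn request does not matter."""
    if nodes < 2:
        raise ValueError("this rule applies only to jobs with nodes > 1")
    return nodes * CORES_PER_NODE


print(charged_cores_multinode(10))  # nodes=10:ppn=1 -> 400
```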
On a GPU node, the physical memory is 9.6 GB/core or 384 GB/node, while the memory used by the submit filter is 4556 MB/core or 363 GB/node.
Any job that requests more than 178 GB/node but no more than 363 GB/node will be scheduled on a GPU node (this is called the 'largemem' queue).
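The memory thresholds above can be read as a simple classification. The sketch below looks at the per-node memory request only and ignores every other factor the scheduler may consider; the function name and the returned labels are hypothetical, and the text does not say here what happens to requests above 363 GB/node, so that case is only flagged.

```python
REGULAR_MAX_GB = 178  # usable memory on a regular compute node
GPU_MAX_GB = 363      # submit-filter memory on a GPU node


def node_class_for_memory(mem_gb_per_node: float) -> str:
    """Classify a per-node memory request using the two thresholds in the text."""
    if mem_gb_per_node <= REGULAR_MAX_GB:
        return "regular"
    if mem_gb_per_node <= GPU_MAX_GB:
        return "largemem"  # scheduled on a GPU node
    return "above largemem"  # not covered by the rule quoted above


for request_gb in (100, 178, 200, 363, 400):
    print(request_gb, "GB/node ->", node_class_for_memory(request_gb))
```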
Huge Memory Node
Each huge memory node on Pitzer has 80 cores. The physical memory is 37.5 GB/core or 3 TB/node, while the memory used by the submit filter is 4556 MB/core or 2989 GB/node.
Please always specify a memory limit if your job requests a huge memory node; otherwise, we will allocate only 4556 MB/core for your job. For example, a job that requests nodes=1:ppn=60,mem=2250GB will be allocated 60 cores with 2250 GB of RAM and charged for 60 cores' worth of usage. A job that requests an entire huge memory node (nodes=1:ppn=80,mem=2989GB) will be allocated the whole node with 2989 GB of RAM and charged for the whole node.
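To see why the explicit memory limit matters on a huge memory node, compare the default allocation with an explicit request. This is a minimal sketch under the 4556 MB/core default stated above; the function name hugemem_allocation_mb is hypothetical, and 1 GB is counted as 1024 MB.

```python
from typing import Optional

USABLE_MB_PER_CORE = 4556  # default per-core allocation when mem= is omitted


def hugemem_allocation_mb(ppn: int, mem_mb: Optional[int] = None) -> int:
    """Memory (in MB) allocated to a single-node huge memory job."""
    if mem_mb is not None:
        return mem_mb  # an explicit mem= request is honored
    return ppn * USABLE_MB_PER_CORE  # otherwise only the per-core default


without_mem = hugemem_allocation_mb(60)
with_mem = hugemem_allocation_mb(60, 2250 * 1024)
print(f"ppn=60 without mem=: {without_mem} MB")
print(f"ppn=60,mem=2250GB:   {with_mem} MB")
```

Without a memory request, the 60-core job would receive 273360 MB (about 267 GB), far less than the 2250 GB it could have asked for.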
In summary, for serial jobs, we allocate resources based on both the ppn and memory requests. For parallel jobs (nodes>1), we allocate entire nodes with all of their memory regardless of the ppn request. The table below summarizes the physical and usable memory of the different types of nodes on Pitzer. To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.
|Type of node||Scope||Physical Memory||Usable Memory|
|Regular compute||Per core||4.8 GB||4556 MB|
|Regular compute||Per node||192 GB (40 cores)||178 GB|
|GPU||Per core||9.6 GB||4556 MB|
|GPU||Per node||384 GB (40 cores)||363 GB|
|Huge memory||Per core||37.5 GB||4556 MB|
|Huge memory||Per node||3 TB (80 cores)||2989 GB|
There are 2 GPUs per node on Pitzer.
For serial jobs, we allow node sharing on GPU nodes, so a job may request any number of cores (up to 40) and either 1 or 2 GPUs (nodes=1:ppn=XX:gpus=1 or gpus=2).
For parallel jobs (nodes>1), we do not allow node sharing. A job may request 1 or 2 GPUs (gpus=1 or gpus=2), but both GPUs will be allocated to the job.
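The two GPU rules above can be summarized in a small function. This is a sketch of the stated policy, not scheduler code; the function name gpus_allocated_per_node is hypothetical.

```python
GPUS_PER_NODE = 2  # each Pitzer GPU node has 2 GPUs


def gpus_allocated_per_node(nodes: int, gpus_requested: int) -> int:
    """GPUs actually allocated on each node for a given request."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if gpus_requested not in (1, 2):
        raise ValueError("a job may request gpus=1 or gpus=2")
    if nodes > 1:
        # Parallel jobs do not share nodes, so both GPUs go to the job.
        return GPUS_PER_NODE
    # Serial jobs may share the node and get what they asked for.
    return gpus_requested


print(gpus_allocated_per_node(1, 1))  # serial job, gpus=1 -> 1
print(gpus_allocated_per_node(4, 1))  # parallel job, gpus=1 -> 2
```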
Here are the queues available on Pitzer:
|Name||Max walltime||nodes available||min job size||max job size||notes|
|Serial||168 hours||Available minus reservations||1 core||1 node|
|Longserial||336 hours||Available minus reservations||1 core||1 node||Restricted access|
|Parallel||96 hours||Available minus reservations||2 nodes||40 nodes|
|Longparallel||TBD||Available minus reservations||2 nodes||TBD||Restricted access|
|Hugemem||168 hours||4 nodes||1 core||1 node|
|Parallel hugemem||TBD||4 nodes||2 nodes||4 nodes||Not currently supported|
|Debug-regular||1 hour||6 nodes||1 core||2 nodes|
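As a quick pre-submission check, the walltime limits in the table above can be encoded as a lookup. The sketch below copies the limits from the table and leaves the TBD entries undefined; the dictionary and function names are hypothetical, and the real limits may change.

```python
# Max walltime in hours, copied from the queue table; None marks a TBD entry.
MAX_WALLTIME_HOURS = {
    "serial": 168,
    "longserial": 336,
    "parallel": 96,
    "longparallel": None,
    "hugemem": 168,
    "parallel hugemem": None,
    "debug-regular": 1,
}


def within_walltime(queue: str, hours: float) -> bool:
    """Return True if the requested walltime fits the queue's limit."""
    limit = MAX_WALLTIME_HOURS[queue.lower()]
    if limit is None:
        raise ValueError(f"walltime limit for '{queue}' is listed as TBD")
    return hours <= limit


print(within_walltime("Serial", 100))   # True
print(within_walltime("Parallel", 97))  # False
```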
|Soft Max Running Job Limit||Hard Max Running Job Limit||Max Core Limit|
Which of the soft and hard max limits above applies depends on system resource availability. If resources are scarce, the soft max limit is used to make the allocation of resources fairer. If there are idle resources, the hard max limit is used to increase system utilization.
An individual user can have up to the maximum number of concurrently running jobs and/or the maximum number of processors/cores in use.
In addition, all the users in a particular group/project combined can have up to the group maximum of concurrently running jobs and/or processors/cores in use.