Users are strongly encouraged to consider the available per-core memory when requesting OSC resources for their jobs.
|Node type||Default memory per core (GB)||Max usable memory (GB)|
|Regular dense compute node||4.214||117.98|
|Huge memory node||31.104||1,492.96|
It is recommended to let the default memory apply unless more control over memory is needed.
Note that if an entire node is requested, then the job is automatically granted the entire node's main memory. On the other hand, if a partial node is requested, then memory is granted based on the default memory per core.
See a more detailed explanation below.
Regular Dense Compute Node
On Owens, the default memory allocation equates to 4,315 MB/core, or 120,820 MB/node (117.98 GB/node), for the regular dense compute node.
If your job requests less than a full node (--ntasks < 28), it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4,315 MB/core). For example, without any memory request (--mem=XX):
- A job that requests --nodes=1 --ntasks=1 will be assigned one core and should use no more than 4,315 MB of RAM.
- A job that requests --nodes=1 --ntasks=3 will be assigned 3 cores and should use no more than 3*4,315 MB of RAM.
- A job that requests --nodes=1 --ntasks=28 will be assigned the whole node (28 cores) with 118 GB of RAM.
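As a sketch, a partial-node job script under these defaults might look like the following (the time limit is an arbitrary illustration; with no --mem line, the memory entitlement follows the 4,315 MB/core default):

```shell
#!/bin/bash
# Hypothetical Owens job script: 3 cores on a (possibly shared) node,
# with no explicit memory request.
#SBATCH --nodes=1
#SBATCH --ntasks=3
#SBATCH --time=00:30:00
# Without --mem, the job may use up to 3 * 4315 MB of RAM:
echo "default memory entitlement: $((3 * 4315)) MB"
```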
Here is some information for when you include a memory request (--mem=XX) in your job:
- A job that requests --nodes=1 --ntasks=1 --mem=12GB will be assigned three cores, have access to 12 GB of RAM, and be charged for 3 cores worth of usage (in other words, the --ntasks request is ignored).
- A job that requests --nodes=1 --ntasks=5 --mem=12GB will be assigned 5 cores but have access to only 12 GB of RAM, and be charged for 5 cores worth of usage.
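The core charge implied by a memory request can be estimated by dividing the request by the per-core default and rounding up; a rough sketch of that arithmetic (the 12 GB figure matches the example above):

```shell
#!/bin/bash
# Estimate cores charged for a --mem=12G request on Owens,
# given the 4315 MB/core default.
MEM_MB=12288           # 12 GB expressed in MB
PER_CORE=4315
# Ceiling division: cores = ceil(MEM_MB / PER_CORE)
CORES=$(( (MEM_MB + PER_CORE - 1) / PER_CORE ))
echo "cores charged: $CORES"
```

This reproduces the three-core assignment described above: 12 GB does not fit in two cores' worth of memory (2 * 4,315 MB = 8,630 MB), so a third core is charged.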
A multi-node job (--nodes > 1) will be assigned entire nodes, with 118 GB/node, and charged for the entire nodes regardless of the --ntasks-per-node request. For example, a job that requests --nodes=10 --ntasks-per-node=1 will be charged for 10 whole nodes (28 cores/node * 10 nodes, which is 280 cores worth of usage).
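The whole-node charge can be sketched as follows (node and core counts taken from the example above):

```shell
#!/bin/bash
# Hypothetical multi-node request: charged for whole nodes
# regardless of --ntasks-per-node.
#SBATCH --nodes=10
#SBATCH --ntasks-per-node=1
# 10 nodes * 28 cores/node are charged even though only 10 tasks run:
echo "cores charged: $((10 * 28))"
```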
Huge Memory Node
On Owens, the default memory allocation equates to 31,850 MB/core, or 1,528,800 MB/node (1,492.96 GB/node), for a huge memory node.
To request no more than a full huge memory node, you have two options:
- The first option is to specify a memory request between 120,832 MB (118 GB) and 1,528,800 MB (1,492.96 GB), i.e., 120832MB <= mem <= 1528800MB (118GB <= mem < 1493GB). Note: only integer values can be used in the request.
- The other option is to use the combination --ntasks-per-node=4 --partition=hugemem. When no memory is specified for the huge memory node, your job is entitled to a memory allocation proportional to the number of cores requested (31,850 MB/core). Note: --ntasks-per-node should be no less than 4 and no more than 48.
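A minimal sketch of the second option (the task count shown is the minimum of 4; with no --mem line, the entitlement follows the 31,850 MB/core default):

```shell
#!/bin/bash
# Hypothetical huge-memory job script using the core-count option.
#SBATCH --partition=hugemem
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
# Without --mem, the job is entitled to 4 * 31850 MB of RAM:
echo "memory entitlement: $((4 * 31850)) MB"
```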
To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.
There is only one GPU per GPU node on Owens.
For serial jobs, we allow node sharing on GPU nodes, so a job may request any number of cores up to 28 (--nodes=1 --ntasks=XX --gpus-per-node=1).
For parallel jobs (--nodes > 1), we do not allow node sharing.
See this GPU computing page for more information.
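A serial GPU request on a shared node might be sketched as below (the core count of 14 is an arbitrary illustration; any value up to 28 is allowed):

```shell
#!/bin/bash
# Hypothetical serial GPU job script on an Owens GPU node
# (one GPU per node; node sharing allowed for serial jobs).
#SBATCH --nodes=1
#SBATCH --ntasks=14
#SBATCH --gpus-per-node=1
# Node sharing leaves the remaining cores available to other jobs:
echo "cores left for other jobs: $((28 - 14))"
```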
Partition time and job size limits
Here are the partitions available on Owens:
|Name||Max time limit||Min job size||Max job size||Notes|
|gpuserial||7-00:00:00||1 core||1 node|| |
|gpuparallel||4-00:00:00||2 nodes||8 nodes|| |
|hugemem-parallel||4-00:00:00||2 nodes||16 nodes|| |
|debug||1:00:00||1 core||2 nodes|| |
|gpudebug||1:00:00||1 core||2 nodes|| |
To request a specific partition, add --partition=<partition-name> to the sbatch command at submission time, or add this line to the job script: #SBATCH --partition=<partition-name>
To access one of the restricted queues, please contact OSC Help. Generally, access will only be granted to these queues if the performance of the job cannot be improved, and job size cannot be reduced by splitting or checkpointing the job.
|Max Running Job Limit||Max Core/Processor Limit|
|For all types||GPU jobs||Regular debug jobs||GPU debug jobs||For all types|
An individual user can have up to the max number of concurrently running jobs and/or up to the max number of processors/cores in use. Similarly, all users in a particular group/project combined can have up to the max number of concurrently running jobs and/or up to the max number of processors/cores in use.