Users are strongly encouraged to consider the available per-core memory when requesting OSC resources for their jobs. On Owens, a regular dense compute node provides 4315 MB per core, or 118 GB per node.
If your job requests less than a full node (`ppn < 28`), it may be scheduled on a node with other running jobs. In this case, your job is entitled to a memory allocation proportional to the number of cores requested (4315 MB/core). For example, without any memory request (`mem=XX`), a job that requests `nodes=1:ppn=1` will be assigned one core and should use no more than 4315 MB of RAM, a job that requests `nodes=1:ppn=3` will be assigned 3 cores and should use no more than 3*4315 MB of RAM, and a job that requests `nodes=1:ppn=28` will be assigned the whole node (28 cores) with 118 GB of RAM.
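As a concrete illustration, here is a minimal sketch of a Torque/PBS batch script for the 3-core case above; the job name and executable are placeholders, and the memory comment simply restates the 3*4315 MB entitlement:

```bash
#!/bin/bash
#PBS -N serial_example        # hypothetical job name
#PBS -l nodes=1:ppn=3         # 3 cores on one regular dense node
#PBS -l walltime=01:00:00
# No mem=XX request: the job should use no more than
# 3 * 4315 MB (about 12.6 GB) of RAM on the shared node.

cd $PBS_O_WORKDIR
./my_program                  # placeholder executable
```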
If you include a memory request (`mem=XX`) in your job, the charge is effectively based on the larger of the cores requested and the cores needed to cover the requested memory at 4315 MB/core. For example, a job that requests `nodes=1:ppn=1,mem=12GB` will be assigned one core and have access to 12 GB of RAM, but will be charged for 3 cores' worth of usage (covering 12 GB takes three 4315 MB per-core shares). However, a job that requests `nodes=1:ppn=5,mem=12GB` will be assigned 5 cores but have access to only 12 GB of RAM, and will be charged for 5 cores' worth of usage.
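A sketch of how the first request above might appear in a batch script (job name and executable are placeholders):

```bash
#!/bin/bash
#PBS -N mem_example               # hypothetical job name
#PBS -l nodes=1:ppn=1,mem=12GB    # 1 core, 12 GB of RAM
#PBS -l walltime=01:00:00
# Charged for 3 cores' worth of usage, since covering 12 GB
# takes three 4315 MB per-core memory shares.

cd $PBS_O_WORKDIR
./my_program                      # placeholder executable
```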
A multi-node job (`nodes > 1`) will be assigned whole nodes with 118 GB/node and charged for the entire nodes regardless of the `ppn` request. For example, a job that requests `nodes=10:ppn=1` will be charged for 10 whole nodes (28 cores/node * 10 nodes = 280 cores' worth of usage).
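Because a multi-node job is charged for whole nodes either way, it is clearest to request all 28 cores per node explicitly; a minimal sketch (job name and MPI executable are placeholders):

```bash
#!/bin/bash
#PBS -N multinode_example     # hypothetical job name
#PBS -l nodes=10:ppn=28       # 10 whole nodes, 28 cores each
#PBS -l walltime=01:00:00
# Charged for 280 cores' worth of usage (28 cores/node * 10 nodes),
# the same as nodes=10:ppn=1 would be.

cd $PBS_O_WORKDIR
mpiexec ./my_mpi_program      # placeholder MPI executable
```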
Owens huge memory nodes have 1,493 GB of RAM and 48 cores per node. Please always specify a memory limit in your job request if your job requests less than a full node (`ppn < 48`); otherwise, only 4315 MB/core will be allocated for your job. For example, a job that requests `nodes=1:ppn=30,mem=960GB` will be assigned 30 cores and 960 GB of RAM on a huge memory node. However, without any memory request (`mem=XX`), a job that requests `nodes=1:ppn=30` will be assigned 30 cores of a huge memory node and should use no more than 30*4315 MB of RAM, while a job that requests `nodes=1:ppn=1,mem=960GB` will be assigned 1 core of a huge memory node and should use no more than 960 GB of RAM.
A job that requests a whole huge memory node (`nodes=1:ppn=48`) will be allocated the entire node with 1,493 GB of RAM and charged for the whole node (48 cores' worth of usage).
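A sketch of the partial huge memory node request discussed above (job name and executable are placeholders):

```bash
#!/bin/bash
#PBS -N hugemem_example             # hypothetical job name
#PBS -l nodes=1:ppn=30,mem=960GB    # partial huge memory node
#PBS -l walltime=01:00:00
# Without the mem=960GB request, this job would only be
# entitled to 30 * 4315 MB of RAM.

cd $PBS_O_WORKDIR
./my_bigmem_program                 # placeholder executable
```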
To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.
Here are the queues available on Owens:
| NAME | MAX WALLTIME | MAX JOB SIZE | NOTES |
|---|---|---|---|
| Serial | 168 hours | 1 node | |
| longserial | 336 hours | 1 node | |
| Parallel | 96 hours | 81 nodes | |
| Hugemem | 168 hours | 1 node | 16 nodes in this class |
| Parallel hugemem | 96 hours | 16 nodes | |
| Debug | 1 hour | 2 nodes | |
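Where a queue must be selected explicitly, the standard Torque `-q` directive can be used; a sketch for the Debug queue, assuming the names above are valid submission targets:

```bash
#!/bin/bash
#PBS -N debug_example       # hypothetical job name
#PBS -q debug               # assumed queue name from the table above
#PBS -l nodes=2:ppn=28      # Debug allows at most 2 nodes
#PBS -l walltime=00:30:00   # under the 1-hour Debug max walltime

cd $PBS_O_WORKDIR
mpiexec ./my_test_program   # placeholder executable
```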
There is only one GPU per GPU node on Owens.
For serial jobs, we allow node sharing on GPU nodes, so a job may request any number of cores up to 28 (`nodes=1:ppn=XX:gpus=1`). For parallel jobs (`nodes > 1`), we do not allow node sharing.
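A sketch of a serial GPU request under the sharing policy above (job name, core count, and executable are placeholders):

```bash
#!/bin/bash
#PBS -N gpu_example              # hypothetical job name
#PBS -l nodes=1:ppn=14:gpus=1    # serial job; GPU node may be shared
#PBS -l walltime=02:00:00

cd $PBS_O_WORKDIR
./my_gpu_program                 # placeholder GPU executable
```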
| | Max Running Job Limit | Soft Max Core/Processor Limit | Hard Max Core/Processor Limit |
|---|---|---|---|
| Individual User | 384 | 3080 | 3080 |
| Project/Group | 576 | 3080 | 4620 |
Which of these limits applies depends on system resource availability: if resources are scarce, the soft max limit is used to increase the fairness of allocating resources; if there are idle resources, the hard max limit is used to increase system utilization.
An individual user can have up to the listed maximum of concurrently running jobs and/or up to the listed maximum of processors/cores in use. Similarly, all users in a particular group/project combined can have up to the group's maximum of concurrently running jobs and/or processors/cores in use.
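To see how your current workload compares with these limits, the standard Torque `qstat` command can be used; the sketch below counts your running jobs, assuming the usual `qstat -u` column layout where the job state is the second-to-last field:

```bash
# Count running ('R') jobs for the current user against the
# 384-job individual limit; the awk field position assumes the
# standard qstat -u output format.
qstat -u $USER | awk '$(NF-1) == "R"' | wc -l
```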