Batch Limit Rules

Full Node Charging Policy

On Ruby, we always allocate whole nodes to jobs and charge for the whole node. If a job requests less than a full node (nodes=1:ppn<20), the job execution environment matches the request (the job only has access to the number of cores given in the ppn request) with 64GB of RAM; however, the job is still allocated a whole node and charged for the whole node. A job that requests nodes>1 is assigned entire nodes with 64GB per node and charged for the entire nodes regardless of the ppn request. A job that requests a huge-memory node (nodes=1:ppn=32) is allocated the entire huge-memory node with 1TB of RAM and charged for the whole node (32 cores' worth of RU).
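
The charging rule above reduces to simple arithmetic on core counts. The following is a minimal sketch, assuming 20 cores per standard node (implied by the ppn<20 condition) and 32 cores per huge-memory node. The function name charged_cores and the hugemem flag are illustrative only and are not part of any OSC tool. It returns the number of cores billed rather than RUs, since the RU conversion rate is not stated here.

```python
STANDARD_CORES = 20  # cores per standard Ruby node (assumed from the ppn<20 rule)
HUGEMEM_CORES = 32   # cores per huge-memory node


def charged_cores(nodes: int, ppn: int, hugemem: bool = False) -> int:
    """Return the number of cores a job is charged for under whole-node charging."""
    if nodes < 1 or ppn < 1:
        raise ValueError("nodes and ppn must be positive")
    cores_per_node = HUGEMEM_CORES if hugemem else STANDARD_CORES
    if ppn > cores_per_node:
        raise ValueError("ppn exceeds the cores available on a node")
    # ppn does not reduce the charge: every allocated node is billed in full.
    return nodes * cores_per_node


print(charged_cores(nodes=1, ppn=4))                 # 20
print(charged_cores(nodes=3, ppn=10))                # 60
print(charged_cores(nodes=1, ppn=32, hugemem=True))  # 32
```

In this sketch a job requesting nodes=1:ppn=4 is billed the same as one requesting nodes=1:ppn=20, which is the practical consequence of the policy.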

To manage and monitor your memory usage, please refer to Out-of-Memory (OOM) or Excessive Memory Usage.

Queue Default

Please keep in mind that if you submit a job with no node specification, the default is nodes=1:ppn=20, while if you submit a job with no ppn specified, the default is nodes=N:ppn=1.
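
To make the two defaults concrete, here is a small sketch of how a request might be resolved. The function resolve_request is hypothetical and only mirrors the rule stated above; it is not how the scheduler is implemented.

```python
from typing import Optional, Tuple


def resolve_request(nodes: Optional[int] = None,
                    ppn: Optional[int] = None) -> Tuple[int, int]:
    """Return the effective (nodes, ppn) after applying the queue defaults."""
    if nodes is None:
        if ppn is not None:
            # ppn is written as part of the nodes specification,
            # so it cannot be given without one.
            raise ValueError("ppn requires a nodes specification")
        # No node specification at all: one full node.
        return 1, 20
    if ppn is None:
        # nodes=N given without ppn: one core per node.
        return nodes, 1
    return nodes, ppn


print(resolve_request())          # (1, 20)
print(resolve_request(nodes=4))   # (4, 1)
```

Because whole nodes are charged regardless of ppn, a request such as nodes=4 with the default ppn=1 still occupies four full nodes while exposing only one core on each.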

Debug Node

Ruby has 4 debug nodes that are specifically configured for short (< 1 hour) debugging work. These nodes have a walltime limit of 1 hour. They consist of 2 non-GPU nodes and 2 GPU nodes (with 2 GPUs per node) and are equipped with E5-2670 V1 CPUs with 16 cores per node.

  • To schedule a non-GPU debug node: nodes=1:ppn=16 -q debug
  • To schedule two non-GPU debug nodes: nodes=2:ppn=16 -q debug
  • To schedule a GPU debug node: nodes=1:ppn=16:gpus=2 -q debug
  • To schedule two GPU debug nodes: nodes=2:ppn=16:gpus=2 -q debug
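
The four request forms listed above follow one pattern, which can be sketched as a small helper. The function debug_request is a hypothetical convenience for illustration; it only assembles the strings shown in the list and does not submit anything.

```python
def debug_request(nodes: int, gpu: bool = False) -> str:
    """Build the resource request and queue option for a debug job."""
    if nodes not in (1, 2):
        # Only 2 debug nodes of each kind exist.
        raise ValueError("debug jobs can use 1 or 2 nodes")
    spec = f"nodes={nodes}:ppn=16"
    if gpu:
        # Each GPU debug node carries 2 GPUs.
        spec += ":gpus=2"
    return f"{spec} -q debug"


print(debug_request(1))             # nodes=1:ppn=16 -q debug
print(debug_request(2, gpu=True))   # nodes=2:ppn=16:gpus=2 -q debug
```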

GPU Node

On Ruby, 20 nodes are equipped with NVIDIA Tesla K40 GPUs (one GPU per node). These nodes can be requested by adding gpus=1 to your nodes request (nodes=1:ppn=20:gpus=1).

Walltime Limit

Here are the queues available on Ruby:

NAME      MAX WALLTIME  MAX JOB SIZE                     NOTES
Serial    168 hours     1 node
Parallel  96 hours      40 nodes
Hugemem   48 hours      1 node                           32 cores with 1 TB RAM
Debug     1 hour        2 nodes (either GPU or non-GPU)  16 cores with 128GB RAM
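
The queue limits above can be checked before submitting a job. The following is a minimal sketch that encodes the listed walltime and job-size limits; the dictionary QUEUE_LIMITS and the function fits_queue are illustrative names, not part of the batch system.

```python
# Limits copied from the queue listing: maximum walltime (hours) and job size (nodes).
QUEUE_LIMITS = {
    "serial":   {"max_walltime_hours": 168, "max_nodes": 1},
    "parallel": {"max_walltime_hours": 96,  "max_nodes": 40},
    "hugemem":  {"max_walltime_hours": 48,  "max_nodes": 1},
    "debug":    {"max_walltime_hours": 1,   "max_nodes": 2},
}


def fits_queue(queue: str, nodes: int, walltime_hours: float) -> bool:
    """Return True if the request is within the named queue's limits."""
    limits = QUEUE_LIMITS[queue.lower()]
    return (nodes <= limits["max_nodes"]
            and walltime_hours <= limits["max_walltime_hours"])


print(fits_queue("Parallel", nodes=40, walltime_hours=96))  # True
print(fits_queue("Serial", nodes=2, walltime_hours=10))     # False
```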

Job Limit

An individual user can have up to 40 concurrently running jobs and/or up to 800 processors/cores in use. All the users in a particular group/project can together have up to 80 concurrently running jobs and/or up to 1600 processors/cores in use if the system is busy. The debug queue is limited to 1 job at a time per user. Condo users should contact OSC Help for more instructions.
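
As a rough illustration of how these limits interact, the sketch below checks whether one more job would stay within both the per-user and per-group caps. It assumes a busy system (so the group caps apply), treats the job-count and core limits as both binding, and ignores the debug queue and Condo rules. The function can_start and its parameters are hypothetical, and the actual scheduler may count cores differently.

```python
USER_MAX_JOBS = 40
USER_MAX_CORES = 800
GROUP_MAX_JOBS = 80
GROUP_MAX_CORES = 1600


def can_start(user_jobs: int, user_cores: int,
              group_jobs: int, group_cores: int,
              new_cores: int) -> bool:
    """Return True if a new job using new_cores stays within user and group limits."""
    within_user = (user_jobs + 1 <= USER_MAX_JOBS
                   and user_cores + new_cores <= USER_MAX_CORES)
    within_group = (group_jobs + 1 <= GROUP_MAX_JOBS
                    and group_cores + new_cores <= GROUP_MAX_CORES)
    return within_user and within_group


# A user running 39 jobs on 700 cores can start one more 100-core job...
print(can_start(39, 700, 50, 1000, new_cores=100))  # True
# ...but not a 120-core job, which would exceed the 800-core user cap.
print(can_start(39, 700, 50, 1000, new_cores=120))  # False
```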