Changes to Default Memory Limits

Problem Description

Our current GPFS file system is a distributed service with significant interaction between its clients. Because the compute nodes are GPFS file system clients, a certain amount of memory on each node must be reserved for these interactions. As a result, the maximum physical memory on each node available to users' jobs is reduced in order to keep the file system performing well. In addition, using swap memory is no longer allowed.

The tables below summarize the maximum physical memory allowed for each type of node on our systems:

Ruby Cluster

NODE TYPE          PHYSICAL MEMORY PER NODE   MAXIMUM MEMORY ALLOWED PER NODE
Regular node       64GB                       61GB
Debug node         128GB                      124GB
Huge memory node   1024GB (1TB)               1008GB

Owens Cluster

NODE TYPE          PHYSICAL MEMORY PER NODE   MAXIMUM MEMORY ALLOWED PER NODE
Regular node       128GB                      124GB
Huge memory node   1536GB                     1510GB

Pitzer Cluster

NODE TYPE          PHYSICAL MEMORY PER NODE   MAXIMUM MEMORY ALLOWED PER NODE
Regular node       192GB                      183GB
GPU node           384GB                      374GB
Huge memory node   3TB                        3019GB

Solutions When You Need Regular Nodes

Starting October 27, 2016, we will implement a new scheduling policy on all of our clusters to reflect the reduced default memory limits.

If you do not request memory explicitly in your job (no -l mem request)

Your job can be submitted and scheduled as before, and resources will be allocated according to your request for cores/nodes (nodes=XX:ppn=XX). If you request a partial node, the memory allocated to your job is proportional to the number of cores requested (4GB/core on Owens, 4761MB/core on Pitzer); if you request the whole node, the memory allocated to your job is reduced to the maximum memory allowed per node, as summarized in the tables above. Some examples are provided below.

A request of a partial node:

On Ruby, we always allocate whole nodes to jobs and charge for the whole node, with 61GB of memory allocated to your job.

On Owens, a request of nodes=1:ppn=1 will be allocated 4GB of memory and charged for 1 core. A request of nodes=1:ppn=4 will be allocated 16GB of memory and charged for 4 cores.

On Pitzer, a request of nodes=1:ppn=1 will be allocated 4761MB of memory and charged for 1 core. A request of nodes=1:ppn=4 will be allocated 19044MB of memory and charged for 4 cores.
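
For reference, here is a minimal sketch of a batch script that relies on these defaults on Owens; the job name, walltime, and executable are placeholders, not part of the policy:

    #!/bin/bash
    ## Hypothetical job name and walltime; adjust for your own work.
    #PBS -N no_mem_example
    #PBS -l walltime=1:00:00
    ## No -l mem request: on Owens, ppn=4 is allocated 16GB and charged for 4 cores.
    #PBS -l nodes=1:ppn=4

    cd $PBS_O_WORKDIR
    ./my_program   # placeholder for your executable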

 

A request of the whole node:

A request of a whole regular node will be allocated the maximum memory allowed per node and charged for the whole node, as summarized below:

CLUSTER   REQUEST          MEMORY ALLOCATED   CHARGED FOR
Ruby      nodes=1:ppn=20   61GB               20 cores
Owens     nodes=1:ppn=28   124GB              28 cores
Pitzer    nodes=1:ppn=40   183GB              40 cores
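
As a sketch, the Pitzer row above corresponds to directives like the following in a batch script (the walltime is a placeholder):

    ## Whole Pitzer node without an explicit memory request:
    ## allocated 183GB and charged for 40 cores.
    #PBS -l walltime=1:00:00
    #PBS -l nodes=1:ppn=40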

A request of multiple nodes:

If you have a multi-node job (nodes>1), your job will be assigned whole nodes with the maximum memory allowed per node (61GB for Ruby, 124GB for Owens, and 183GB for Pitzer) and charged for the whole nodes regardless of the ppn request.
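
For example, a sketch of a two-node request on Owens (the walltime is a placeholder):

    ## Two whole Owens nodes: 124GB allocated on each node,
    ## charged for 2 x 28 = 56 cores regardless of the ppn value.
    #PBS -l walltime=1:00:00
    #PBS -l nodes=2:ppn=28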

If you do request memory explicitly in your job (with an -l mem request)

If you request memory explicitly in your script, please revisit your script according to the following information.

A request of a partial node:

On Owens, a request of nodes=1:ppn=1,mem=4gb will be allocated 4GB of memory and charged for 1 core.

On Ruby, we always allocate whole nodes to jobs and charge for the whole node, with 61GB of memory allocated to your job.

On Pitzer, a job that requests nodes=1:ppn=1,mem=14283mb will be assigned one core, have access to 14283MB of RAM, and be charged for 3 cores' worth of Resource Units (RU), since 14283MB corresponds to 3 x 4761MB, i.e. three cores' worth of memory. However, a job that requests nodes=1:ppn=5,mem=14283mb will be assigned 5 cores but have access to only 14283MB of RAM, and be charged for 5 cores' worth of Resource Units (RU).
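
As a sketch of the first Pitzer case above (the walltime is a placeholder):

    ## 1 core with an explicit memory request on Pitzer:
    ## assigned 1 core and 14283MB of RAM (3 x 4761MB),
    ## and charged for 3 cores' worth of RUs.
    #PBS -l walltime=1:00:00
    #PBS -l nodes=1:ppn=1,mem=14283mb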

A request of the whole node:

On Ruby, the maximum value you can use for -l mem is 61gb, i.e. -l mem=61gb. A request of nodes=1:ppn=20,mem=61gb will be allocated 61GB of memory and charged for the whole node. If you need more than 61GB of memory for the job, please submit your job to the huge memory nodes on Ruby, or switch to the Owens cluster. Any request with mem>61gb will not be scheduled.

On Owens, the maximum value you can use for -l mem is 125gb, i.e. -l mem=125gb. A request of nodes=1:ppn=28,mem=124gb will be allocated 124GB of memory and charged for the whole node. If you need more than 124GB of memory for the job, please submit your job to the huge memory nodes, or switch to the Pitzer cluster. Any request with mem>125gb will not be scheduled.

On Pitzer, the maximum value you can use for -l mem is 183gb, i.e. -l mem=183gb. A request of nodes=1:ppn=40,mem=183gb will be allocated 183GB of memory and charged for the whole node. If you need more than 183GB of memory for the job, please submit your job to the huge memory nodes on Owens or Pitzer. Any request with mem>183gb may be rescheduled onto a huge memory node on Pitzer, or may not be scheduled at all, depending on what you put in the request.
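
As a sketch of the Ruby whole-node case above (the walltime is a placeholder):

    ## Whole Ruby node with the maximum allowed explicit memory request:
    ## allocated 61GB and charged for the whole node (20 cores).
    #PBS -l walltime=1:00:00
    #PBS -l nodes=1:ppn=20,mem=61gb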

A request of multiple nodes:

If you have a multi-node job (nodes>1), your job will be assigned whole nodes with the maximum memory allowed per node (61GB for Ruby, 124GB for Owens, and 183GB for Pitzer) and charged for the whole nodes.

Solutions When You Need Special Nodes

If you need any special resources, it is highly recommended that you do not include an explicit memory request (except where shown) and simply follow the syntax below.

Ruby Cluster:

NODE TYPE          HOW TO REQUEST            MEMORY ALLOCATED   CHARGED FOR
Debug node         nodes=1:ppn=16 -q debug   124GB              16 cores
Huge memory node   nodes=1:ppn=32            1008GB             32 cores

Owens Cluster:

NODE TYPE          HOW TO REQUEST            MEMORY ALLOCATED   CHARGED FOR
Huge memory node   nodes=1:ppn=48            1510GB             48 cores

Pitzer Cluster:

NODE TYPE          HOW TO REQUEST              MEMORY ALLOCATED   CHARGED FOR
Huge memory node   nodes=1:ppn=80,mem=3000gb   3019GB             80 cores
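
Put together, sketches of the corresponding directive lines are shown below; the walltimes are placeholders, and the debug queue is assumed to be selected with a separate #PBS -q directive:

    ## Ruby debug node: 124GB allocated, charged for 16 cores.
    #PBS -l walltime=0:30:00
    #PBS -l nodes=1:ppn=16
    #PBS -q debug

    ## Owens huge memory node: 1510GB allocated, charged for 48 cores.
    #PBS -l nodes=1:ppn=48

    ## Pitzer huge memory node: 3019GB allocated, charged for 80 cores.
    #PBS -l nodes=1:ppn=80,mem=3000gb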

 
