
Guidance on Requesting Resources on Pitzer

In late 2018, OSC installed 260 Intel® Xeon® 'Skylake' processor-based nodes as the original Pitzer cluster. In September 2020, OSC installed an additional 398 Intel® Xeon® 'Cascade Lake' processor-based nodes as part of the Pitzer Expansion cluster. This expansion makes Pitzer a heterogeneous cluster: the same job script, submitted repeatedly without the proper resource requests, may land on different types of CPUs and perform differently from run to run. This document provides general guidance on how to request resources on Pitzer given this heterogeneous nature. 

Step 1: Identify your job type

Jobs requesting standard compute node(s)
  • Dual Intel Xeon 6148 Skylake @ 2.4 GHz: 40 cores/node; 178 GB usable memory/node (4556 MB/core); no GPU
  • Dual Intel Xeon 8268 Cascade Lake @ 2.9 GHz: 48 cores/node; 178 GB usable memory/node (3797 MB/core); no GPU

Jobs requesting dual GPU node(s)
  • Dual Intel Xeon 6148 Skylake @ 2.4 GHz: 40 cores/node; 363 GB usable memory/node (9292 MB/core); 2 NVIDIA Volta V100 w/ 16 GB GPU memory
  • Dual Intel Xeon 8268 Cascade Lake @ 2.9 GHz: 48 cores/node; 363 GB usable memory/node (7744 MB/core); 2 NVIDIA Volta V100 w/ 32 GB GPU memory

Jobs requesting quad GPU node(s)
  • Dual Intel Xeon 8260 Cascade Lake @ 2.4 GHz: 48 cores/node; 744 GB usable memory/node (15872 MB/core); 4 NVIDIA Volta V100 w/ 32 GB GPU memory and NVLink

Jobs requesting large memory node(s)
  • Dual Intel Xeon 8268 Cascade Lake @ 2.9 GHz: 48 cores/node; 744 GB usable memory/node (15872 MB/core); no GPU

Jobs requesting huge memory node(s)
  • Quad-processor Intel Xeon 6148 Skylake @ 2.4 GHz: 80 cores/node; 2989 GB usable memory/node (38259 MB/core); no GPU

According to this table,

  • If your job requests standard compute node(s) or dual GPU node(s), it can land on different types of nodes, which may lead to different job performance. Follow the steps below to determine whether you would like to restrict your job to a certain type of node. 
  • If your job requests quad GPU node(s), large memory node(s), or huge memory node(s), please check the Pitzer batch limit rules on how to request these special types of resources properly. 
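When scripting job setup, it can help to encode the table's per-core memory limits for the standard compute node types. The following is an illustrative sketch of our own (the function name is ours; the numbers come directly from the table above, and the `40core`/`48core` labels match the constraints used in Step 2):

```shell
#!/bin/bash
# Look up usable memory per core (MB) for a standard compute node type.
# Values are taken from the table above; this helper is illustrative,
# not an OSC-provided tool.
mem_per_core_mb() {
  case "$1" in
    40core) echo 4556 ;;   # Dual Xeon 6148 Skylake, 40 cores/node
    48core) echo 3797 ;;   # Dual Xeon 8268 Cascade Lake, 48 cores/node
    *) echo "unknown node type: $1" >&2; return 1 ;;
  esac
}

# Example: total usable memory for a full 40-core Skylake node
echo "$(( $(mem_per_core_mb 40core) * 40 )) MB"   # prints "182240 MB"
```

Note that requesting more than the per-core value for each allocated core effectively asks for a larger share of the node, so staying within these limits avoids surprises in scheduling.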

Step 2: Perform test

In this step, you submit jobs requesting the same resources to the different types of nodes on Pitzer. Add one of the following Slurm directives to your job script (one constraint per job):

Request 40-core or 48-core nodes

#SBATCH --constraint=40core
#SBATCH --constraint=48core

Request a GPU with 16 GB or 32 GB of memory

#SBATCH --constraint=v100
#SBATCH --constraint=v100-32g --partition=gpuserial-48core

 

Once the script is ready, submit your jobs to Pitzer and wait until they complete. 
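Putting Step 2 together, a minimal test script for the 40-core variant might look like the following. The job name, walltime, and application command are placeholders to replace with your own; only the `--constraint` directives are taken from the text above:

```shell
#!/bin/bash
#SBATCH --job-name=constraint-test     # placeholder name
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --time=01:00:00                # adjust to your workload
#SBATCH --constraint=40core            # pin this test run to 40-core Skylake nodes
##SBATCH --constraint=48core           # ...or use this line instead for Cascade Lake

srun ./my_app                          # placeholder for your application
```

You would submit each variant separately (e.g. `sbatch test_40core.sh` and `sbatch test_48core.sh`) so the two runs can be compared in Step 3.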

Step 3: Compare the results

Once the jobs are completed, compare their performance in terms of core-hours, GPU-hours, walltime, etc. to determine how sensitive your job is to the node type. If, based on the testing, you would like to restrict your job to a certain type of node, add the corresponding #SBATCH --constraint directive to your production job scripts. The disadvantage is that you may have a longer queue wait time on the system. If you would like your jobs scheduled as quickly as possible and do not care which type of node they land on, do not include the constraint in the job request. 
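When comparing walltimes, note that Slurm accounting tools such as sacct report elapsed time in [DD-]HH:MM:SS format. A small helper of our own (not an OSC-provided tool) converts that format to seconds so two test runs can be compared numerically:

```shell
#!/bin/bash
# Convert a Slurm elapsed-time string ([DD-]HH:MM:SS) to seconds.
# Our own sketch for comparing test runs, not an OSC-provided tool.
elapsed_to_seconds() {
  local t="$1" days=0
  case "$t" in
    *-*) days=${t%%-*}; t=${t#*-} ;;   # split off optional day count
  esac
  local h=${t%%:*} rest=${t#*:}
  local m=${rest%%:*} s=${rest#*:}
  # 10# forces base-10 so leading zeros (e.g. "08") are not read as octal
  echo $(( (10#$days * 24 + 10#$h) * 3600 + 10#$m * 60 + 10#$s ))
}

# Example with made-up timings for the two node types:
skylake=$(elapsed_to_seconds "02:15:40")
cascade=$(elapsed_to_seconds "01:58:05")
echo "Skylake: ${skylake}s, Cascade Lake: ${cascade}s"
```

With both values in seconds, a simple ratio shows whether the performance difference is large enough to justify the longer queue wait of a constrained job.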
