How to Prepare Slurm Job Scripts

Known Issue

Combining the --ntasks and --ntasks-per-node options in a job script can cause unexpected resource allocation and placement due to a bug in Slurm 23. OSC users are strongly encouraged to review their job scripts for jobs that request both --ntasks and --ntasks-per-node. Jobs should request either --ntasks or --ntasks-per-node, not both.
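
For example, a request for two nodes with 48 tasks each can be expressed with --nodes and --ntasks-per-node alone; the counts below are only illustrative:

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=48
# Do not also add an #SBATCH --ntasks line when --ntasks-per-node is used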

As a first step, you can submit your PBS batch script as you did before to see whether it works. If it does not, you can either follow this page for step-by-step instructions, or use the tables below to convert your PBS script to a Slurm script yourself. Once the job script is prepared, you can refer to this page to submit and manage your jobs.

Job Submission Options

Use Torque/Moab Slurm Equivalent
Script directive #PBS #SBATCH
Job name -N <name> --job-name=<name>
Project account -A <account> --account=<account>
Queue or partition -q queuename --partition=queuename
Wall time limit -l walltime=hh:mm:ss --time=hh:mm:ss
Node count -l nodes=N --nodes=N
Process count per node -l ppn=M --ntasks-per-node=M
Memory limit -l mem=Xgb --mem=Xgb (the unit is MB by default)
Request GPUs -l nodes=N:ppn=M:gpus=G --nodes=N --ntasks-per-node=M --gpus-per-node=G
Request GPUs in default mode -l nodes=N:ppn=M:gpus=G:default --nodes=N --ntasks-per-node=M --gpus-per-node=G --gpu_cmode=shared
Require pfsdir -l nodes=N:ppn=M:pfsdir --nodes=N --ntasks-per-node=M --gres=pfsdir
Require 'vis' -l nodes=N:ppn=M:gpus=G:vis --nodes=N --ntasks-per-node=M --gpus-per-node=G --gres=vis
Require special property -l nodes=N:ppn=M:property --nodes=N --ntasks-per-node=M --constraint=property
Job array -t <array indexes> --array=<indexes>
Standard output file -o <file path> --output=<file path>/<file name> (path must exist, and you must specify the name of the file)
Standard error file -e <file path> --error=<file path>/<file name> (path must exist, and you must specify the name of the file)
Job dependency -W depend=after:jobID[:jobID...] --dependency=after:jobID[:jobID...]
 -W depend=afterok:jobID[:jobID...] --dependency=afterok:jobID[:jobID...]
 -W depend=afternotok:jobID[:jobID...] --dependency=afternotok:jobID[:jobID...]
 -W depend=afterany:jobID[:jobID...] --dependency=afterany:jobID[:jobID...]
Request event notification -m <events> --mail-type=<events> (multiple mail-type requests may be specified in a comma-separated list, e.g. --mail-type=BEGIN,END,NONE,FAIL)
Email address -M <email address> --mail-user=<email address>
Software flag -l software=pkg1+1%pkg2+4 --licenses=pkg1@osc:1,pkg2@osc:4
Require reservation -l advres=rsvid --reservation=rsvid
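
Putting several of these options together, a minimal Slurm job script built from the directives above might look like the following sketch. The project account PAS1234, the resource counts, the email address, and my_program are placeholders for your own values:

#!/bin/bash
#SBATCH --job-name=example_job
#SBATCH --account=PAS1234
#SBATCH --time=01:00:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=48
#SBATCH --output=example_job.%j.out
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=user@example.com

srun ./my_program

The script is submitted with sbatch, and a dependent job can be chained afterwards with a command such as sbatch --dependency=afterok:<jobID> next_job.sh.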

Job Environment Variables

Info Torque/Moab Environment Variable Slurm Equivalent
Job ID $PBS_JOBID $SLURM_JOB_ID
Job name $PBS_JOBNAME $SLURM_JOB_NAME
Queue name $PBS_QUEUE $SLURM_JOB_PARTITION
Submit directory $PBS_O_WORKDIR $SLURM_SUBMIT_DIR
Node file cat $PBS_NODEFILE srun hostname | sort -n
Number of processes $PBS_NP $SLURM_NTASKS
Number of nodes allocated $PBS_NUM_NODES $SLURM_JOB_NUM_NODES
Number of processes per node $PBS_NUM_PPN $SLURM_TASKS_PER_NODE
Walltime $PBS_WALLTIME $SLURM_TIME_LIMIT
Job array ID $PBS_ARRAYID $SLURM_ARRAY_JOB_ID
Job array index $PBS_ARRAY_INDEX $SLURM_ARRAY_TASK_ID
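
As an illustration, a job script can log these variables to record where and how it ran; a brief sketch:

# Record basic job information in the job output
echo "Job $SLURM_JOB_ID ($SLURM_JOB_NAME) in partition $SLURM_JOB_PARTITION"
echo "$SLURM_JOB_NUM_NODES node(s), $SLURM_NTASKS task(s), submitted from $SLURM_SUBMIT_DIR"
# Replacement for 'cat $PBS_NODEFILE': list the allocated hosts
srun hostname | sort -n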

Environment Variables Specific to OSC

Environment variable Description
$TMPDIR Path to a node-specific temporary directory (/tmp) for a given job
$PFSDIR Path to the scratch storage; only present if --gres request includes pfsdir.
$SLURM_GPUS_ON_NODE Number of GPUs allocated to the job on each node (works with --exclusive jobs)
$SLURM_JOB_GRES The job's GRES request
$SLURM_JOB_CONSTRAINT The job's constraint request
$SLURM_TIME_LIMIT Job walltime in seconds
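
For example, a job can stage data into node-local storage through $TMPDIR and copy results back when it finishes. In the sketch below, input.dat, results.out, and my_program are placeholders; $PFSDIR can be used the same way in jobs that request --gres=pfsdir:

# Stage input to fast node-local storage, run there, then copy results back
cp input.dat $TMPDIR
cd $TMPDIR
./my_program input.dat > results.out
cp results.out $SLURM_SUBMIT_DIR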

Commands in a Batch Job

Use Torque/Moab Slurm Equivalent
Launch a parallel program inside a job mpiexec <args> srun <args>
Scatter a file to node-local file systems pbsdcp <file> <nodelocaldir> sbcast <src_file> <nodelocaldir>/<dest_file>

Note: sbcast does not have a recursive option, meaning you cannot use sbcast -r to scatter multiple files in a directory. Instead, you may use a loop similar to this:

cd <the directory that has the files>

# scatter every file in the current directory to node-local storage on each node
for FILE in *
do
    sbcast -p $FILE $TMPDIR/some_directory/$FILE
done
Gather node-local files to a shared file system pbsdcp -g <file> <shareddir> sgather <src_file> <shareddir>/<dest_file>
 sgather -r <src_dir> <shareddir>/<dest_dir>
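
Putting the scatter and gather steps together, a job that distributes input files to $TMPDIR on each node and collects per-node output afterwards might look like this sketch; the inputs and output directory names and my_program are placeholders:

# Scatter each input file to node-local storage on every allocated node
cd $SLURM_SUBMIT_DIR/inputs
for FILE in *
do
    sbcast -p $FILE $TMPDIR/$FILE
done

# Run the parallel program against the node-local copies
srun ./my_program

# Gather the per-node output directory back to the shared file system
sgather -r $TMPDIR/output $SLURM_SUBMIT_DIR/output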
