Slurm Migration Issues

This page documents known issues encountered when migrating jobs from Torque to Slurm.

$PBS_NODEFILE and $SLURM_JOB_NODELIST

Please be aware that $PBS_NODEFILE is a file, while $SLURM_JOB_NODELIST is an environment variable whose value is a compact node-range string (e.g., p[0511-0512]).

The Slurm analog of cat $PBS_NODEFILE is srun hostname | sort -n
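If a tool expects an actual hostfile like $PBS_NODEFILE, you can generate one inside the job. A minimal sketch (nodefile.txt is an arbitrary name); note that scontrol lists each allocated node once, while srun hostname prints one line per task:

# one line per allocated node
scontrol show hostnames "$SLURM_JOB_NODELIST" > nodefile.txt
# or one line per task, matching cat $PBS_NODEFILE
srun hostname | sort -n > nodefile.txt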

Environment variables can't be used in job script directives

A job script job.txt containing the directive #SBATCH --output=$HOME/jobtest.out won't work in Slurm, because Slurm does not expand environment variables in #SBATCH directives. Pass the option on the sbatch command line instead, where your shell expands $HOME before Slurm sees it:

sbatch --output=$HOME/jobtest.out job.txt 
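If you only need per-job file names, Slurm's own filename patterns do work inside directives, because Slurm itself expands them; %j (job ID) and %x (job name) are examples. A minimal sketch of job.txt using that approach:

#!/bin/bash
#SBATCH --job-name=jobtest
# %j below is expanded by Slurm itself (to the job ID), so it is safe in a directive
#SBATCH --output=jobtest_%j.out
hostname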

Using mpiexec with Intel MPI

Intel MPI on Pitzer is configured to support the PMI and Hydra process managers. We recommend using srun as the MPI program launcher. If you prefer to use mpiexec on Pitzer, you might experience an MPI initialization error or see the message:

MPI startup(): Warning: I_MPI_PMI_LIBRARY will be ignored since the hydra process manager was found

Please run unset I_MPI_PMI_LIBRARY in the job before running MPI programs to resolve the issue.
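A minimal job script sketch of that fix (the module name and program name are placeholders, not the exact versions installed):

#!/bin/bash
#SBATCH --ntasks-per-node=4
module load intelmpi      # placeholder; load your Intel MPI environment here
# drop the Slurm PMI hook so mpiexec's Hydra process manager runs cleanly
unset I_MPI_PMI_LIBRARY
mpiexec ./my_mpi_prog     # hypothetical MPI executable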

Using --ntasks-per-node and --mem options together

Jobs that combine --ntasks-per-node and --mem currently hit a bug: if --mem divided by MaxMemPerCPU is greater than --ntasks-per-node, the job is not considered schedulable on any partition where the MaxMemPerCPU limit applies. MaxMemPerCPU is set on every partition to the usable memory of a given node type divided by its core count. One symptom we have observed: jobs requesting 1 GPU with --ntasks-per-node=4 and --mem=32G incorrectly run only on the quad-GPU nodes, even when other GPU nodes are available.
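As a hypothetical illustration (the MaxMemPerCPU value is assumed for the arithmetic, not quoted from any partition): if a GPU partition has MaxMemPerCPU = 4G, then --mem=32G translates to 32G / 4G = 8 CPUs' worth of memory. Since 8 is greater than --ntasks-per-node=4, the scheduler wrongly rules those nodes out, and only nodes with a larger MaxMemPerCPU (such as the quad-GPU nodes) remain eligible.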

Executables with an MPI library using the Slurm PMI2 interface

For example, stopping mpi4py Python processes during an interactive job session while staying on a login node:

$ salloc -t 15:00 --ntasks-per-node=4
salloc: Pending job allocation 20822
salloc: job 20822 queued and waiting for resources
salloc: job 20822 has been allocated resources
salloc: Granted job allocation 20822
salloc: Waiting for resource configuration
salloc: Nodes p0511 are ready for job
# don't log in to one of the allocated nodes; stay on the login node
$ module load python/3.7-2019.10
$ source activate testing
(testing) $ srun --quit-on-interrupt python mpi4py-test.py
# enter <ctrl-c>
^Csrun: sending Ctrl-C to job 20822.5
Hello World (from process 0)
process 0 is sleeping...
Hello World (from process 2)
process 2 is sleeping...
Hello World (from process 3)
process 3 is sleeping...
Hello World (from process 1)
process 1 is sleeping...
Traceback (most recent call last):
  File "mpi4py-test.py", line 16, in <module>
    time.sleep(15)
KeyboardInterrupt
Traceback (most recent call last):
  File "mpi4py-test.py", line 16, in <module>
    time.sleep(15)
KeyboardInterrupt
Traceback (most recent call last):
  File "mpi4py-test.py", line 16, in <module>
    time.sleep(15)
KeyboardInterrupt
Traceback (most recent call last):
  File "mpi4py-test.py", line 16, in <module>
    time.sleep(15)
KeyboardInterrupt
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** STEP 20822.5 ON p0511 CANCELLED AT 2020-09-04T10:13:44 ***
# still in the job and able to restart the processes
(testing)
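For reference, a minimal sketch of what the mpi4py-test.py script above might contain, reconstructed from its output (the actual test script may differ):

from mpi4py import MPI
import time

rank = MPI.COMM_WORLD.Get_rank()

print("Hello World (from process %d)" % rank)
print("process %d is sleeping..." % rank)
time.sleep(15)  # the <ctrl-c> above interrupts this sleep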

pbsdcp with Slurm

pbsdcp works correctly with Slurm, but wildcards must be passed without quotation marks. In Torque/Moab you could quote the wildcard, for example:

pbsdcp -g '*' {dest_dir} 

With Slurm, omit the quotation marks:

pbsdcp -g * {dest_dir} 

If you prefer, you can use sbcast and/or sgather instead of pbsdcp.
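For reference, a minimal sketch of the sbcast/sgather equivalents run from inside a job script (the file names and $TMPDIR as the node-local directory are placeholders):

# copy one file from the shared filesystem to every allocated node
sbcast input.dat $TMPDIR/input.dat
# collect a per-node file back; sgather appends each source hostname to the destination name
sgather $TMPDIR/output.dat output.dat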
