Job scripts are submitted to the batch system using the
sbatch command. Be sure to submit your job on the system you want it to run on, or use the
--cluster=<system> option to specify one.
Standard batch job
Most jobs on our system are submitted as scripts with no command-line options. If your script is in a file named, for example, myjob.sh, submit it with:
sbatch myjob.sh
In response to this command you’ll see a line with your job ID:
Submitted batch job 123456
You’ll use this job ID (numeric part only) to monitor your job. You can find it again using the command:
squeue -u <username>
When you submit a job, the script is copied by the batch system. Any changes you make subsequently to the script file will not affect the job. Your input files and executables, on the other hand, are not picked up until the job starts running.
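For reference, a minimal job script of the kind described above might look like the following sketch; the account, resource values, and executable name are placeholders, not system defaults:

```shell
#!/bin/bash
#SBATCH --account=<proj-code>      # project code to charge (placeholder)
#SBATCH --nodes=1                  # one node
#SBATCH --ntasks-per-node=1       # one task
#SBATCH --time=0:30:00             # 30-minute walltime limit

# Commands below run on the compute node when the job starts
cd $SLURM_SUBMIT_DIR
./my_program                       # hypothetical executable name
```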
The batch system supports an interactive batch mode. This mode is useful for debugging parallel programs or running a GUI program that’s too large for the login node. The resource limits (memory, CPU) for an interactive batch job are the same as the standard batch limits.
Interactive batch jobs are generally invoked without a script file.
Custom sinteractive command
OSC has developed a script to make starting an interactive session simpler.
The sinteractive command takes simple options and starts an interactive batch session automatically. However, its behavior can be counterintuitive with respect to numbers of tasks and CPUs. In addition, jobs launched with sinteractive can show environmental differences compared to jobs launched via other means. As an alternative, try, e.g.:
salloc -A <proj-code> --time=500 /bin/bash
The example below demonstrates using sinteractive to start a serial interactive job:
sinteractive -A <proj-code>
If no resource options are specified, a single-core job is submitted by default.
Simple parallel (single node)
To request a simple parallel job of 4 cores on a single node:
sinteractive -A <proj-code> -c 4
To set up the environment for OpenMP executables, set OMP_NUM_THREADS to match the number of allocated cores.
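One common way to do this, assuming Slurm has set SLURM_CPUS_PER_TASK for the session, is:

```shell
# Slurm sets SLURM_CPUS_PER_TASK in a real session; it is assigned here
# only to make the example self-contained (4 matches the -c 4 request above).
SLURM_CPUS_PER_TASK=4

# Tell OpenMP to use exactly as many threads as cores were allocated
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
echo "$OMP_NUM_THREADS"    # prints 4
```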
Parallel (multiple nodes)
To request 2 whole nodes on Pitzer with a total of 96 cores between both nodes:
sinteractive -A <proj-code> -N 2 -n 96
Note, however, that the Slurm variables SLURM_CPUS_PER_TASK, SLURM_NTASKS, and SLURM_TASKS_PER_NODE are all set to 1, so subsequent srun commands that launch parallel executables must explicitly specify the desired task and CPU counts. Unless you really need to run in the debug queues, it is generally simpler to start with an appropriate salloc command.
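For example, to launch an MPI executable across the whole allocation from such a session, the task count must be given explicitly on the srun command line (mpi_program is a hypothetical executable name):

```shell
# SLURM_NTASKS is 1 in this session, so state the counts explicitly
srun --nodes=2 --ntasks=96 ./mpi_program
```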
Use sinteractive --help to view all the options available and their default values.
Using salloc and srun
An example of using salloc and srun:
salloc --account=pas1234 --x11 --nodes=2 --ntasks-per-node=28 --time=1:00:00
srun --pty /bin/bash
The salloc command requests the resources, then srun starts an interactive shell within the allocation. The
--x11 flag enables X11 forwarding, which is necessary for running a GUI. You will need an X11 server running on your computer to use X11 forwarding; see the getting connected page. The remaining flags in this example are resource requests with the same meaning as the corresponding header lines in a batch file.
After you enter this line, you’ll see something like the following:
salloc: Pending job allocation 123456
salloc: job 123456 queued and waiting for resources
Your job will be queued just like any job. When the job runs, you’ll see lines like the following:
salloc: job 123456 has been allocated resources
salloc: Granted job allocation 123456
salloc: Waiting for resource configuration
salloc: Nodes o0001 are ready for job
At this point, you have an interactive login shell on one of the compute nodes, which you can treat like any other login shell.
It is important to remember that OSC systems are optimized for batch processing, not interactive computing. If the system load is high, your job may wait for hours in the queue, making interactive batch impractical. Requesting a walltime limit of one hour or less is recommended because your job can run on nodes reserved for debugging.
If you submit many similar jobs at the same time, you should consider using a job array. With a single
sbatch command, you can submit multiple jobs that will use the same script. Each job has a unique identifier,
$SLURM_ARRAY_TASK_ID, which can be used to parameterize its behavior.
Individual jobs in a job array are scheduled independently, but some job management tasks can be performed on the entire array.
To submit an array of jobs numbered from 1 to 100, all using the script sim.job, use the command:
sbatch --array=1-100 sim.job
The script would use the environment variable
$SLURM_ARRAY_TASK_ID, possibly as an input argument to an application or as part of a file name.
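A sketch of such a script follows; the account value, application name, and file-naming scheme are all placeholders:

```shell
#!/bin/bash
#SBATCH --account=<proj-code>
#SBATCH --time=0:30:00

# Each array task receives its own index in SLURM_ARRAY_TASK_ID;
# here it selects a per-task input file (hypothetical naming scheme)
./sim_app input_${SLURM_ARRAY_TASK_ID}.dat > output_${SLURM_ARRAY_TASK_ID}.log
```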
It is possible to set conditions on when a job can start. The most common of these is a dependency relationship between jobs.
For example, to ensure that the job being submitted (with script
sim.job) does not start until after job 123456 has finished:
sbatch --dependency=afterany:123456 sim.job
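If the first job’s ID is not known in advance, one common pattern is to capture it at submission time with the --parsable flag; a sketch, assuming a preliminary script pre.job exists:

```shell
# --parsable makes sbatch print only the job ID, which we capture
jobid=$(sbatch --parsable pre.job)

# afterany: start once the first job finishes, regardless of its exit status
sbatch --dependency=afterany:${jobid} sim.job
```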
It is possible to provide a list of environment variables that are exported to the job.
For example, to pass the variable var with the value value to the job with the script
sim.job, use the command:
sbatch --export=var=value sim.job
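Inside sim.job, the variable then appears as an ordinary environment variable; a sketch, using the placeholder names from the example above:

```shell
#!/bin/bash
#SBATCH --account=<proj-code>

# var was passed in on the sbatch command line via --export
echo "Running with var=${var}"
```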
Many other options are available, some quite complicated; for more information, see the
sbatch online manual by using the command:
man sbatch