Platform R: Slurm Introduction
The Platform R high performance computing cluster is made up of compute nodes which have a certain amount of memory, CPUs, and sometimes GPUs. To use these resources, a user submits jobs to Slurm that describe the application they want to run, along with a definition of all the computing resources they need to run that application successfully.
Running Batch Jobs
For the majority of users a Slurm batch job will be started with a shell script which does three things:
- Describes the resources required to run the job in the Slurm header, including memory, CPUs, GPUs, and time.
- Optionally sets up the environment in which the jobs will run, such as activating a conda or spack environment.
- Specifies the work to be carried out with shell commands.
Here is a very simple but complete sbatch script. The different sections are described below.
```
#!/bin/bash
#SBATCH --job-name=mycalc
#SBATCH --time=10:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G
#SBATCH --output=mycalc-%j.out

# Set up program environment.
conda activate mycalc

# Run the program.
srun python3 calc.py
```
Slurm Header
A Slurm batch file looks like an ordinary Unix shell script. It must start with the usual shebang line, #!/bin/bash (other shells can be used). If you want the batch script to load your entire login shell environment, add the -l option to the end of the shebang line (#!/bin/bash -l). Most of the time it is easier to work in a clean environment and use just #!/bin/bash with no -l.
After the shebang line come the #SBATCH directives. These must appear at the start of a line. Blank lines are permitted between #SBATCH lines, but once any other sort of shell command occurs, #SBATCH processing stops.
No command in the batch script (including srun) can use more resources than what is defined by the sbatch options.
Allocation Options
There are many options for Slurm jobs. Below are a few of the most commonly used. There are long and short versions of most options.
| Option | Meaning |
|---|---|
| -N / --nodes | number of nodes to use |
| -n / --ntasks | number of tasks to run |
| -p / --partition | partition to use; the default is "shared" |
| -t / --time | execution time limit; the Platform R default is one day |
| -J / --job-name | name of the job as it will appear in squeue output; the default is the name of the batch script |
| -c / --cpus-per-task | how many CPUs each task needs; the default is one |
| --mem-per-cpu | how much memory is needed for each allocated CPU |
Times for the --time option are specified in the format days-hours:minutes:seconds. You can omit the larger units, so that --time=20:00 can be used for a time limit of 20 minutes.
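Putting the format together, each of the following is a valid time limit (the values are only illustrative; a job uses just one --time directive):

```shell
#SBATCH --time=30          # a bare number is minutes: 30 minutes
#SBATCH --time=20:00       # minutes:seconds, so 20 minutes
#SBATCH --time=12:00:00    # hours:minutes:seconds, so 12 hours
#SBATCH --time=2-00:00:00  # days-hours:minutes:seconds, so 2 days
```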
Batch Job Output
Unless you specify otherwise, any output and error messages from your Slurm job will go into a file called slurm-<JobID>.out, in the same directory you submitted the job from. This captures normal output and error messages ("standard out" and "standard error" in POSIX terminology). You can separate these with batch options.
| Option | Meaning |
|---|---|
| -o / --output | name of the file for standard output |
| -e / --error | name of the file for standard error |
| -i / --input | name of the file for standard input |
The file names can have format characters which put job information in the file name. Each of these starts with a percent character, %.
| Format Character | Meaning |
|---|---|
| %u | user name |
| %x | job name set with the --job-name batch option |
| %j | JobID |
| %s | StepID |
| %J | JobID.StepID (something like "876343.2") |
Using the example at the top, --output=mycalc-%j.out will generate an output file resembling mycalc-289743.out.
See the Slurm documentation on filename patterns for the full list of format characters.
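Format characters can also be combined. As a sketch, with the directives below a job named mycalc would write its output to a file like mycalc-289743.out and its errors to a matching .err file (the JobID shown is only an example):

```shell
#SBATCH --job-name=mycalc
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
```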
Set up Environment
After the #SBATCH options are any shell commands needed to set up the environment for your jobs to run. Conda or spack activation should happen here, as well as setting any environment variables.
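A minimal setup section might look like the following sketch; the environment name myenv and the variable MYAPP_THREADS are placeholders for whatever your own application needs:

```shell
# Activate the conda environment the program needs (myenv is a placeholder).
conda activate myenv

# Variables exported here are visible to every command in the job,
# including tasks launched with srun. SLURM_CPUS_PER_TASK is set by
# Slurm when --cpus-per-task is given.
export MYAPP_THREADS="$SLURM_CPUS_PER_TASK"
```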
Specify Work
You do not have to start everything with srun, but rescheduling of preempted jobs works better if you do; see the Srun section below for details.
Run the Batch Job
Slurm batch scripts do not have any naming requirements, but ending them with a .batch extension can be useful for project organization.
To submit your Slurm batch script, use:

```
sbatch myjob.batch
```

When the job is successfully submitted, sbatch prints: Submitted batch job <JobID>. You can use the JobID to get information about the status of the job. The command squeue --me will list your currently submitted jobs, and you can also get the JobID from that.
You can put #SBATCH options on the command line, which override the #SBATCH lines in the batch file. For example, sbatch --cpus-per-task=2 myjob.batch would ask for two CPUs for the submission, overriding whatever is in the batch file.
Srun
The srun command has two modes of operation. First, it can be used to launch interactive processes on the terminal. Second, it is used to create job steps from a batch script.
The sbatch and srun commands share most resource allocation options.
From the Command Line
Using srun on the command line is best for interactive testing. Longer jobs that don't need to be attended should be run in batch mode.
The remote tasks started by srun from the command line can be interactive or not. For example, to simply run a python script on another node,
```
srun --ntasks=1 --time=1:00:00 ./compute.py
```
You can also use it to start an interactive command on another node. This command starts an interactive shell:
```
srun --ntasks=1 --time=1:00:00 --pty /bin/bash -l
```
It is important to specify a reasonable time for interactive srun sessions, so that the session does not sit idle for long stretches. Even when you are not actively using a session, it still holds allocated resources that others could be using.
From a Batch Script
Within an sbatch script, srun starts tasks, each of which runs on a single node.
It is not required to start your compute code within sbatch using srun. However, it is better to do so for several reasons:
- The Slurm scheduler can manage the work better; in particular, job preemption is handled more gracefully with srun.
- Information about your job's resource use is captured more accurately with srun, which makes it easier to optimize your batch resource requests.
- It makes parallel job start-up easier to manage.
The total number of tasks in a job step (the srun -n flag) cannot exceed the batch script's --ntasks value.
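As a sketch (the program names preprocess and compute are placeholders), a batch script that runs two job steps in sequence, each using all four requested tasks, could look like:

```shell
#!/bin/bash
#SBATCH --job-name=twostep
#SBATCH --ntasks=4
#SBATCH --time=30:00

# Step 0: prepare the input data, as four tasks.
srun --ntasks=4 ./preprocess

# Step 1: the main computation, also four tasks.
srun --ntasks=4 ./compute
```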
Observing Jobs
The state of submitted jobs is queried with the squeue command. By default it will print out information about all queued and running jobs. Some useful options are:
| Option | Meaning |
|---|---|
| --me | only report on jobs submitted by me |
| --state=PENDING | show jobs that have not yet started |
| --state=RUNNING | show jobs that are currently running |
| --long | report information without using abbreviations |
| --start | print an estimated start time for jobs that are not yet running |
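These options can be combined. For example, to list only your own pending jobs along with their estimated start times:

```shell
squeue --me --state=PENDING --start
```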
You can get comprehensive details about a single job with the command:

```
scontrol show jobid <JobID>
```
Canceling Jobs
The command scancel is used to cancel jobs.
| Command | Meaning |
|---|---|
| scancel <JobID> | cancels the job with the given JobID |
| scancel --me | cancels all jobs owned by me |
| scancel --me --state PENDING | cancels all of my jobs that have not yet started running; see Platform R: Debugging Slurm Jobs for other common job states |
| scancel --name <jobname> | cancels all jobs with the given name |
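The scancel options also compose. For example, assuming a series of jobs submitted with --job-name=test, this cancels only the ones that have not yet started:

```shell
scancel --me --name test --state PENDING
```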

