Using the Great Lakes cluster and batch computing with SLURM
- Slides: Introduction to the Great Lakes cluster and batch computing with SLURM
- Slides: Advanced batch computing with SLURM on the Great Lakes cluster
- Slides: MPI profiling with Allinea MAP
- ARC-TS: Great Lakes overview
- ARC-TS: Great Lakes Cheat Sheet
- ARC-TS: SLURM user guide
- ARC-TS: Migrating from PBS-Torque to SLURM
- ARC-TS: Globus high-speed data transfer
- Kelly's example of using Snakemake on HPC
- Snakemake profile for SLURM
- conda on the cluster
Command | Action |
---|---|
`sbatch script.sh` | Submit `script.sh` as a job |
`squeue -j jobid` | Check job status by jobid |
`squeue -u uniqname` | Check job status by user's uniqname |
`scancel jobid` | Kill a job by jobid |
`my_usage uniqname` | List resource usage for user uniqname |
`sinfo` | Show node status by partition |
`scontrol show node node_name` | Show details for a node `node_name` |
`scontrol show job jobid` | Show details for a job by jobid |
`srun --pty --nodes=1 --cpus-per-task=4 --time=30:00 --account=training /bin/bash` | Run an interactive job |
`seff jobid` | Show total time and memory usage for a job (plus other things) |
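To tie these together, a typical submit-and-inspect sequence might look like the sketch below; the jobid `32965` is just a placeholder for whatever `sbatch` reports.

```bash
# Submit the script; sbatch replies with something like "Submitted batch job 32965"
sbatch script.sh

# Check the queue for your own jobs
squeue -u uniqname

# Inspect the job while it runs or after it finishes
scontrol show job 32965
seff 32965

# Cancel it if something went wrong
scancel 32965
```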
Note that everything on Great Lakes is on-demand. For memory, CPU, and GPU, you are charged for the resources you request, but only for the walltime your jobs actually use.

Take a look at the SLURM user guide from ARC-TS for a list of available options. Also see this guide for migrating your PBS-Torque scripts to SLURM.
These options go in your submission scripts (example). All lines with SLURM options start with `#SBATCH`. With the exception of the hashbang (`#!`), anything else starting with `#` is a comment.

More example files are in `/scratch/data/workshops/IntroGreatLakes/` on the beta login node (`beta-login.stage.arc-ts.umich.edu`).
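As a point of reference, a minimal submission script might look like the sketch below. The job name, account, partition, and resource values are placeholders; substitute the ones for your own allocation.

```bash
#!/bin/bash
#SBATCH --job-name=example        # name shown by squeue
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --partition=standard      # placeholder partition
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1g
#SBATCH --time=00:30:00           # walltime limit (HH:MM:SS)
#SBATCH --output=%x-%j.log        # %x = job name, %j = jobid

# Everything below the #SBATCH block is an ordinary shell script
echo "Running on $(hostname)"
```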
- Edit your R script, `Rbatch.R`, with your preferred text editor.
- Edit the submission script, `Rbatch.sh` (a sketch of what it might contain appears after this list).
- Load R and submit the job: `module load R`, then `sbatch Rbatch.sh`. It will tell you the jobid in a message: `Submitted batch job 32965`.
- Check on the status of your jobs: `squeue -u uniqname`
- When it finishes, take a look at the output from R: `less Rbatch.out`
- To troubleshoot problems, look at the SLURM log file: `less slurm-32965.out`, where `32965` is the jobid.
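Here is a sketch of what `Rbatch.sh` might contain, following the same pattern as the template above. The resource values are placeholders, and the R output is redirected to `Rbatch.out` so that the SLURM log (`slurm-<jobid>.out`) only holds scheduler and error messages.

```bash
#!/bin/bash
#SBATCH --job-name=Rbatch
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --time=00:10:00           # placeholder walltime
#SBATCH --mem=1g                  # placeholder memory request

# Run the R script non-interactively; its output goes to Rbatch.out
Rscript Rbatch.R > Rbatch.out
```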
The MATLAB script `arr.m` takes the array task id as input and works on only one task. The submission script `submit.sh` sets up the job array with three tasks and runs the MATLAB script once per task. To make a job array, use the sbatch directive `#SBATCH --array=1-3`. Edit the integers `1` and `3` to change the number of tasks in the array and the numbers they're assigned.

Submit the job with:
`module load matlab`
`sbatch submit.sh`
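Below is a sketch of what a job-array submission script like `submit.sh` might contain, assuming `arr.m` is written as a MATLAB function that takes the task index as its argument; the job name, account, and walltime are placeholders.

```bash
#!/bin/bash
#SBATCH --job-name=arr
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --time=00:10:00           # placeholder walltime
#SBATCH --array=1-3               # three tasks, numbered 1 through 3

# SLURM sets SLURM_ARRAY_TASK_ID to this task's index (1, 2, or 3);
# pass it to the MATLAB function so each task works on its own piece.
matlab -nodisplay -r "arr($SLURM_ARRAY_TASK_ID); exit"
```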
- Submit a job at a given time:
  - 1 minute before New Year's Day 2020: `sbatch --begin 2019-12-31T23:59:00 j1.sbat`
  - At the next 6pm: `sbatch --begin 18:00 j2.sbat`
- Submit a job after another job completes:
  - `JOBID=$(sbatch --parsable first.sbat)` (saves the first job's jobid)
  - `sbatch --dependency=afterany:$JOBID second.sbat`
An example using Trinity RNA-seq: `examples/trinity/`

The submission script `trinity.sbat` contains lots of boilerplate code to handle intermediate directories and files. If you find yourself writing complicated bash scripts like this, consider whether you should instead use a proper workflow manager such as Snakemake. See a minimal example of using Snakemake on the HPC.
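Once a SLURM profile is installed (see the Snakemake profile link above), launching a workflow is a single command. This is a sketch assuming the profile was installed under the name `slurm` and that `snakemake` itself is available, for example from a conda environment.

```bash
# Each rule becomes its own SLURM job; --jobs caps how many run at once
snakemake --profile slurm --jobs 10
```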
Rather than using the modules provided, I prefer to use conda to manage my software dependencies.
Download the latest installer for Anaconda (includes everything) or Miniconda (includes only the minimum, but faster to install).
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Run the installer:
bash Miniconda3-latest-Linux-x86_64.sh
I like to create a separate conda environment for each of my projects. Example:
Create a conda environment called `rstats` and install R and the tidyverse packages from the `r` channel:
conda create -n rstats -c r r r-tidyverse
Before submitting jobs for your project `rstats`, activate the environment:
conda activate rstats
The packages installed in `rstats` are then available for any jobs you submit while the environment is activated.
See the conda user guide for more details and this tutorial on using conda on the cluster.
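If you would rather have the job activate the environment itself (instead of relying on the environment being active when you run `sbatch`), the submission script can source conda's shell setup first. This is a sketch; the Miniconda path assumes the installer's default location, and `analysis.R` is a hypothetical script name.

```bash
#!/bin/bash
#SBATCH --job-name=rstats_job
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --time=00:30:00           # placeholder walltime

# Make `conda activate` available in this non-interactive shell
# (assumes Miniconda was installed at the default $HOME/miniconda3)
source "$HOME/miniconda3/etc/profile.d/conda.sh"
conda activate rstats

# R and the tidyverse from the rstats environment are now on PATH
Rscript analysis.R               # analysis.R is a hypothetical script name
```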
The nodes on Great Lakes don't have internet access by default. If your job needs internet access, put this line in your submission script after the SLURM options:
source /etc/profile.d/http_proxy.sh
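For example, a job that needs to download data might look roughly like this sketch; the job name, account, and URL are placeholders reused from the examples above.

```bash
#!/bin/bash
#SBATCH --job-name=needs_internet
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --time=00:05:00           # placeholder walltime

# Route outbound traffic through the cluster's HTTP proxy
# (must come after the SLURM options)
source /etc/profile.d/http_proxy.sh

# Downloads now work from inside the job
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
```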