Using the Great Lakes cluster and batch computing with SLURM
- Slides: Introduction to the Great Lakes cluster and batch computing with SLURM
- Slides: Advanced batch computing with SLURM on the Great Lakes cluster
- Slides: MPI profiling with Allinea MAP
- ARC-TS: Great Lakes overview
- ARC-TS: Great Lakes Cheat Sheet
- ARC-TS: SLURM user guide
- ARC-TS: Migrating from PBS-Torque to SLURM
- ARC-TS: Globus high-speed data transfer
- Kelly's example of using Snakemake on HPC
- Snakemake profile for SLURM
- conda on the cluster
Command | Action |
---|---|
`sbatch script.sh` | Submit `script.sh` as a job |
`squeue -j jobid` | Check job status by jobid |
`squeue -u uniqname` | Check job status by user's uniqname |
`scancel jobid` | Kill a job by jobid |
`my_usage uniqname` | List resource usage for user uniqname |
`sinfo` | Show node status by partition |
`scontrol show node node_name` | Show details for a node `node_name` |
`scontrol show job jobid` | Show details for a job by jobid |
`srun --pty --nodes=1 --cpus-per-task=4 --time=30:00 --account=training /bin/bash` | Run an interactive job |
`seff jobid` | Show total time and memory usage for a job (plus other things) |
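To tie these together, a typical submit-and-inspect sequence might look like the sketch below; the jobid `32965` is just a placeholder for whatever `sbatch` reports.

```bash
# Submit the script; sbatch replies with something like "Submitted batch job 32965"
sbatch script.sh

# Check the queue for your own jobs
squeue -u uniqname

# Inspect the job while it runs or after it finishes
scontrol show job 32965
seff 32965

# Cancel it if something went wrong
scancel 32965
```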
Note that everything on Great Lakes is on-demand. For memory, CPU, and GPU, you are charged for the resources you request, but only for the walltime your jobs actually use.

Take a look at the SLURM user guide from ARC-TS for a list of available options. Also see this guide for migrating your PBS-Torque scripts to SLURM.
These options go in your submission scripts (example). All lines with SLURM options start with `#SBATCH`. With the exception of the hashbang (`#!`), anything else starting with `#` is a comment.

More example files are in `/scratch/data/workshops/IntroGreatLakes/` on the beta login node (`beta-login.stage.arc-ts.umich.edu`).
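As a point of reference, a minimal submission script might look like the sketch below. The job name, account, partition, and resource values are placeholders; substitute the ones for your own allocation.

```bash
#!/bin/bash
#SBATCH --job-name=example        # name shown by squeue
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --partition=standard      # placeholder partition
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1g
#SBATCH --time=00:30:00           # walltime limit (HH:MM:SS)
#SBATCH --output=%x-%j.log        # %x = job name, %j = jobid

# Everything below the #SBATCH block is an ordinary shell script
echo "Running on $(hostname)"
```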
- Edit your R script, `Rbatch.R`, with your preferred text editor.
- Edit the submission script, `Rbatch.sh` (a sketch of what it might contain appears after this list).
- Load R and submit the job: `module load R`, then `sbatch Rbatch.sh`. It will tell you the jobid in a message: `Submitted batch job 32965`.
- Check on the status of your jobs: `squeue -u uniqname`
- When it finishes, take a look at the output from R: `less Rbatch.out`
- To troubleshoot problems, look at the SLURM log file: `less slurm-32965.out`, where `32965` is the jobid.
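Here is a sketch of what `Rbatch.sh` might contain, following the same pattern as the template above. The resource values are placeholders, and the R output is redirected to `Rbatch.out` so that the SLURM log (`slurm-<jobid>.out`) only holds scheduler and error messages.

```bash
#!/bin/bash
#SBATCH --job-name=Rbatch
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --time=00:10:00           # placeholder walltime
#SBATCH --mem=1g                  # placeholder memory request

# Run the R script non-interactively; its output goes to Rbatch.out
Rscript Rbatch.R > Rbatch.out
```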
The MATLAB script `arr.m` takes the array task id as input and works on only one task. The submission script `submit.sh` sets up the job array with three tasks and runs the MATLAB script once per task. To make a job array, use the sbatch directive `#SBATCH --array=1-3`. Edit the integers `1` and `3` to change the number of tasks in the array and the numbers they're assigned.

Submit the job with:
`module load matlab`
`sbatch submit.sh`
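Below is a sketch of what a job-array submission script like `submit.sh` might contain, assuming `arr.m` is written as a MATLAB function that takes the task index as its argument; the job name, account, and walltime are placeholders.

```bash
#!/bin/bash
#SBATCH --job-name=arr
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --time=00:10:00           # placeholder walltime
#SBATCH --array=1-3               # three tasks, numbered 1 through 3

# SLURM sets SLURM_ARRAY_TASK_ID to this task's index (1, 2, or 3);
# pass it to the MATLAB function so each task works on its own piece.
matlab -nodisplay -r "arr($SLURM_ARRAY_TASK_ID); exit"
```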
- Submit a job at a given time:
  - 1 minute before New Year's Day 2020: `sbatch --begin 2019-12-31T23:59:00 j1.sbat`
  - At the next 6pm: `sbatch --begin 18:00 j2.sbat`
- Submit a job after another job completes:
  - `JOBID=$(sbatch --parsable first.sbat)` (saves the first job's jobid)
  - `sbatch --dependency=afterany:$JOBID second.sbat`
An example using Trinity RNA-seq: `examples/trinity/`

The submission script `trinity.sbat` contains lots of boilerplate code to handle intermediate directories and files. If you find yourself writing complicated bash scripts like this, consider whether you should instead use a proper workflow manager such as Snakemake. See a minimal example of using Snakemake on the HPC.
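Once a SLURM profile is installed (see the Snakemake profile link above), launching a workflow is a single command. This is a sketch assuming the profile was installed under the name `slurm` and that `snakemake` itself is available, for example from a conda environment.

```bash
# Each rule becomes its own SLURM job; --jobs caps how many run at once
snakemake --profile slurm --jobs 10
```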
Rather than using the modules provided, I prefer to use conda to manage my software dependencies.
Download the latest installer for Anaconda (includes everything) or Miniconda (includes only the minimum, but faster to install).
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Run the installer:
bash Miniconda3-latest-Linux-x86_64.sh
I like to create a separate conda environment for each of my projects. Example:
Create a conda environment called `rstats` and install R and the tidyverse packages from the `r` channel:
conda create -n rstats -c r r r-tidyverse
Before submitting jobs for your project `rstats`, activate the environment:
conda activate rstats
The packages installed in `rstats` are then available for any jobs you submit while the environment is activated.
See the conda user guide for more details and this tutorial on using conda on the cluster.
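If you would rather have the job activate the environment itself (instead of relying on the environment being active when you run `sbatch`), the submission script can source conda's shell setup first. This is a sketch; the Miniconda path assumes the installer's default location, and `analysis.R` is a hypothetical script name.

```bash
#!/bin/bash
#SBATCH --job-name=rstats_job
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --time=00:30:00           # placeholder walltime

# Make `conda activate` available in this non-interactive shell
# (assumes Miniconda was installed at the default $HOME/miniconda3)
source "$HOME/miniconda3/etc/profile.d/conda.sh"
conda activate rstats

# R and the tidyverse from the rstats environment are now on PATH
Rscript analysis.R               # analysis.R is a hypothetical script name
```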
The nodes on Great Lakes don't have internet access by default. If your job needs internet access, put this line in your submission script after the SLURM options:
source /etc/profile.d/http_proxy.sh
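For example, a job that needs to download data might look roughly like this sketch; the job name, account, and URL are placeholders reused from the examples above.

```bash
#!/bin/bash
#SBATCH --job-name=needs_internet
#SBATCH --account=training        # placeholder; use your own account
#SBATCH --time=00:05:00           # placeholder walltime

# Route outbound traffic through the cluster's HTTP proxy
# (must come after the SLURM options)
source /etc/profile.d/http_proxy.sh

# Downloads now work from inside the job
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
```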