Snakemake can submit each of your jobs to the slurm scheduler for you! To enable this, you need to provide the --cluster
option to snakemake on the command line, and include all of the sbatch
information you normally put at the top of your submission files.
snakemake --cluster "sbatch -A CLUSTER_ACCOUNT -t CLUSTER_TIME -p CLUSTER_PARTITION -N CLUSTER_NODES -J JOBNAME" --jobs NUM_JOBS_TO_SUBMIT
Notes:
- Most clusters would prefer that you use an interactive session (or sbatch) to run this, so that you're not running anything on the login nodes. Since this process is only submitting jobs, you can run this command on tmux/screen on a login node, but only do it for a small number of jobs or you'll slow everyone up and your job will probably be killed by admin.
- The --jobs parameter allows snakemake to have up to NUM_JOBS_TO_SUBMIT jobs in the queue at any one time, but please be aware of the submission limits on your cluster. By default, snakemake will only submit jobs that are ready to run (i.e. their input files already exist). There is a parameter called --immediate-submit that will submit all jobs at once, but this can be a problem if the input files for a job are not yet available by the time it makes it through the scheduling queue (see the sketch just below this list).
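Here is a rough, untested sketch of how --immediate-submit is usually made to work with slurm. With this flag, snakemake can fill a {dependencies} placeholder in the --cluster command with the space-separated IDs of the jobs a new job depends on; a small wrapper script (called submit_with_deps.sh here purely for illustration) then converts those IDs into an sbatch --dependency argument and prints only the new job's numeric ID back to snakemake:
#!/bin/bash
# submit_with_deps.sh -- hypothetical wrapper for `snakemake --immediate-submit`.
# Snakemake calls it as: submit_with_deps.sh "<space-separated job IDs>" <jobscript>
# Add your usual sbatch options (account, time, partition, ...) as needed.
deps="$1"
shift
if [ -n "$deps" ]; then
    # turn "123 456" into --dependency=afterok:123:456
    sbatch --parsable --dependency="afterok:$(echo "$deps" | tr -s ' ' ':')" "$@"
else
    sbatch --parsable "$@"
fi
You would then run something like snakemake --cluster "./submit_with_deps.sh '{dependencies}'" --immediate-submit --notemp --jobs 100. (--notemp is often suggested alongside --immediate-submit, since snakemake exits right after submitting and can't clean up temp() files itself.) Treat this as a starting point to adapt, not a tested recipe.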
To save time you can also make a yml file containing your sbatch information, and tell snakemake where to find it.
Here's an example cluster configuration file:
# cluster_config.yml - cluster configuration
__default__:
    account: ctbrowngrp
    partition: bml
    time: 1-00:05:00 # time limit per job (1 day, 5 min; fmt = dd-hh:mm:ss)
    nodes: 1 # per job
    ntasks-per-node: 1 # per job
    chdir: /home/ctbrown/charcoal # working directory for batch script
    output: slurm-%j.out
    error: slurm-%j.err
snakemake will use the values in this file to fill in the parameters for job submission (see the command line below, where cluster.time will be filled in as 1-00:05:00 from the above config file). Note that even if you tell snakemake where to find this file, it won't use all of these parameters to submit each job - it will only use the ones you specify in the sbatch portion of your --cluster statement.
Here, ctbrowngrp should correspond to the buy-in account. The partitions (queues) are described here.
To run snakemake so that it submits jobs with parameters from the config file, run snakemake like so:
snakemake --cluster "sbatch -A {cluster.account} -t {cluster.time} \
-p {cluster.partition} -N {cluster.nodes}" \
--cluster-config cluster_config.yml --jobs 8
The values within the {} are the parameters that snakemake will read from the cluster_config.yml file. Here we are telling snakemake to run up to 8 jobs simultaneously.
In the configuration file above, I only have information for __default__, which will be used as the default for every rule. If you want to set specific time limits for each rule (or just some rules), you can add that info to the file. For example, if I have a rule called trimmomatic_raw, I could add the following to my cluster_config.yml file to specify some different cluster parameters for that rule.
trimmomatic_raw:
    time: 00:45:00 # time limit for this rule only
For non-slurm clusters, you can change the cluster
command to
reflect the scheduling service your cluster uses. See snakemake's
documentation for examples.
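For example, on a PBS/Torque-style cluster the command might look something like this (a sketch only; the exact qsub options, queue names, and resource syntax vary between sites, so check your cluster's documentation):
snakemake --cluster "qsub -A CLUSTER_ACCOUNT -q CLUSTER_QUEUE -l walltime=01:00:00 -l nodes=1:ppn=1" --jobs 8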
In some cases, when a slurm job is preempted (e.g. by higher-priority jobs), snakemake sees it as having failed and stops execution of all subsequent jobs. To handle such cases gracefully, snakemake allows a custom script to check job status via --cluster-status. It accepts a script that is called with the job ID and prints one of three statuses: success, running, or failed. Below is an example python script that works with Farm:
#!/usr/bin/python
import sys, time, subprocess

jobid = sys.argv[1]

def check_state(output):
    # Tell snakemake to wait if the job has one of these states
    running_status = ["PENDING", "CONFIGURING", "COMPLETING", "RUNNING", "SUSPENDED", "PREEMPTED"]
    if "COMPLETED" in output:
        print("success")
    elif any(r in output for r in running_status):
        print("running")
    else:
        print("failed")

attempts = 5
delay = 3  # wait time (s) between attempts

for i in range(attempts):
    try:
        output = str(subprocess.check_output("sacct -j %s --format State --noheader | head -1 | awk '{print $1}'" % jobid, shell=True, universal_newlines=True).strip())
        if output:
            check_state(output)
            exit(0)
    except Exception as e:
        # sacct sometimes fails with communication errors, or returns nothing for the
        # first ~10s after a job submission; in that case, scontrol can be used instead
        print("sacct error: " + str(e), file=sys.stderr)
        try:
            output = subprocess.run("scontrol show job -o " + str(jobid), capture_output=True, shell=True, universal_newlines=True)
            if output.stdout:
                info = {i.split('=')[0]: i.split('=')[1] for i in output.stdout.strip().split(' ')}
                check_state(info['JobState'])
                exit(0)
        except Exception as e:
            # a job ID only stays visible to scontrol for an unspecified (possibly
            # nondeterministic) time after completion; in that case, sacct is appropriate
            print(e, file=sys.stderr)
    time.sleep(delay)
    if i >= attempts - 1:
        print("failed")
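Assuming you save the script above as status.py (the name is arbitrary) and make it executable, you could then wire it up with something like the following; note the --parsable flag, which makes sbatch print just the job ID (rather than "Submitted batch job ...") so that snakemake can pass it to the status script as its first argument:
chmod +x status.py
snakemake --cluster "sbatch --parsable -A {cluster.account} -t {cluster.time} -p {cluster.partition} -N {cluster.nodes}" \
    --cluster-config cluster_config.yml --cluster-status ./status.py --jobs 8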
For example, here is how you might run snakemake from an interactive session (or in tmux/screen on a login node):
source ~/.bashrc
conda activate snakemake
cd /home/ntpierce/2019-burgers-shrooms/orthofinder_work
snakemake -s diamond_blast.snakefile --use-conda --cluster "sbatch -t 0:30:00 -N 1 -c 14 -J dmnd --mem=30gb " --jobs 5
Alternatively, here is an sbatch submission script that runs the same snakemake command, so the whole workflow is managed from a batch job rather than from a login node:
#!/bin/bash -login
#SBATCH -D /home/ntpierce/2019-burgers-shrooms/orthofinder_work
#SBATCH -J dmnd_snake
#SBATCH -t 3-0:00:00
#SBATCH -N 1
#SBATCH --output /home/ntpierce/2019-burgers-shrooms/orthofinder_work/dmnd_snake-%j.out
#SBATCH --error /home/ntpierce/2019-burgers-shrooms/orthofinder_work/dmnd-snake-%j.err
# activate conda in general
source /home/ntpierce/.bashrc # if you have the conda init setting
# activate a specific conda environment, if you so choose
conda activate snakemake
# go to a particular directory
cd /home/ntpierce/2019-burgers-shrooms/orthofinder_work
# make things fail on errors
set -o nounset
set -o errexit
set -x
### run your commands here!
snakemake -s diamond_blast.snakefile --use-conda --cluster "sbatch -t 0:30:00 -N 1 -c 14 -J dmnd --mem=30gb " --jobs 5
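If you save the submission script above as, say, run_dmnd_snake.sh (the name is arbitrary), you would kick everything off with:
sbatch run_dmnd_snake.sh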
A simple fully functioning example for the farm cluster is here.
Here's a carpentries tutorial you might find helpful. Note that this tutorial uses a json-formatted cluster configuration file. json and yaml files are read identically by snakemake, but I find yaml to be more human-friendly! You can use either.
Take a look at the snakemake documentation for cluster execution here.
Tessa has written a nice blogpost about using Snakemake Profiles, too.