I do not understand how to correctly specify parameters for snakemake on a SLURM cluster. I tried submitting the following SLURM file, but it does not work this way and the number of cores used is only 1, not 20:
```
#!/bin/bash
#SBATCH -p standard
#SBATCH -A overall
#SBATCH --time=12:00:00
#SBATCH --output=snakemake%A.out
#SBATCH --error=snakemake%A.err
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=120000
snakemake
```

Then I tried following the snakemake tutorial and created cluster.json based on the SLURM parameters that I need:
```
{
    "__default__" : {
        "A" : "overall",
        "time" : "24:00:00",
        "nodes": 1,
        "ntasks": 1,
        "cpus" : 20,
        "p" : "standard",
        "mem": 120000,
        "output": "snakemake%A.out",
        "error": "snakemake%A.err"
    }
}
```

And ran snakemake inside a newly created snakemake.sh script:
```
#!/bin/bash
snakemake -j 999 --cluster-config cluster.json --cluster "sbatch -A {cluster.A} -p {cluster.p} \
-t {cluster.time} --output {cluster.output} --error {cluster.error} --nodes {cluster.nodes} \
--ntasks {cluster.ntasks} --cpus-per-task {cluster.cpus} --mem {cluster.mem}"
```

And now it gives me an error:
```
sbatch: error: Unable to open file
/bin/sh: line 1: -t: command not found
Error submitting jobscript (exit code 127):
```
I am now completely lost as to what I should actually do. I would prefer plain, regular .slurm file submission, but how do I make snakemake use that? Any suggestions would be greatly appreciated.
I removed the `\` line separators in the snakemake.sh script:
```
#!/bin/bash
snakemake -j 10 --cluster-config cluster.json --cluster "sbatch -A {cluster.A} -p {cluster.p} -t {cluster.time} --output {cluster.output} --error {cluster.error} --nodes {cluster.nodes} --ntasks {cluster.ntasks} --cpus-per-task {cluster.cpus} --mem {cluster.mem}"
```

And it started to run. It is not convenient for me, though. I would rather submit just one job using a .slurm file, passing all of the parameters via #SBATCH. Is that possible?
- What does the %A mean? – user1075, Feb 27, 2019
- The username who is in charge of the cluster account's computing resources and their payment. You can be this user, but it can also be another person, depending on who is paying for the cluster. – Nikita Vlasenko, Feb 27, 2019
- Can you use {rule} and {wildcards} in the output and error names in the cluster.json to name the output and error files of a job? So instead of "output": "snakemake%A.out", have "output": "snakemake{rule}_{wildcards}.out"? – user1075, Mar 9, 2019
- Good question. I don't know, but I would also want to learn that, since it simplifies things. – Nikita Vlasenko, Mar 9, 2019
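Regarding the {rule}/{wildcards} question in the comments: the (now legacy) `--cluster-config` mechanism does, as far as I know, substitute `{rule}` and `{wildcards}` placeholders inside cluster-config strings, so per-job log names should be possible. An untested sketch, reusing the cluster.json from the question (note the memory value also carries a unit here, as suggested in the accepted answer):

```
{
    "__default__" : {
        "A" : "overall",
        "time" : "24:00:00",
        "nodes": 1,
        "ntasks": 1,
        "cpus" : 20,
        "p" : "standard",
        "mem": "120G",
        "output": "logs/{rule}_{wildcards}.out",
        "error": "logs/{rule}_{wildcards}.err"
    }
}
```

One caveat: create the `logs/` directory before running snakemake (`mkdir -p logs`), since SLURM will not create missing output directories for you.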
2 Answers
You can omit --nodes; you need the following:
```
#!/bin/bash
#SBATCH --ntasks-per-node=1
#SBATCH -c threads_from_snakemake
#SBATCH -p some_partition

Commands go here
```

For SLURM, you may want to modify my SlurmEasy script. We use that with snakemake all the time, usually of the form:
```
snakemake -c "SlurmEasy -t {threads} -n {rule} --mem-per-cpu {cluster.memory}" ...
```

You'll also want units with your memory requests, like 12G. Also, in general it's best not to submit snakemake itself to the cluster; rather, run it on an interactive node and have it submit jobs on your behalf (if the admins complain, point out that snakemake is barely using any resources on the head node).
While I agree with not submitting snakemake to the cluster, it's not a great solution to make your pilot program run interactively: automation via batch script should not be dependent on an open shell.
How about allocating a whole node with many cores and working within the GNU-parallel-like abilities of snakemake?
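A minimal sketch of that approach, reusing the partition/account names from the question: request one whole node with sbatch and let snakemake parallelize locally across the allocated CPUs, instead of submitting one cluster job per rule. This also explains the original symptom: without `-j`/`--cores`, snakemake defaults to a single core, regardless of what the #SBATCH lines request.

```
#!/bin/bash
#SBATCH -p standard
#SBATCH -A overall
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=120G
#SBATCH --output=snakemake%A.out
#SBATCH --error=snakemake%A.err

# Run snakemake locally inside the allocation, using all CPUs that
# SLURM granted to this job rather than the default of 1.
snakemake --cores "$SLURM_CPUS_PER_TASK"
```

The trade-off is that the whole workflow is capped at one node's resources, whereas the `--cluster` approach can fan rules out across the cluster.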