I do not understand how to correctly specify parameters for snakemake on a SLURM cluster. I tried submitting the following SLURM file, but it does not work this way and the number of cores used is only 1, not 20:
```
#!/bin/bash
#SBATCH -p standard
#SBATCH -A overall
#SBATCH --time=12:00:00
#SBATCH --output=snakemake%A.out
#SBATCH --error=snakemake%A.err
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=120000
snakemake
```

Then I tried following the snakemake tutorial and created cluster.json based on the SLURM parameters that I need:
```
{
    "__default__" : {
        "A" : "overall",
        "time" : "24:00:00",
        "nodes": 1,
        "ntasks": 1,
        "cpus" : 20,
        "p" : "standard",
        "mem": 120000,
        "output": "snakemake%A.out",
        "error": "snakemake%A.err"
    }
}
```

And ran snakemake inside a newly created snakemake.sh script:
```
#!/bin/bash
snakemake -j 999 --cluster-config cluster.json --cluster "sbatch -A {cluster.A} -p {cluster.p} \
-t {cluster.time} --output {cluster.output} --error {cluster.error} --nodes {cluster.nodes} \
--ntasks {cluster.ntasks} --cpus-per-task {cluster.cpus} --mem {cluster.mem}"
```

And now it gives me an error:
```
sbatch: error: Unable to open file
/bin/sh: line 1: -t: command not found
Error submitting jobscript (exit code 127):
```
I am now completely lost as to what I should actually do. I would prefer plain, regular .slurm file submission, but how do I make snakemake use that? Any suggestions would be greatly appreciated.
I removed the `\` line separators in the snakemake.sh script:
```
#!/bin/bash
snakemake -j 10 --cluster-config cluster.json --cluster "sbatch -A {cluster.A} -p {cluster.p} -t {cluster.time} --output {cluster.output} --error {cluster.error} --nodes {cluster.nodes} --ntasks {cluster.ntasks} --cpus-per-task {cluster.cpus} --mem {cluster.mem}"
```

And it started to run. It is not convenient for me, though. I would rather submit just one job using a .slurm file, passing all of the parameters via #SBATCH. Is that possible?
- What does the %A mean? – user1075, Feb 27, 2019
- The username who is in charge of the cluster account's computing resources and their payment. You can be this user, but it can also be another person, depending on who is paying for the cluster. – Nikita Vlasenko, Feb 27, 2019
- Can you use {rule} and {wildcards} in the output and error names in the cluster.json to name the output and error files of a job? So instead of "output": "snakemake%A.out", have "output": "snakemake{rule}_{wildcards}.out"? – user1075, Mar 9, 2019
- Good question. I don't know, but I would also want to learn that, since it simplifies things. – Nikita Vlasenko, Mar 9, 2019
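Regarding the {rule}/{wildcards} question in the comments: the (now legacy) `--cluster-config` mechanism does, as far as I know, substitute `{rule}` and `{wildcards}` placeholders inside cluster-config strings, so per-job log names should be possible. An untested sketch, reusing the cluster.json from the question (note the memory value also carries a unit here, as suggested in the accepted answer):

```
{
    "__default__" : {
        "A" : "overall",
        "time" : "24:00:00",
        "nodes": 1,
        "ntasks": 1,
        "cpus" : 20,
        "p" : "standard",
        "mem": "120G",
        "output": "logs/{rule}_{wildcards}.out",
        "error": "logs/{rule}_{wildcards}.err"
    }
}
```

One caveat: create the `logs/` directory before running snakemake (`mkdir -p logs`), since SLURM will not create missing output directories for you.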
2 Answers
You can omit --nodes; you need the following:
```
#!/bin/bash
#SBATCH --ntasks-per-node=1
#SBATCH -c threads_from_snakemake
#SBATCH -p some_partition

Commands go here
```

For SLURM, you may want to modify my SlurmEasy script. We use that with snakemake all the time, usually of the form:
```
snakemake -c "SlurmEasy -t {threads} -n {rule} --mem-per-cpu {cluster.memory}" ...
```

You'll also want units with your memory requests, like 12G. Also, in general it's best not to submit snakemake itself to the cluster; rather, run it on an interactive node and have it submit jobs on your behalf (if the admins complain, point out that snakemake is barely using any resources on the head node).
While I agree with not submitting snakemake to the cluster, it's not a great solution to make your pilot program run interactively: automation via batch script should not be dependent on an open shell.
How about allocating a whole node with many cores and working within the GNU-parallel-like abilities of snakemake?
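A minimal sketch of that approach, reusing the partition/account names from the question: request one whole node with sbatch and let snakemake parallelize locally across the allocated CPUs, instead of submitting one cluster job per rule. This also explains the original symptom: without `-j`/`--cores`, snakemake defaults to a single core, regardless of what the #SBATCH lines request.

```
#!/bin/bash
#SBATCH -p standard
#SBATCH -A overall
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=120G
#SBATCH --output=snakemake%A.out
#SBATCH --error=snakemake%A.err

# Run snakemake locally inside the allocation, using all CPUs that
# SLURM granted to this job rather than the default of 1.
snakemake --cores "$SLURM_CPUS_PER_TASK"
```

The trade-off is that the whole workflow is capped at one node's resources, whereas the `--cluster` approach can fan rules out across the cluster.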