How to get the number of total scattergather items in Snakemake scatter/gather?
Question:
I’m trying out Snakemake’s built-in scatter/gather support but am stumbling over how to get the total number of splits that is configured.
The documentation doesn’t mention how I can access that value, whether it is defined in the workflow or passed through the CLI.
The docs say I should define a scattergather directive:
scattergather:
    split=8
But how do I get the value of split (which is 8 in this case) inside my split rule, where I would assign it to params.split_total?
rule split:
    input: "input.txt"
    output: scatter.split("splitted/{scatteritem}.txt")
    params: split_total = config["scattergather"]["split"]
    shell: "split -l {params.split_total} input"
This fails with a KeyError:
KeyError in line 48 of /Users/corneliusromer/code/ncov-ingest/workflow/snakemake_rules/curate.smk:
'scattergather'
Am I missing something obvious?
Answers:
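The setting can be read through the workflow object's internal attribute workflow._scatter (it is underscore-prefixed, so it is not part of the public API and may change between Snakemake versions):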
scattergather:
    split=8
# downstream rule reads the value from the workflow object
rule split:
    input: "input.txt"
    output: scatter.split("splitted/{scatteritem}.txt")
    params: split_total = workflow._scatter["split"]
    shell: "split -l {params.split_total} {input}"
The value changes accordingly when the CLI option --set-scatter is provided.
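Since workflow._scatter is internal, it may be worth wrapping the lookup so that a missing attribute or a missing name fails with a clear message. Below is a minimal sketch in plain Python. The helper name scatter_total and the SimpleNamespace stand-in are hypothetical and only there so the snippet runs outside a Snakefile; the one thing assumed about Snakemake is that _scatter is a dict mapping the process name to the number of items, as used above.

```python
from types import SimpleNamespace


def scatter_total(workflow_obj, name, default=None):
    """Return the configured number of scatter items for `name`.

    Assumes the workflow object keeps its scattergather settings in a dict
    on the internal attribute `_scatter`; falls back to `default` if the
    attribute or the name is missing.
    """
    settings = getattr(workflow_obj, "_scatter", None) or {}
    if name in settings:
        return int(settings[name])
    if default is not None:
        return default
    raise KeyError(f"no scattergather process named {name!r}")


# Hypothetical stand-in for Snakemake's workflow object, used only for this demo.
fake_workflow = SimpleNamespace(_scatter={"split": 8})
print(scatter_total(fake_workflow, "split"))
```

Inside a Snakefile the same helper could be called as params: split_total = scatter_total(workflow, "split").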
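Alternatively, keep the number in a plain Python variable. The snippet below hard-codes the value, but any valid way of setting or obtaining it in Python will work. Note that, as far as I can tell, a plain variable will not pick up an override passed with --set-scatter: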
# python variable/label
split_total = 8
scattergather:
    split=split_total
# downstream rule can refer to the python variable
rule split:
    input: "input.txt"
    output: scatter.split("splitted/{scatteritem}.txt")
    params: split_total = split_total
    shell: "split -l {params.split_total} {input}"
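One caveat about the shell command itself: split -l takes the number of lines per output file, not the number of files, so passing the split total there will not yield exactly that many chunks. As a rough sketch of what the rule could do instead, the plain-Python snippet below distributes lines over a fixed number of chunks. The function names are hypothetical, and the "1-of-N" naming is my assumption about how Snakemake fills in {scatteritem}; check it against the output files Snakemake actually requests.

```python
def scatter_item_names(total):
    """Names assumed for {scatteritem}: "1-of-N" up to "N-of-N"."""
    return [f"{i}-of-{total}" for i in range(1, total + 1)]


def split_round_robin(lines, total):
    """Distribute lines over `total` chunks; line i goes to chunk i % total."""
    chunks = [[] for _ in range(total)]
    for index, line in enumerate(lines):
        chunks[index % total].append(line)
    return chunks


# Small in-memory demo: 10 lines spread over 4 chunks.
lines = [f"record {i}\n" for i in range(10)]
chunks = split_round_robin(lines, 4)
for name, chunk in zip(scatter_item_names(4), chunks):
    print(f"splitted/{name}.txt", len(chunk))
```

In a real rule, each chunk would be written to the matching path from output, for example in a run: block or a small script.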