Prevent rules from rerunning when intermediate file is updated
Question:
Let’s say I have two rules in my Snakemake file:
- The first rule fetches a remote file and makes a temporary local copy
- The second rule uses the local file and performs an expensive task
Now let’s say I ran this pipeline to completion and then wanted to add a third rule and re-run the pipeline.
- The third rule uses the same local file and performs a different task
Is there a way I can run this updated pipeline without rerunning rule #2? The issue is that when I attempt to complete rule #3, rule #1 is triggered and then rule #2 wants to re-run because the intermediate local file has been updated.
I know that techniques like using touch or ancient exist, but I’m not sure how, or even if, they can apply here. Is there a way to specifically tag rule #1 as not making an update?
Answers:
Wrapping the input files for rules 2 and 3 in ancient should prevent them from reacting to file updates. Something like this:
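# Rule 1: fetch the remote file and write the local copy.
rule a:
    output: 'a.txt'
    shell: 'curl some_url > {output}'

# Rule 2: ancient() makes Snakemake ignore the timestamp of a.txt,
# so a fresh download does not mark this rule as out of date.
rule b:
    input: ancient('a.txt')
    # do something

# Rule 3: the new rule, reading the same local file.
rule c:
    input: ancient('a.txt')
    # do something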
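To see why this helps, here is a rough Python sketch of a timestamp-based rerun check. It is only a simplified model and not Snakemake's actual implementation: the needs_rerun function, its ancient parameter, and the file names are made up for illustration. The idea is that a re-fetched a.txt is newer than the output of rule b, which normally marks rule b as stale, while an input flagged as ancient has its timestamp ignored.

```python
import os
import tempfile


def needs_rerun(input_path, output_path, ancient=False):
    """Simplified model of an mtime-based rerun check (hypothetical helper)."""
    if not os.path.exists(output_path):
        return True   # a missing output always has to be built
    if ancient:
        return False  # ancient input: its timestamp is ignored
    # Otherwise the rule is stale when the input is newer than the output.
    return os.path.getmtime(input_path) > os.path.getmtime(output_path)


with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a.txt")  # stands in for the file fetched by rule a
    b = os.path.join(tmp, "b.txt")  # stands in for the output of rule b
    for path in (a, b):
        with open(path, "w") as fh:
            fh.write("data\n")

    os.utime(b, (1_000, 1_000))  # rule b finished first
    os.utime(a, (2_000, 2_000))  # rule a re-fetched the file afterwards

    plain = needs_rerun(a, b)
    with_ancient = needs_rerun(a, b, ancient=True)
    print(plain, with_ancient)  # prints: True False
```

Note that in this model a missing output still triggers a run even with ancient set, so the new rule c gets built once while rule b is left alone.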