Prevent rules from rerunning when intermediate file is updated

Question:

Let’s say I have two rules in my Snakemake file:

  1. The first rule fetches a remote file and makes a temporary local copy
  2. The second rule uses the local file and performs an expensive task

Now let’s say I ran this pipeline to completion and then wanted to add a third rule and re-run the pipeline:

  3. The third rule uses the same local file and performs a different task

Is there a way I can run this updated pipeline without rerunning rule #2? The issue is that when I attempt to complete rule #3, rule #1 is triggered and then rule #2 wants to re-run because the intermediate local file has been updated.

I know that techniques like using touch or ancient exist, but I’m not sure how or even if they can apply here. Is there a way to specifically tag rule #1 as not making an update?

Asked By: ScottMastro


Answers:

Wrapping the input files of rules 2 and 3 in ancient() tells Snakemake to ignore their modification times, so an updated file should no longer trigger those rules. Something like this:

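rule a:
    output: 'a.txt'
    shell: 'curl some_url > {output}'

rule b:
    # ancient() makes Snakemake treat a.txt as older than any output,
    # so re-fetching it does not mark this rule as out of date
    input: ancient('a.txt')
    # do something

rule c:
    input: ancient('a.txt')
    # do something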
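
To picture what ancient changes, here is a rough Python sketch of a timestamp-based rerun check. It is a simplified model, not Snakemake's actual implementation: the function needs_rerun, its ancient parameter, and the file b.txt are hypothetical names used only for illustration.

```python
import os
import tempfile
import time


def needs_rerun(input_path, output_path, ancient=False):
    """Simplified model of a timestamp-based rerun check (not Snakemake's code)."""
    if not os.path.exists(output_path):
        return True  # a missing output always has to be produced
    if ancient:
        return False  # an ancient input is assumed older than any output
    # Otherwise the rule is stale when the input is newer than the output
    return os.path.getmtime(input_path) > os.path.getmtime(output_path)


with tempfile.TemporaryDirectory() as workdir:
    a_txt = os.path.join(workdir, "a.txt")  # local copy fetched by rule a
    b_txt = os.path.join(workdir, "b.txt")  # hypothetical output of rule b
    for path in (a_txt, b_txt):
        with open(path, "w") as handle:
            handle.write("data\n")

    now = time.time()
    os.utime(b_txt, (now - 100, now - 100))  # b.txt was produced earlier
    os.utime(a_txt, (now, now))  # a.txt was re-fetched afterwards

    plain_result = needs_rerun(a_txt, b_txt)
    ancient_result = needs_rerun(a_txt, b_txt, ancient=True)
    print(plain_result, ancient_result)
```

In this model the re-fetched a.txt makes the plain check report that the expensive rule is stale, while the ancient variant skips the timestamp comparison and leaves the existing output alone. A missing output still triggers the rule in both cases.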
Answered By: SultanOrazbayev