Snakemake scatter-gather with wildcard AmbiguousRuleException

Question:

I am having trouble with Snakemake's scatter-gather feature. The documentation is fairly basic, and I modified my code according to what is described in this link:

rule fastq_fasta:
    input:rules.trimmomatic.output.out_file
    output:"data/trimmed/{sample}.fasta"
    shell:"sed -n '1~4s/^@/>/p;2~4p' {input} > {output}"

rule split:
    input:
        "data/trimmed/{sample}.fasta"
    params:
        scatter_count=config["scatter_count"],
        scatter_item = lambda wildcards: wildcards.scatteritem
    output:
        temp(scatter.split("data/trimmed/{{sample}}_{scatteritem}.fasta"))
    script:
        "scripts/split_files.py"
        
rule process:
    input:"data/trimmed/{sample}_{scatteritem}.fasta"
    output:"data/processed/{sample}_{scatteritem}.csv"
    script:
        "scripts/process.py"

rule gather:
    input:
        gather.split("data/processed/{{sample}}_{scatteritem}.csv")
    output:
        "data/processed/{sample}.csv"
    shell:
        "cat {input} > {output}"

I added the wildcard option, but I got:

AmbiguousRuleException: Rules fastq_to_fasta (which is the previous rule) and split are ambiguous for the file data/trimmed/Ornek_411-of-81-of-81-of-81-of-81-of-81-of-81-of-81-of-81-of-8.fasta

I tried lots of things, but either the rules are not called or I get an AmbiguousRuleException. What am I missing? Can someone help?

Asked By: sahin

||

Answers:

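The error arises because two rules can generate the same file. The output pattern of fastq_fasta, data/trimmed/{sample}.fasta, also matches every scattered file such as data/trimmed/{sample}_1-of-8.fasta (the scatter suffix is simply absorbed into the sample wildcard), so Snakemake cannot decide between fastq_fasta and split. An easy fix (if feasible) is to write the scattered items to a different path, here data/trimmed_scatter/ instead of data/trimmed/.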
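To see why the two output patterns collide, here is a minimal Python sketch that imitates wildcard matching outside of Snakemake. It assumes the default behaviour in which a wildcard matches the regular expression `.+`. The helpers `pattern_to_regex` and `matching_rules` are hypothetical and written only for this illustration, not part of Snakemake's API. The sample name `Ornek_4` and the scatter item `1-of-8` are guesses based on the file name in the error message.

```python
import re


def pattern_to_regex(pattern):
    """Hypothetical helper: turn 'dir/{name}.ext' into a compiled regex
    in which every {wildcard} matches '.+' (assumed Snakemake default)."""
    # re.split with one capture group alternates: literal, wildcard name, literal, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)       # literal text of the path
        else:
            regex += f"(?P<{part}>.+)"     # wildcard placeholder
    return re.compile(regex)


def matching_rules(outputs, target):
    """Return {rule name: wildcard values} for every rule whose output
    pattern matches the whole target path."""
    matches = {}
    for rule, pattern in outputs.items():
        found = pattern_to_regex(pattern).fullmatch(target)
        if found:
            matches[rule] = found.groupdict()
    return matches


# Output patterns as in the question; scatter.split() has already filled in
# {scatteritem}, so only {sample} remains (one scatter item shown).
original = {
    "fastq_fasta": "data/trimmed/{sample}.fasta",
    "split": "data/trimmed/{sample}_1-of-8.fasta",
}
# Same rules, with the scattered files moved to their own directory.
fixed = {
    "fastq_fasta": "data/trimmed/{sample}.fasta",
    "split": "data/trimmed_scatter/{sample}_1-of-8.fasta",
}

original_matches = matching_rules(original, "data/trimmed/Ornek_4_1-of-8.fasta")
fixed_matches = matching_rules(fixed, "data/trimmed_scatter/Ornek_4_1-of-8.fasta")

print(original_matches)  # both rules match, with different values for sample
print(fixed_matches)     # only split matches
```

In the original layout both rules claim the file: fastq_fasta with sample set to `Ornek_4_1-of-8`, and split with sample set to `Ornek_4`. Once the scattered files live in their own directory, only split matches. The rules below apply that change.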

rule split:
    input:
        "data/trimmed/{sample}.fasta"
    params:
        scatter_count=config["scatter_count"],
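        # No scatter_item param here: scatter.split() expands {scatteritem} in the
        # output, so this rule only has the {sample} wildcard and
        # wildcards.scatteritem is not defined for it.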
    output:
        temp(scatter.split("data/trimmed_scatter/{{sample}}_{scatteritem}.fasta"))
    script:
        "scripts/split_files.py"
        
rule process:
    input:"data/trimmed_scatter/{sample}_{scatteritem}.fasta"
    output:"data/processed/{sample}_{scatteritem}.csv"
    script:
        "scripts/process.py"
Answered By: SultanOrazbayev