How to do a task after scraping all the pages of website using Scrapy-Python

Question:

I want to perform a task after my scraper has finished scraping all the anchors on a website's home page. However, the print statement executes before `parse_detail` has processed all the pages.

Any help would be appreciated. Thanks in advance.

    def parse_site(self, response):
        next_links = response.css('a::attr(href)').getall()

        for next_link in next_links:
            yield response.follow(next_link, callback=self.parse_detail)
        # Runs as soon as all requests are scheduled, not after they finish
        print("Task after completion of all pages")

    def parse_detail(self, response):
        print("@@@@@@@@@@@@@@@@@GETTING HERE################")
        all_content = response.xpath('//body').extract()
        print("###############")
        print(response.url)
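The print runs early because `yield response.follow(...)` only hands a Request to Scrapy's scheduler; the `parse_detail` callbacks are executed later, asynchronously. A minimal pure-Python sketch of that ordering (no Scrapy involved; the driver loop at the bottom is a hypothetical stand-in for Scrapy's engine):

```python
# Demonstrates why code after the yield loop runs before the callbacks:
# yielding only *schedules* work; the engine executes callbacks afterwards.

events = []

def parse_site(links):
    for link in links:
        # In Scrapy this would be `yield response.follow(link, callback=...)`
        yield ("request", link)
    # This runs as soon as the generator is exhausted, i.e. once all
    # requests are *scheduled* -- not once their callbacks have finished.
    events.append("after-loop print")

def parse_detail(link):
    events.append(f"callback:{link}")

# Hypothetical stand-in for Scrapy's engine: first drain the generator
# (collecting the scheduled requests), then run each callback.
requests = list(parse_site(["a.html", "b.html"]))
for _, link in requests:
    parse_detail(link)

print(events)
# → ['after-loop print', 'callback:a.html', 'callback:b.html']
```

The "after-loop print" lands first, which mirrors the behaviour the question describes.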
Asked By: Rupal Shah


Answers:

You can add the method `closed` to your spider, which Scrapy calls after your spider is done. However, you cannot yield any more items from this method. See the Scrapy docs.

def closed(self, reason):
    # do something here.
    pass
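To show where `closed` fits in the spider lifecycle, here is a small pure-Python sketch. Scrapy is not imported; `run_fake_engine` is a hypothetical driver that only mimics the order in which Scrapy invokes these methods during a real crawl:

```python
# Sketch of the spider lifecycle: callbacks for every scheduled request
# run first, and `closed` is invoked exactly once at the very end.

calls = []

class MySpider:
    def parse_site(self, response):
        # In a real spider this would yield requests; their callbacks
        # run later, driven by the engine.
        calls.append("parse_site")

    def parse_detail(self, response):
        calls.append("parse_detail")

    def closed(self, reason):
        # Called once after the last request has finished.
        # Note: you cannot yield items from here.
        calls.append(f"closed:{reason}")

def run_fake_engine(spider):
    # Hypothetical stand-in for Scrapy's engine: run the start callback,
    # drain the queued detail callbacks, then call closed() -- the same
    # order Scrapy guarantees for a real crawl.
    spider.parse_site(None)
    for _ in range(2):
        spider.parse_detail(None)
    spider.closed("finished")

run_fake_engine(MySpider())
print(calls)
# → ['parse_site', 'parse_detail', 'parse_detail', 'closed:finished']
```

So the "task after all pages" code from the question belongs in `closed`, not after the `yield` loop.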
Answered By: Felix Eklöf