Scrapy CrawlerRunner: Output missing

Question

I have been using the method described on Stack Overflow (https://stackoverflow.com/a/43661172/5037146) to run Scrapy from a script using CrawlerRunner, which allows the process to be restarted.

However, I don’t get any console logs when running the process through CrawlerRunner, whereas when I use CrawlerProcess, it outputs the status and progress.

Code is available online: https://colab.research.google.com/drive/14hKTjvWWrP–h_yRqUrtxy6aa4jG18nJ

Asked By: Aerodynamic

Answer 1

With CrawlerRunner you need to set up logging manually, which you can do with configure_logging(). See https://docs.scrapy.org/en/latest/topics/practices.html#run-scrapy-from-a-script

Answered By: Gallaecio

Answer 2

When you use CrawlerRunner, you have to configure logging manually.
You can do this with the scrapy.utils.log.configure_logging function.

For example:

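from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from twisted.internet import reactor

from my_spider import MySpider

# CrawlerRunner does not set up logging itself, so install a handler first.
configure_logging({"LOG_FORMAT": "%(levelname)s: %(message)s"})

runner = CrawlerRunner()
d = runner.crawl(MySpider)  # schedules the crawl and returns a Deferred
d.addBoth(lambda _: reactor.stop())  # stop the reactor once the crawl ends
reactor.run()  # blocks until the crawl has finished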

Answered By: Alon Barad
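
As background on why the output goes missing: Scrapy logs through Python's standard logging module, and most of its status messages are INFO or DEBUG records. Until something installs a handler and lowers the level, such records are dropped. The sketch below uses only the standard library, not Scrapy; the logger names "demo" and "demo.spider" and the capture() helper are made up for illustration, with "demo" standing in for the root logger that configure_logging() would set up.

```python
import io
import logging


def capture(install_handler: bool) -> str:
    """Log one INFO record and return whatever reached the handler's stream."""
    # "demo" plays the role of the root logger so that this sketch does not
    # disturb any real logging configuration (hypothetical name).
    parent = logging.getLogger("demo")
    parent.propagate = False
    parent.handlers.clear()
    parent.setLevel(logging.WARNING)  # same default level as the root logger

    stream = io.StringIO()
    if install_handler:
        # Roughly the effect of configuring logging by hand: attach a
        # handler with a format and allow INFO records through.
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        parent.addHandler(handler)
        parent.setLevel(logging.INFO)

    # Child logger, comparable to a library's module-level logger.
    logging.getLogger("demo.spider").info("Spider opened")
    return stream.getvalue()


print(repr(capture(install_handler=False)))  # ''
print(repr(capture(install_handler=True)))   # 'INFO: Spider opened\n'
```

With no handler installed, the INFO record is filtered out before it reaches any output; once a handler is attached and the level is lowered, the same call produces a formatted line.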
