Scrapy: is there a way to write a JSON file without using the -o -t parameters?
Question:
I usually call my spider like this:
scrapy crawl Spider -o fileName -t json
and the correct data is written to the fileName file, formatted as JSON.
Now I want to call my spider like this:
scrapy crawl Spider
My question: is there a way to write the output to a file without using the -o and -t parameters?
Answers:
Yes, it can be done. Add this to your settings.py:
FEED_EXPORTERS = {
    # Scrapy 1.0+ path; before 1.0 this class lived in scrapy.contrib.exporter.
    'jsonlines': 'scrapy.exporters.JsonLinesItemExporter',
}
FEED_FORMAT = 'jsonlines'
FEED_URI = "NAME_OF_FILE.json"
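On Scrapy 2.1 and later, FEED_FORMAT and FEED_URI are deprecated in favor of a single FEEDS setting. A minimal sketch of the equivalent settings.py entry, assuming you want a plain JSON array rather than JSON lines (the "encoding" value is an illustrative choice, not something from the answer above):

```python
# settings.py (sketch for Scrapy 2.1+; the file name is a placeholder)
FEEDS = {
    # Key: where to write the feed. Value: options for that feed.
    "NAME_OF_FILE.json": {
        "format": "json",     # built-in exporter, no FEED_EXPORTERS entry needed
        "encoding": "utf8",   # assumption: write UTF-8 instead of escaped ASCII
    },
}
```

With this in place, a bare "scrapy crawl Spider" should write the items to NAME_OF_FILE.json without any -o or -t arguments.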
For reference
Here is how I did it in Scrapy 2.6.1:
def open_spider(self, spider: YellowpagesCategorySpiderSpider):
    # The keys of the FEEDS setting are the configured output file names.
    feeds = spider.settings.attributes['FEEDS'].value
    output_file_names = list(feeds)
    if len(output_file_names) > 1:
        raise RuntimeError(f"Only one output file is allowed, but {len(output_file_names)} were found")
    self.output_file_name = output_file_names[0]
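Note that the snippet above would hit an IndexError if no feed is configured at all. A rough sketch of the same check as a standalone function that also covers the empty case; single_feed_uri is a hypothetical helper name, and the plain dict stands in for the FEEDS setting (in a pipeline it could probably be obtained with spider.settings.getdict("FEEDS"), which is an assumption here):

```python
from typing import Any, Mapping


def single_feed_uri(feeds: Mapping[str, Any]) -> str:
    """Return the only feed URI in `feeds`; raise unless there is exactly one."""
    uris = list(feeds)
    if len(uris) != 1:
        raise RuntimeError(
            f"Exactly one output file is expected, but {len(uris)} were found"
        )
    return uris[0]


# Stand-in for the FEEDS setting (hypothetical values for illustration).
example_feeds = {"NAME_OF_FILE.json": {"format": "json"}}
print(single_feed_uri(example_feeds))
```

Keeping the check outside the pipeline class makes it easy to test without running a crawl.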