Scrape the feature image from this website but it returns this `data:image/gif

Question

I am using Scrapy and the Scrapy shell in Python to scrape the feature image from https://www.thrillist.com/travel/nation/all-the-ways-to-cool-off-in-austin, but the selector returns data:image/gif;base64,R0 instead of the image's src. How can I fix this so that I get the actual src of the image?

Here is my code:

Feature_Image = [i.strip() for i in response.xpath('//*[@id="main-content"]/article/div/div/div[2]/div[1]/picture/img/@src').getall()][0]

Asked By: Info Rewind

Answer 1

It looks like the img tag has a data-src attribute that holds the link along with some image parameters. Splitting that text on ";" and taking the first section gets you the link.

>>> link = response.xpath("//div[@data-element-type='ParagraphMainImage']//img/@data-src").get().split(";")[0]
>>> link
'https://assets3.thrillist.com/v1/image/3086882/414x310/crop'

You can manually add .jpg to the end if you want to be able to tell what type of image it is. The link works with and without the extension.
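The split-and-append step above can be wrapped in a small helper. This is only a sketch: the name clean_image_url and the sample string are assumptions for illustration (the sample is modeled on the parameter suffix shown in Answer 2, not copied from the page's actual data-src value).

```python
def clean_image_url(raw, extension=""):
    """Drop the ';'-separated parameters from an attribute value.

    `raw` is whatever .get() returned, so it may be None.
    `extension` is optional, e.g. ".jpg".
    """
    if not raw:
        return None
    base = raw.strip().split(";")[0]  # keep only the part before the first ';'
    if extension and not base.endswith(extension):
        base += extension
    return base


# Assumed sample value, for illustration only.
sample = (
    "https://assets3.thrillist.com/v1/image/3086882/414x310/crop"
    ";webp=auto;jpeg_quality=60;progressive.jpg"
)

print(clean_image_url(sample))
# https://assets3.thrillist.com/v1/image/3086882/414x310/crop
print(clean_image_url(sample, ".jpg"))
# https://assets3.thrillist.com/v1/image/3086882/414x310/crop.jpg
```

In the Scrapy shell you would pass it the result of the XPath above, for example clean_image_url(response.xpath("//div[@data-element-type='ParagraphMainImage']//img/@data-src").get()). Unlike calling .split() directly on .get(), this returns None when the attribute is missing instead of raising an AttributeError.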

Answered By: Alexander

Answer 2

The biggest image on that page is presumably the one marked for desktop, so why not locate its source directly, as below?

pic = response.xpath('//picture[@data-testid="picture-tag"]//source[@data-size="desktop"]/@srcset').get()

The result is the source URL for the largest version of that page's poster image:

https://assets3.thrillist.com/v1/image/3086882/1584x1056/crop;webp=auto;jpeg_quality=60;progressive.jpg
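The data:image/gif value in the question is most likely a lazy-loading placeholder sitting in src, which is why both answers read a different attribute. A minimal sketch of combining them follows, assuming you want the first candidate that is not an inline data: URI; the helper name first_real_url is hypothetical.

```python
def first_real_url(*candidates):
    """Return the first candidate that is a real URL.

    Skips None, empty strings, and inline 'data:' URIs such as the
    base64 GIF placeholder returned by the img tag's src attribute.
    """
    for value in candidates:
        if not value:
            continue
        value = value.strip()
        if value and not value.startswith("data:"):
            return value
    return None


# In the Scrapy shell the candidates would come from the selectors in
# the two answers, tried in order of preference, e.g.:
#
# pic = first_real_url(
#     response.xpath('//picture[@data-testid="picture-tag"]'
#                    '//source[@data-size="desktop"]/@srcset').get(),
#     response.xpath("//div[@data-element-type='ParagraphMainImage']"
#                    "//img/@data-src").get(),
# )

# Stand-alone demonstration with literal values:
print(first_real_url(
    "data:image/gif;base64,R0",
    None,
    "https://assets3.thrillist.com/v1/image/3086882/414x310/crop",
))
# https://assets3.thrillist.com/v1/image/3086882/414x310/crop
```

Ordering the candidates from largest to smallest means the desktop srcset is used when present, with data-src as a fallback if the page markup changes.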

Answered By: Barry the Platipus
