Beautiful Soup doesn't find element

Question:

A website I am trying to scrape has this HTML tag (I believe it's an A/B test, which is why I have two BS4 searches going at the same time).

<h2 data-testid="price">39 777 kr</h2>

I am trying to scrape the text inside this h2 tag, but it doesn’t seem to work.

I have tried find_all, select and find but to no avail.

This is the full implementation:

soup = BeautifulSoup(response.text, 'html.parser')
total_price = soup.body.find('span', attrs='u-t3')
total_price_alternative = soup.body.find('h2', attrs={'data-testid': 'price'})
if total_price is not None:
    main_price_info = {
      'title': 'Total price',
      'value': total_price.text.replace('\xa0', ' ')
    }
elif total_price_alternative is not None:
    main_price_info = {
      'title': 'Total price',
      'value': total_price_alternative.text.replace('\xa0', ' ')
    }
else:
    main_price_info = {
      'title': 'Total price',
      'value': 'Could not find price'
    }

URL to the site (It’s in Norwegian): https://www.finn.no/car/used/ad.html?finnkode=297865903

Asked By: Snasegh

Answers:

The price is stored as JSON inside a <script> element rather than in an <h2> tag, so Beautiful Soup's tag searches don't find it. You can pull out the script's contents and parse them with json.loads:

import json
import requests
from bs4 import BeautifulSoup

url = 'https://www.finn.no/car/used/ad.html?finnkode=297865903'

soup = BeautifulSoup(requests.get(url).content, 'html.parser')
data = json.loads(soup.select_one('#horseshoe-config').text)

price = data['xandr']['feed']['pris']
print(price)

Prints:

39777
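
If the page layout changes, the lookup above raises an AttributeError or KeyError. Here is a minimal, more defensive sketch of the same idea. The function name extract_price and the sample_html string are made up for illustration; the script id and the key path are the ones used above and are only assumed to still match the live page.

```python
import json

from bs4 import BeautifulSoup


def extract_price(html, script_id="horseshoe-config"):
    """Return the price from the embedded JSON, or None if it can't be found."""
    soup = BeautifulSoup(html, "html.parser")
    script = soup.find("script", id=script_id)
    # No such <script> element, or it is empty.
    if script is None or not script.string:
        return None
    try:
        data = json.loads(script.string)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Key path taken from the answer above; .get() returns a default
    # instead of raising KeyError when a key is missing.
    return data.get("xandr", {}).get("feed", {}).get("pris")


# Hypothetical stand-in for the real page, so this runs without a network call.
sample_html = """
<html><body>
<script id="horseshoe-config" type="application/json">
{"xandr": {"feed": {"pris": 39777}}}
</script>
</body></html>
"""

print(extract_price(sample_html))
print(extract_price("<html><body></body></html>"))
```

With the real site you would pass requests.get(url).text as html and treat a None result as "could not find price".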

Answered By: Andrej Kesely