Element not located with BeautifulSoup
Question:
I want to get the overview part of the webpage, but I get an empty list in return. Where am I making a mistake? I tried the same with Selenium, and it raised an error:
no such element: Unable to locate element
My code. It returns the title and description, but not the overview.
```python
import requests
from bs4 import BeautifulSoup

# headers is not shown in the question; a typical User-Agent is assumed here
headers = {'User-Agent': 'Mozilla/5.0'}

def next_page(link):
    page_response = requests.get('https://www.nilfisk.com/' + link, headers=headers)
    page_soup = BeautifulSoup(page_response.text, 'html.parser')
    title = page_soup.select_one('h1.product__name-container').text
    description = page_soup.select_one('#productContentDescriptionContainerId p').text
    print(title, ': ', description)
    overview = page_soup.find_all('div', class_='inner-html')  # returns an empty list
    print(title, ': ', str(overview))

def main():
    url = 'https://www.nilfisk.com/en-au/search/?q=13300148'
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    product = soup.select_one('div.sitesearchresult__row div.col-8 a').attrs['href']
    next_page(product)

if __name__ == '__main__':
    main()
```
Answer:
The content is rendered dynamically by JavaScript, so you either have to use Selenium with explicit waits, or you can extract the data from the `<script>` tag, converting its string payload to JSON and picking the value by key:

```python
description = BeautifulSoup(json.loads('{' + page_soup.select_one('[id^="ProductDescription"] + script').text.split('",{')[-1].split(',true')[0])['descriptionContent'], 'html.parser').text
```
Because the question is missing details and the expected output, this only points in a direction.
Example
```python
import requests, json
from bs4 import BeautifulSoup

def next_page(link):
    page_response = requests.get('https://www.nilfisk.com' + link)
    page_soup = BeautifulSoup(page_response.text, 'html.parser')
    title = page_soup.select_one('h1.product__name-container').text
    # The JSON payload sits in the <script> tag directly after the
    # ProductDescription container; slice out the object and parse it.
    description = BeautifulSoup(
        json.loads(
            '{' + page_soup.select_one('[id^="ProductDescription"] + script')
                           .text.split('",{')[-1].split(',true')[0]
        )['descriptionContent'],
        'html.parser'
    ).text
    print(title, ': ', description)

def main():
    url = 'https://www.nilfisk.com/en-au/search/?q=13300148'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    product = soup.select_one('div.sitesearchresult__row div.col-8 a').attrs['href']
    next_page(product)

if __name__ == '__main__':
    main()
```
Output
SR1601 D Maxi :
SR1601 is claimed to be the most technically advanced sweeper in its category, with innovations that increase cleaning productivity whilst simplifying operating and servicing requirements. Designed for industrial applications such as heavily soiled factories, construction sites, car parks etc
The dust control is outstanding and unbeatable in the industry. The DustGuardTM system further reduces airborne particles. Less dust means less cleaning elsewhere. The productivity is excellent, thanks to a huge 900 mm main broom, a 1600 mm wide sweeping path, and all around reliability for getting the job done quickly and efficiently.
Available in Diesel, LPG and Battery Models.
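The string-slicing trick above can be tried offline. Here is a self-contained sketch on a made-up snippet; the `render(...)` call, the element id, and the `descriptionContent` key only mimic what the live page is assumed to embed:

```python
import json
from bs4 import BeautifulSoup

# Hypothetical HTML resembling the live page: a container div followed by a
# <script> whose call takes a string argument, a JSON options object, and true.
html = '''
<div id="ProductDescription1"></div>
<script>render("ProductDescription1",{"descriptionContent":"<p>SR1601 sweeper overview.</p>"},true)</script>
'''

soup = BeautifulSoup(html, 'html.parser')
raw = soup.select_one('[id^="ProductDescription"] + script').text

# Everything between the '{' that opens the options object and ',true' is
# valid JSON; re-add the '{' that the split consumed, then parse it.
payload = json.loads('{' + raw.split('",{')[-1].split(',true')[0])

# The extracted value is itself an HTML fragment, so parse it again for text.
description = BeautifulSoup(payload['descriptionContent'], 'html.parser').text
print(description)  # SR1601 sweeper overview.
```

Running the same slicing against the real page only works as long as the site keeps embedding the data in that exact `(...,{...},true)` shape, so treat the selectors and delimiters as fragile.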