Python BeautifulSoup soup.find

Question

I want to scrape some specific data from a website using urllib and BeautifulSoup.
I'm trying to fetch the text "190.0 kg". As you can see in my code, I have tried attrs={'class': 'col-md-7'},
but this returns the wrong result. Is there any way to specify that I want the text inside the <h3> tag?

from urllib.request import urlopen
from bs4 import BeautifulSoup

# specify the url
quote_page = 'https://.com'

# query the website and return the html to the variable 'page'
page = urlopen(quote_page)

# parse the html using beautiful soup
soup = BeautifulSoup(page, 'html.parser')

# find the first <div> with class 'col-md-7' and get its text
weight_box = soup.find('div', attrs={'class': 'col-md-7'})

weight = weight_box.text.strip()
print(weight)

Asked By: laks


Answer 1

Since this content is generated dynamically, it is not part of the HTML that urllib (or the requests module) downloads, so it cannot be read from that response.
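To illustrate the point, here is a minimal sketch using made-up markup (the real page's HTML is not shown in this thread, so the snippet below is an assumption). It mimics what a plain HTTP download might contain: the container exists, but the values are only added later by a script, so BeautifulSoup finds no <h3> to read.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, as a plain HTTP download might return it: the
# container is present, but a script fills it in only after the page loads.
static_html = """
<div id="current_lifter"></div>
<script>/* JavaScript would insert the rows and values here */</script>
"""

soup = BeautifulSoup(static_html, "html.parser")
container = soup.find("div", {"id": "current_lifter"})

# The container is found, but it has no <h3> children yet.
h3_tags = container.find_all("h3")
print(len(h3_tags))
```

If a search like this comes back empty on the downloaded HTML while the value is visible in the browser, that is a strong hint the content is rendered client-side.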

You can use Selenium WebDriver to accomplish this:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup

# run Chrome without opening a browser window
chrome_options = Options()
chrome_options.add_argument("--headless")

chrome_driver = "path_to_chromedriver"

# Selenium 4 takes the driver path via Service and the options via `options`
driver = webdriver.Chrome(service=Service(executable_path=chrome_driver), options=chrome_options)
driver.get('https://styrkeloft.no/live.styrkeloft.no/v2/?test-stevne')
html = driver.page_source
driver.quit()

# parse the rendered page and read the <h3> in the third row of #current_lifter
soup = BeautifulSoup(html, "lxml")
current_lifter = soup.find("div", {"id": "current_lifter"})
value = current_lifter.find_all("div", {'class': 'row'})[2].find_all("h3")[0].text

print(value)

Just make sure the chromedriver executable is available on your machine.

Answered By: drec4s
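The lookup chain in the answer can be tried without a browser on a small HTML string. The markup below is hypothetical (the real page's structure may differ, and the lifter and club names are placeholders); it only sketches why a class-only search returns the wrong element and how narrowing by id, row, and <h3> reaches "190.0 kg".

```python
from bs4 import BeautifulSoup

# Hypothetical rendered markup; the real page's structure may differ.
rendered_html = """
<div id="current_lifter">
  <div class="row"><div class="col-md-7"><h3>Example Lifter</h3></div></div>
  <div class="row"><div class="col-md-7"><h3>Example Club</h3></div></div>
  <div class="row"><div class="col-md-7"><h3>190.0 kg</h3></div></div>
</div>
"""

soup = BeautifulSoup(rendered_html, "html.parser")

# find() returns only the first match, so the class alone picks the first block.
first_match = soup.find("div", attrs={"class": "col-md-7"})
first_text = first_match.h3.get_text(strip=True)

# Narrow the search: container by id, then the third row, then its <h3>.
container = soup.find("div", {"id": "current_lifter"})
rows = container.find_all("div", {"class": "row"})
weight = rows[2].find("h3").get_text(strip=True)

print(first_text)
print(weight)
```

Indexing by row position is brittle if the page layout changes, so it is worth checking the rendered HTML in the browser's developer tools before relying on a fixed index.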
