beautifulsoup | Page 2

Why does the break statement not work while scraping reviews with Selenium and BeautifulSoup in Python?

Question: I am scraping reviews with Selenium and BeautifulSoup in Python, but the break statement does not work, so the while loop continues even after reaching the last review page of a product. From what I understand …

Total answers: 1

How to get the "title" from the <span>

Question: How can I get the title "Product Manager" from the code below? <div class="new_job_name" data-v-99ef4628=""> <span data-v-99ef4628="">Product Manager</span> </div> If the title were in a title attribute of the <div class="new_job_name">, I could get it using the code below: soup.find(class_="new_job_name").attrs["title"] Now I have no idea how to get the …

Total answers: 1
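
For the <span> question above, a minimal sketch of one possible approach, assuming the markup is exactly the snippet quoted in the question and is present in the HTML that was fetched. The variable names are illustrative only. Because "Product Manager" is the text of the <span> and not a title attribute, the sketch reads the text with get_text() instead of attrs["title"].

```python
from bs4 import BeautifulSoup

# Markup copied from the question excerpt.
html = (
    '<div class="new_job_name" data-v-99ef4628=""> '
    '<span data-v-99ef4628="">Product Manager</span> </div>'
)

soup = BeautifulSoup(html, "html.parser")

# Locate the wrapper div first, then the span nested inside it.
div = soup.find(class_="new_job_name")
span = div.find("span") if div is not None else None

# The job title is the span's text content, not an attribute.
job_title = span.get_text(strip=True) if span is not None else None
print(job_title)
```

The None checks keep the sketch from raising AttributeError if the page layout differs from the excerpt.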

Beautiful Soup doesn't find element

Question: A website I am trying to scrape has this HTML tag (I believe it's an A/B test, which is why I have two BS4 searches going at the same time). <h2 data-testid="price">39 777 kr</h2> I am trying to scrape the text inside this h2 tag, but it doesn’t seem …

Total answers: 1
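
For the data-testid question above, a hedged sketch of matching on the attribute. It assumes the <h2> from the excerpt is actually present in the HTML handed to BeautifulSoup; if the site injects it with JavaScript, the search would return None no matter which selector is used. Variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Markup copied from the question excerpt.
html = '<h2 data-testid="price">39 777 kr</h2>'

soup = BeautifulSoup(html, "html.parser")

# Attribute names containing a hyphen cannot be passed as keyword
# arguments, so they go in the attrs dictionary.
price_tag = soup.find("h2", attrs={"data-testid": "price"})

# Equivalent lookup with a CSS attribute selector.
price_tag_css = soup.select_one('h2[data-testid="price"]')

price_text = price_tag.get_text(strip=True) if price_tag is not None else None
print(price_text)
```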

Removing top header from Dataframe

Question: I would like to read the table from the following page: Countries by GDP. I tried the pandas read_html command and got the following result: import requests from bs4 import BeautifulSoup import pandas as pd data = pd.DataFrame(pd.read_html("https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)")[2]) print(data.head()) Country/Territory UN Region … United Nations[15] Country/Territory UN Region … Estimate Year …

Total answers: 2
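
For the DataFrame question above, a sketch of dropping the top header row. The printed output in the excerpt suggests that read_html produced two-level (MultiIndex) columns, which is assumed here. To stay runnable offline, the sketch builds a small stand-in frame with the same column labels; the row values are invented placeholders, not data from the page.

```python
import pandas as pd

# Stand-in for the table returned by pd.read_html: two header levels,
# with labels taken from the question's printed output.
columns = pd.MultiIndex.from_tuples(
    [
        ("Country/Territory", "Country/Territory"),
        ("UN Region", "UN Region"),
        ("United Nations[15]", "Estimate"),
        ("United Nations[15]", "Year"),
    ]
)
# Placeholder row, purely illustrative.
data = pd.DataFrame([["Exampleland", "Europe", 100, 2021]], columns=columns)

# Drop the outer (top) level so only the lower header row remains.
flat = data.copy()
flat.columns = flat.columns.droplevel(0)
print(flat.columns.tolist())
```

Applied to the real table, the same droplevel(0) call would go on the frame returned by read_html, provided its columns are in fact a MultiIndex.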

Python: BeautifulSoup select_one cannot find the tag

Question: English is my second language, so please excuse my poor English. The following is simple code that gets tag info using requests and bs4. The problem is that this code returns None. import requests from bs4 import BeautifulSoup url = 'http://ch1.skbroadband.com/content/view?parent_no=24&content_no=57&p_no=154494' web = requests.get(url,headers={'User-Agent':'Mozilla/5.0'}) source …

Total answers: 1

Can't get texts out of a few dd tags that started after a certain dt tag

Question: I’m trying to get the text out of dd tags located between two dt tags. I’m interested in the text within the dd tags that come after the dt tag containing Bransje, up to the next dt tag. The next …

Total answers: 1
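
For the dt/dd question above, a sketch of collecting the dd siblings that follow a given dt and stopping at the next dt. The excerpt does not show the page's markup, so the HTML below is a hypothetical definition list made up for illustration; only the label "Bransje" comes from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page structure is not shown in the excerpt.
html = """
<dl>
  <dt>Sektor</dt><dd>Privat</dd>
  <dt>Bransje</dt><dd>IT</dd><dd>Konsulent</dd>
  <dt>Stillingsfunksjon</dt><dd>Utvikler</dd>
</dl>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the dt whose text contains "Bransje".
start = soup.find("dt", string=lambda s: s is not None and "Bransje" in s)

values = []
if start is not None:
    # Walk the following sibling tags in document order.
    for sibling in start.find_next_siblings():
        if sibling.name == "dt":
            break  # the next dt ends the group
        if sibling.name == "dd":
            values.append(sibling.get_text(strip=True))

print(values)
```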

Web Scraping a Text Using Python Gives Empty Output

Question: I’m trying to get the affiliation text from this link: https://www.sciencedirect.com/science/article/abs/pii/S001191642300142X These are the elements I work on: <dl class="affiliation"><dt><sup>a</sup></dt><dd>Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo, 21, 00128 Rome, Italy</dd></dl> <dl class="affiliation"><dt><sup>b</sup></dt><dd>Department of Chemical Sciences, University of Naples Federico II, …

Total answers: 1
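
For the affiliation question above, a sketch of extracting the dd text with a CSS selector, using only the first, complete <dl> from the excerpt. It assumes these elements are present in the HTML that was downloaded; an empty result against the live page could mean the content is loaded by JavaScript or the request was blocked, which this sketch does not address.

```python
from bs4 import BeautifulSoup

# First affiliation element, copied from the question excerpt.
html = (
    '<dl class="affiliation"><dt><sup>a</sup></dt>'
    "<dd>Department of Engineering, Università Campus Bio-Medico di Roma, "
    "Via Alvaro del Portillo, 21, 00128 Rome, Italy</dd></dl>"
)

soup = BeautifulSoup(html, "html.parser")

# Select every dd nested inside a dl with class "affiliation".
affiliations = [dd.get_text(strip=True) for dd in soup.select("dl.affiliation dd")]
print(affiliations)
```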

I can't find the correct tags to scrape the class name, code, and description (description is via link)

Question: I’m brand new to scraping. I’m trying to scrape the class code, name, and description from this website: URL = https://catalog.registrar.ucla.edu/search?parentAcademicOrg=7e561ea0db6fa0107f1572f5f39619b1&ct=subject No matter what I set my divs = soup.find_all() to, nothing seems to print (eventually I …

Total answers: 1

BeautifulSoup not extracting an <a> tag from a dynamic website

Question: I have a website, https://dip.bundestag.de/aktivit%C3%A4t/Dr–Holger-Becker-MdB-SPD/1628877, and I want to extract the HTML connected to "BT-Plenarprotokoll 20/86, S. 10313C". The HTML chunk is: <a title="PDF Bundestags-Plenarprotokoll öffnen" aria-label="BT-Plenarprotokoll" href="https://dserver.bundestag.de/btp/20/20086.pdf#P.10313" target="_self" class="hsbfb4-0 sc-1xaeas4-1 hTYfHF FZiNn"><svg viewBox="0 0 10 12" class="sc-1c5ggr5-17 cYBAUx"><g stroke="currentColor" fill="none" fill-rule="evenodd"><path d="M6.14.5H.5v11h9V3.86z"></path><path d="M5.56 …

Total answers: 1

Using BeautifulSoup in a BeeWare app gives `ModuleNotFoundError: No module named 'bs4'`

Question: I am trying to import beautifulsoup4 in a BeeWare app (with its own virtual environment) using the command: from bs4 import BeautifulSoup But I get a ModuleNotFoundError: No module named 'bs4' even though I have installed beautifulsoup4 in my virtual environment and added …

Total answers: 1