beautifulsoup

'NoneType' object is not subscriptable: bs4 task fails permanently

Question: Update: I tried Driftr95's scripts in Google Colab and have some questions. The scripts failed and I was not successful. At the beginning of the scripts I noticed that some lines are commented out. Why is this so? …

Total answers: 2
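
The error in the title above typically appears when find() returns None and the result is then indexed. A minimal sketch of a guard, assuming hypothetical markup (the original scripts are not shown in this listing):

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second card has no link, so find("a") returns None.
html = """
<div class="card"><a href="/first">First</a></div>
<div class="card"><span>No link here</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

hrefs = []
for card in soup.find_all("div", class_="card"):
    link = card.find("a")
    # link["href"] would raise TypeError when link is None, so check first.
    hrefs.append(link.get("href") if link is not None else None)

print(hrefs)
```

Checking for None before subscripting lets a scraping task skip incomplete elements instead of failing on the first one.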

Multiple H3 tags – but only need a specific one with web scraping

Question: How do I target the text within a specific H3 tag if there are multiple H3 tags? I’m currently trying the below code but it only returns the first H3 tag with the string "1" instead of the second one with the …

Total answers: 1
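
One way to pick a specific H3 among several is to select by position with find_all(), or by exact text with the string argument. This is a sketch on hypothetical markup, since the asker's page and code are not included in this listing:

```python
from bs4 import BeautifulSoup

# Hypothetical markup with several sibling H3 tags.
html = """
<div>
  <h3>1</h3>
  <h3>2</h3>
  <h3>3</h3>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Option 1: select by position (index 1 is the second H3).
headings = soup.find_all("h3")
second_by_position = headings[1].get_text(strip=True) if len(headings) > 1 else None

# Option 2: select by exact text content.
match = soup.find("h3", string="2")
second_by_text = match.get_text(strip=True) if match is not None else None

print(second_by_position, second_by_text)
```

Selecting by position is brittle if the page layout changes; matching on text or on a surrounding container is usually more robust when such a hook exists.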

Get ALL strings from html

Question: I send GET requests to different sites. In response I get HTML pages. How can I get only the strings from an HTML page? I mean all strings in general (the ones colored white in my screenshot). I understand how I can get "div", "code", "a", and other tags. But …

Total answers: 1
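
A possible approach to collecting every visible string is the stripped_strings generator, after dropping script and style elements. The markup below is a hypothetical stand-in for a fetched page:

```python
from bs4 import BeautifulSoup

# Hypothetical page content standing in for a real HTTP response body.
html = """
<html><head><title>Demo</title><style>p { color: red; }</style></head>
<body><div>Hello <a href="#">world</a></div><code>x = 1</code>
<script>console.log("skip me");</script></body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Remove elements whose text is code or styling rather than page content.
for tag in soup(["script", "style"]):
    tag.decompose()

# stripped_strings yields each text node with surrounding whitespace removed
# and skips nodes that are whitespace only.
strings = list(soup.stripped_strings)
print(strings)
```

If a single block of text is preferred over a list, soup.get_text(separator=" ", strip=True) is an alternative.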

How to use BeautifulSoup to find "Description" by div class_=css-gz8dae?

Question: I am new to Python, which I am learning for scraping purposes. I am using BeautifulSoup to collect descriptions from job offers at: https://justjoin.it/offers/itds-net-fullstack-developer-angular On another site with job offers, using the same code with different div classes, I can find what I need. …

Total answers: 2

How to print text and certain specified tags of XML file using BeautifulSoup

Question: I’m parsing the XML of a Microsoft Word .docx file with BeautifulSoup. I’d like to be able to extract the text of the XML file while still printing certain tags that I choose. I can get the text of the file …

Total answers: 1

Is there a really simple method for printing scraped output to a csv file?

Question: Python: Python 3.11.2; Python Editor: PyCharm 2022.3.3 (Community Edition), Build PC-223.8836.43; OS: Windows 11 Pro, 22H2, 22621.1413; Browser: Chrome 111.0.5563.65 (Official Build) (64-bit). I have a URL (e.g., https://dockets.justia.com/docket/puerto-rico/prdce/3:2023cv01127/175963) from which I’m scraping nine items. I’m looking to have …

Total answers: 3

Scraping content from what appear to be identical HTML elements

Question: Python: Python 3.11.2; Python Editor: PyCharm 2022.3.3 (Community Edition), Build PC-223.8836.43; OS: Windows 11 Pro, 22H2, 22621.1413; Browser: Chrome 111.0.5563.65 (Official Build) (64-bit). I’m looking at the following URL (https://dockets.justia.com/docket/puerto-rico/prdce/3:2023cv01127/175963), from which I’m attempting to scrape data from class elements that …

Total answers: 1

Python: Scrape href from td – can't get it to work correctly

Question: I’m very new to Python and have gone through previous questions on SO but could not solve it. Here is my code: import requests import pandas as pd from bs4 import BeautifulSoup from urllib.parse import urlparse url = "https://en.wikipedia.org/wiki/List_of_curling_clubs_in_the_United_States" data = requests.get(url).text …

Total answers: 1
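
A minimal sketch of pulling href values out of table cells, assuming a simplified, hypothetical table in place of the live Wikipedia page (the rest of the asker's code is truncated above). The CSS selector "td a[href]" matches only links that sit inside a cell and carry an href attribute:

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

base_url = "https://en.wikipedia.org/wiki/List_of_curling_clubs_in_the_United_States"

# Hypothetical table rows; the real page structure may differ.
html = """
<table>
  <tr><td><a href="/wiki/Example_Club">Example Club</a></td><td>Alaska</td></tr>
  <tr><td>No link</td><td>Texas</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Resolve relative hrefs against the page URL so the links are usable as-is.
links = [urljoin(base_url, a["href"]) for a in soup.select("td a[href]")]
print(links)
```

Cells without a link are skipped by the selector, so no None check is needed in this variant.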

Parse a table from wikipedia that is hidden

Question: I’m pretty new here. I want to parse a table from Wikipedia at the following link: https://en.wikipedia.org/wiki/MIUI I was able to parse the first table, but I can’t figure out how to get the information from the second table there, the information that contains "version history" of …

Total answers: 1