beautifulsoup

BS: Extract all text between two specified keywords

Question: With Python and BS, I need to extract all text contained between two specified words: blabla text i need blibli. I succeeded in extracting text inside DIV and TAG, but not between specific and different keywords. Thank you for your help. Asked By: steve figueras || Source …

Total answers: 4
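
One possible approach to the question above, as a minimal sketch rather than any of the posted answers: once the page text is available as a plain string (for example from BeautifulSoup's `soup.get_text()`), a non-greedy regular expression can capture whatever sits between the two keywords. The helper name `text_between` and the sample string are illustrative assumptions.

```python
import re


def text_between(text, start, end):
    """Return every stretch of text found between `start` and `end`.

    Hypothetical helper: it works on a plain string, so any HTML is
    assumed to have been reduced to text beforehand.
    """
    # Escape the keywords so they are matched literally, and use a
    # non-greedy group so each match stops at the nearest `end`.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    # DOTALL lets the captured text span several lines.
    return [m.strip() for m in re.findall(pattern, text, flags=re.DOTALL)]


sample = "blabla text i need blibli"
print(text_between(sample, "blabla", "blibli"))  # ['text i need']
```

If the keywords never appear in the right order, the function returns an empty list instead of raising an error.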

Remove all text from a html node using regex

Question: Is it possible to remove all text from HTML nodes with a regex? This very simple case seems to work just fine: import htmlmin html = """ <li class="menu-item"> <p class="menu-item__heading">Totopos</p> <p>Chips and molcajete salsa</p> <p class="menu-item__details menu-item__details--price"> <strong> <span class="menu-item__currency"> $ </span> 4 </strong> …

Total answers: 2

When using BeautifulSoup to scrape Ebay, I get the wrong price for a listing

Question: I am using BeautifulSoup to scrape the eBay site, however I got a different price than from the actual eBay page. I want to search for multiple different products later on and get the price for each of them, that’s …

Total answers: 1

How to extract links from a website in python?

Question: I am trying to webscrape the below website. As a first step, I would like to get the links from which to extract the text. However, when I do the following, I get an empty list: import pandas as pd from bs4 import BeautifulSoup url …

Total answers: 2

Cannot locate text within using Python

Question: Hi all, I am scraping questions on Amazon using the following code: url = "https://www.amazon.com/ask/questions/asin/B0000CFLYJ/1/ref=ask_ql_psf_ql_hza?isAnswered=true" r = requests.get("http://localhost:8050/render.html", params={'url': url, 'wait': 3}) soup = BeautifulSoup(r.text, 'html.parser') questions = soup.find_all('div', {'class': 'a-fixed-left-grid-col a-col-right'}) print(questions) question_list = [] for item in questions: question = item.find('a', {'class': 'a-link-normal'}).text.strip() question_list.append(question) But I keep …

Total answers: 2

Attempting to download a .csv via a link within a page w/ python

Question: I’m attempting to download and use data contained in a .csv file. I’m pretty much an amateur at scraping and most things coding-related, and would appreciate any help. Is it possible I’m being blocked from pulling the link from the website? …

Total answers: 1

How to programmatically get the link to a CSV behind a JavaScript page?

Question: I’m using Python and I’m trying to get the link that the CSV comes from when I click on the DATA V CSV button at the bottom of this page. I tried beautifulsoup: import requests from bs4 import BeautifulSoup url = 'https://www.ceps.cz/en/all-data#AktualniSystemovaOdchylkaCR' response …

Total answers: 2

Why can't beautifulsoup detect this table from this website?

Question: I tried to scrape the table from this website, "https://racing.hkjc.com/racing/information/English/Jockey/JockeyRanking.aspx", into an Excel sheet with beautifulsoup and pandas. This is my code. from bs4 import BeautifulSoup import pandas as pd # Send a GET request to the URL url = "https://racing.hkjc.com/racing/information/English/Jockey/JockeyRanking.aspx" response = requests.get(url) # …

Total answers: 2

Accessing tabbed elements with BeautifulSoup

Question: I want to scrape all the currency tabs under Currency pairs on this site: https://markets.businessinsider.com/currencies. However, I can only get the currency pairs that are active by default when I load the page, which is US-Dollar. How can I access the rest of the items with BeautifulSoup? Using the …

Total answers: 1

Python BeautifulSoup issue in extracting direct text in a given html tag

Question: I am trying to extract direct text in a given HTML tag. Simply, for <p> Hello! </p>, the direct text is Hello!. The code works well except with the case below. from bs4 import BeautifulSoup soup = BeautifulSoup('<div> <i> </i> FF Services …

Total answers: 2