Python BeautifulSoup cannot read data from div tag

Question:

I am trying to read data from this div tag on a website.

<div class="Bgc($lv2BgColor) Bxz(bb) Ovx(a) Pos(r) Maw($newGridWidth) Miw($minGridWidth) Miw(a)!--tab768 Miw(a)!--tab1024 Mstart(a) Mend(a) Px(20px) Py(10px) D(n)--print">


from bs4 import BeautifulSoup
import requests
import re
from urllib.request import urlopen

url = "https://finance.yahoo.com/"

urlpage=urlopen(url).read()
bswebpage=BeautifulSoup(urlpage)

t = bswebpage.find_all("div",{'class':"Bgc($lv2BgColor) Bxz(bb) Ovx(a) Pos(r) Maw($newGridWidth) Miw($minGridWidth) Miw(a)!--tab768 Miw(a)!--tab1024 Mstart(a) Mend(a) Px(20px) Py(10px) D(n)--print"})


print(t)

I use find_all with BeautifulSoup, but the output doesn't show anything. It shows only this:

[]

How can I fix it?

Asked By: user572575


Answers:

You could get the parent of that div instead, since it has an id, which is unique by design. Then, since that parent has only one child, the element you're looking for, it's as simple as getting that child:

t = bswebpage.find("div",{'id': 'Lead-3-FinanceHeader-Proxy'}).div
print(t)
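
To try the parent-by-id idea without hitting the live site, here is a minimal offline sketch. The SAMPLE_HTML string is a made-up, simplified stand-in for the real page, and first_child_div is a hypothetical helper name; only the id comes from the snippet above. It also returns None when the parent is missing, instead of raising AttributeError on `.div`.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the page markup (not the real page).
SAMPLE_HTML = """
<div id="Lead-3-FinanceHeader-Proxy">
  <div class="Bgc($lv2BgColor) Bxz(bb) Pos(r)">
    <span>Market summary</span>
  </div>
</div>
"""


def first_child_div(html, parent_id):
    """Return the first div nested inside the div with the given id, or None."""
    soup = BeautifulSoup(html, "html.parser")
    parent = soup.find("div", {"id": parent_id})
    if parent is None:
        # The id is not in the HTML we received (for example, the server
        # sent different markup than the browser shows).
        return None
    # Tag.div is shorthand for parent.find("div"): the first nested div.
    return parent.div


child = first_child_div(SAMPLE_HTML, "Lead-3-FinanceHeader-Proxy")
print(child.get_text(strip=True))
```

If the helper returns None on the real page, the id probably isn't present in the HTML that was actually downloaded, which would be worth checking before adjusting the selector.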
Answered By: TopchetoEU

Most likely, urlopen isn't returning the page as expected here, and the element selection is also slightly off. The solution below works:

from bs4 import BeautifulSoup
import requests
url = "https://finance.yahoo.com/"
res = requests.get(url)
#print(res)
bswebpage=BeautifulSoup(res.text,'lxml')
t = [x.get_text(' ',strip=True) for x in bswebpage.select('div[class="Carousel-Mask Pos(r) Ov(h) market-summary M(0) Pos(r) Ov(h) D(ib) Va(t)"] > ul > li h3')]
print(t)

Output:

['S&P 500 4,085.17 -32.69 (-0.79%)', 'Dow 30 33,706.91 -242.10 (-0.71%)', 'Nasdaq 11,799.67 -110.85 (-0.93%)', 'Russell 2000 1,918.40 -24.20 (-1.25%)', 'Crude Oil 77.79 -0.68 (-0.87%)', 'Gold 1,873.10 -17.60 (-0.93%)']
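
The selector above matches the full class string exactly, so it stops matching if the site adds, removes, or reorders a single class. As a rough sketch of a less brittle variant, you can match on one class token instead. The SAMPLE_HTML below is invented for illustration (the real page markup will differ), and market_headlines is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup that mimics the carousel structure.
SAMPLE_HTML = """
<div class="Carousel-Mask Pos(r) Ov(h) market-summary">
  <ul>
    <li><h3><a>S&amp;P 500</a> <span>4,085.17</span></h3></li>
    <li><h3><a>Dow 30</a> <span>33,706.91</span></h3></li>
  </ul>
</div>
"""


def market_headlines(html):
    """Collect the text of each h3 inside the market-summary carousel."""
    soup = BeautifulSoup(html, "html.parser")
    # "div.market-summary" matches any div whose class list contains
    # market-summary, regardless of the other classes or their order.
    headings = soup.select("div.market-summary > ul > li h3")
    return [h3.get_text(" ", strip=True) for h3 in headings]


print(market_headlines(SAMPLE_HTML))
# prints ['S&P 500 4,085.17', 'Dow 30 33,706.91']
```

Whether market-summary is a stable class on the live site is an assumption; inspect the downloaded HTML (res.text) to pick a token that is actually there.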
Answered By: Md. Fazlul Hoque