Python BeautifulSoup cannot find text in webpage
Question:
I am trying to read the text Hello World from the website https://www.w3schools.com/python/default.asp using BeautifulSoup with this code:
from bs4 import BeautifulSoup
import requests
url = "https://www.w3schools.com/python/default.asp"
res = requests.get(url)
res.encoding = "utf-8"
soup = BeautifulSoup(res.text, 'html.parser')
print(soup.prettify())
I printed the result of soup.prettify() and checked it, but I could not find the text Hello World. How can I read the text Hello World using BeautifulSoup?
Answers:
The page has text that contains "Hello World", but no text node that matches "Hello World" exactly. So search with a regex pattern instead:
from bs4 import BeautifulSoup
import requests
import re
url = "https://www.w3schools.com/python/default.asp"
res = requests.get(url)
res.encoding = "utf-8"
soup = BeautifulSoup(res.text, 'html.parser')
hello_world = soup.find_all(string=re.compile(r'.*Hello.\s*World.*'))
# If any matching strings were found, print them
if hello_world:
    print(hello_world)
else:
    print("Text not found")
Output:
['\nprint("Hello, World!")\n', 'Insert the missing part of the code below to output "Hello World".', '("Hello World")\n']
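To see why the exact string fails while the pattern succeeds, here is a minimal offline sketch. The SAMPLE_HTML snippet is a made-up stand-in for the real page (an assumption, not copied from w3schools), and the pattern is a slightly simplified variant of the one in the answer. It also shows one way to get from a matched string back to the tag that contains it.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical snippet standing in for the real page (not the actual w3schools markup).
SAMPLE_HTML = """
<div class="example">
  <pre>print("Hello, World!")</pre>
  <p>Insert the missing part of the code below to output "Hello World".</p>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A plain string has to equal the whole text node, so nothing is found here.
exact = soup.find_all(string="Hello World")

# A compiled pattern only needs to match somewhere inside the text node.
pattern = re.compile(r"Hello,?\s*World")
partial = soup.find_all(string=pattern)

# Each result is a NavigableString; .parent is the tag that holds it.
parent_tags = [text.parent.name for text in partial]

print(len(exact))
print([str(text) for text in partial])
print(parent_tags)
```

Because BeautifulSoup looks for the pattern anywhere inside each text node, the leading and trailing .* in the answer's regex should not be needed; they are harmless, though.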