Python BeautifulSoup cannot find text in webpage

Question:

I am trying to read the text Hello World from the website https://www.w3schools.com/python/default.asp using BeautifulSoup with this code:

from bs4 import BeautifulSoup
import requests

url = "https://www.w3schools.com/python/default.asp"

res = requests.get(url)
res.encoding = "utf-8"
    
soup = BeautifulSoup(res.text, 'html.parser')
print(soup.prettify())

I printed the output of soup.prettify() and checked it, but it does not contain the text Hello World. How can I read the text Hello World using BeautifulSoup?

Asked By: user572575


Answers:

The page has text that contains "Hello World", but no text node that matches "Hello World" exactly. So search with a regex pattern instead:

from bs4 import BeautifulSoup
import requests
import re

url = "https://www.w3schools.com/python/default.asp"

res = requests.get(url)
res.encoding = "utf-8"
    
soup = BeautifulSoup(res.text, 'html.parser')

# Match any text node containing "Hello", one character, optional whitespace, then "World"
hello_world = soup.find_all(string=re.compile(r'.*Hello.\s*World.*'))

# If any matching strings were found, print the list
if hello_world:
    print(hello_world)
else:
    print("Text not found")

Output:

['\nprint("Hello, World!")\n', 'Insert the missing part of the code below to output "Hello World".', '("Hello World")\n']
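
To see why an exact string fails where a pattern succeeds, here is a minimal offline sketch. The inline HTML is a made-up stand-in for the downloaded page (an assumption, not the real w3schools markup), so no network request is needed. It relies on find_all(string=...) comparing a plain string against the whole text node, while a compiled pattern is applied as a search inside each node.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page content
html = """
<div>
  <pre>print("Hello, World!")</pre>
  <p>Insert the missing part of the code below to output "Hello World".</p>
  <p>Goodbye</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A plain string must equal an entire text node, so nothing matches here
exact = soup.find_all(string="Hello World")

# A compiled pattern is searched within each text node, so longer text matches
pattern = re.compile(r"Hello.\s*World")
partial = soup.find_all(string=pattern)

print(exact)
print(partial)
```

With this sample the exact search should come back empty, while the pattern search should return the two strings that contain the phrase. The leading and trailing .* in the pattern above are probably not needed, because the pattern is searched rather than anchored to the start of the string.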
Answered By: lex