BeautifulSoup: how to get the text while ignoring the elements
Question:
Is it possible to extract only the text from the following structure:
"""<font>
<em>X</em>
and
<em>Y</em>
</font>"""
to obtain the following output:
output = "X and Y"
Answers:
Try get_text() with strip=True and a space as the separator:
from bs4 import BeautifulSoup
html_doc = """
<font>
<em>X</em>
and
<em>Y</em>
</font>"""
soup = BeautifulSoup(html_doc, "html.parser")
# strip=True trims whitespace around each text node; separator=" " joins the pieces with a space
out = soup.find("font").get_text(strip=True, separator=" ")
print(out)
Prints:
X and Y
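To see why both arguments are needed, it may help to compare a plain get_text() call with the individual text pieces. The sketch below reuses the same markup; the variable names (raw, parts, joined) are only for illustration, and it assumes the stripped_strings generator, which yields each text node with surrounding whitespace removed and skips whitespace-only nodes.

```python
from bs4 import BeautifulSoup

html_doc = """
<font>
<em>X</em>
and
<em>Y</em>
</font>"""

soup = BeautifulSoup(html_doc, "html.parser")
font = soup.find("font")

# With no arguments, get_text() concatenates the text nodes as they are,
# so the newlines between the tags are kept.
raw = font.get_text()

# stripped_strings gives the same pieces that strip=True works on:
# each text node trimmed, with whitespace-only nodes dropped.
parts = list(font.stripped_strings)

# Joining the pieces with a space mirrors separator=" ".
joined = " ".join(parts)

print(repr(raw))
print(parts)
print(joined)
```

Here joined should match the result of get_text(strip=True, separator=" "), that is "X and Y". Working with the list of pieces can be handy if you want to filter or transform them before joining.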