BeautifulSoup parent tag

Question:

I have some html that I want to extract text from. Here’s an example of the html:

<p>TEXT I WANT <i> &#8211; </i></p>

Now, there are, obviously, lots of <p> tags in this document. So, find('p') is not a good way to get at the text I want to extract. However, that <i> tag is the only one in the document. So, I thought I could just find the <i> and then go to the parent.

I’ve tried:

up = soup.select('p i').parent

and

up = soup.select('i')
print(up.parent)

and I’ve tried it with .parents, I’ve tried find_all('i'), find('i')… But I always get:

AttributeError: 'list' object has no attribute 'parent'

What am I doing wrong?

Asked By: porteclefs

||

Answers:

find_all() returns a list. find('i') returns the first matching element, or None.

The same applies to select() (returns a list) and select_one() (first match or None).

Thus, use:

try:
    up = soup.find('i').parent
except AttributeError:
    # no <i> element: find() returned None
    up = None

Demo:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<p>TEXT I WANT <i> &#8211; </i></p>', 'html.parser')
>>> soup.find('i').parent
<p>TEXT I WANT <i> – </i></p>
>>> soup.find('i').parent.text
u'TEXT I WANT  \u2013 '
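
The same idea can be sketched with select_one() and an explicit None check instead of try/except. This is only an illustration: the helper name parent_text_of_first and the extra paragraph in the sample HTML are made up for the example, and 'html.parser' is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Sample document (hypothetical): an extra <p> is added to show that only
# the paragraph containing the <i> tag is picked up.
html = "<p>Other paragraph</p><p>TEXT I WANT <i> &#8211; </i></p>"
soup = BeautifulSoup(html, "html.parser")


def parent_text_of_first(soup, selector):
    """Return the text of the parent of the first match, or None if no match."""
    tag = soup.select_one(selector)  # first match or None, like find()
    if tag is None or tag.parent is None:
        return None
    return tag.parent.get_text()


print(repr(parent_text_of_first(soup, "i")))
print(parent_text_of_first(soup, "b"))  # no <b> tag in the sample
```

An explicit check keeps the "no match" case visible, whereas try/except AttributeError would also swallow unrelated attribute errors inside the block.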
Answered By: Martijn Pieters

Both select() and find_all() return a list of elements. Loop over the results like this:

for el in soup.select('i'):
    print(el.parent.text)
Answered By: amaslenn

This works:

i_tag = soup.find('i')
my_text = str(i_tag.previous_sibling).strip()

output:

'TEXT I WANT'

As mentioned in other answers, find_all() returns a list, whereas find() returns the first match or None.

If you are unsure whether an i tag is present, you can wrap the lookup in a try/except block.
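
A minimal sketch of that try/except approach, assuming 'html.parser' as the parser. The function name text_before_i is hypothetical, and the extra check for a missing sibling is an addition of this sketch (without it, str(None) would yield the string 'None').

```python
from bs4 import BeautifulSoup


def text_before_i(html):
    """Return the stripped text node just before the first <i>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    try:
        # find() returns None when there is no <i>, so the attribute
        # access below raises AttributeError in that case.
        sibling = soup.find("i").previous_sibling
    except AttributeError:
        return None
    if sibling is None:
        # The <i> tag is the first child of its parent: nothing precedes it.
        return None
    return str(sibling).strip()


print(text_before_i("<p>TEXT I WANT <i> &#8211; </i></p>"))
print(text_before_i("<p>no italics here</p>"))
```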

Answered By: Totem

soup.select() returns a Python list, so you have to 'unlist' the variable, e.g.:

>>> [up] = soup.select('i')
>>> print(up.parent)

or

>>> up = soup.select('i')
>>> print(up[0].parent)
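
Note that the two forms behave differently when the match count is not exactly one: the [up] = ... unpacking raises ValueError for zero or several matches, while up[0] raises IndexError only for zero. A rough sketch of guarding the unpacking form follows; the helper name only_match_parent is made up for illustration and 'html.parser' is assumed.

```python
from bs4 import BeautifulSoup


def only_match_parent(soup, selector):
    """Return the parent of the single match, or None unless exactly one match."""
    matches = soup.select(selector)
    try:
        [tag] = matches  # ValueError unless the list has exactly one element
    except ValueError:
        return None
    return tag.parent


one = BeautifulSoup("<p>TEXT I WANT <i> &#8211; </i></p>", "html.parser")
two = BeautifulSoup("<p><i>a</i> and <i>b</i></p>", "html.parser")

print(only_match_parent(one, "i"))
print(only_match_parent(two, "i"))
```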
Answered By: Chad Frederick

I think you are actually looking at a group of these tags. The select function returns a list of the matching tags, so when you ask for the parent tag, it doesn't know which member of the list you mean. Try:

    up = soup.select('p i')[0].parent
    print(up)

This says that you want the parent of the first one in the list ('[0]'). I haven't tested this, so try it out.

Answered By: noobintech