Beautiful Soup: just get the value inside the tag

Question:

The following command:

volume = soup.findAll("span", {"id": "volume"})[0]

gives:

<span class="gr_text1" id="volume">16,103.3</span>

when I issue a print(volume).

How do I get just the number?

Asked By: user1357015


Answers:

Extract the string from the element:

volume = soup.findAll("span", {"id": "volume"})[0].string
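
As a self-contained sketch of the same idea, the snippet below parses the markup from the question and reads `.string`. The conversion to `float` is an extra step not asked for in the question, and it assumes the comma is a thousands separator:

```python
from bs4 import BeautifulSoup

# Markup copied from the question
html = '<span class="gr_text1" id="volume">16,103.3</span>'
soup = BeautifulSoup(html, "html.parser")

# find() returns the first match directly (or None if nothing matches)
span = soup.find("span", {"id": "volume"})

raw = span.string                      # the tag's single text child
value = float(raw.replace(",", ""))    # assumption: "," is a thousands separator

print(raw)
print(value)
```

If no span with that id exists, `find()` returns None, so real code would probably want to check for that before touching `.string`.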
Answered By: isedev

Using css selector:

>>> soup.select('span#volume')[0].text
u'16,103.3'
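
A variation on the selector approach, as a rough sketch: `select_one()` returns the first match (or None) instead of a list, and `get_text(strip=True)` trims surrounding whitespace. The padded markup here is made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: same span as in the question, with padding added around the number
html = '<p>Volume: <span class="gr_text1" id="volume"> 16,103.3 </span></p>'
soup = BeautifulSoup(html, "html.parser")

span = soup.select_one("span#volume")   # first match, or None if nothing matches
text = span.get_text(strip=True) if span is not None else None

print(text)
```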
Answered By: falsetru
Answered By: cinv3

Just to add, I also found that .string doesn't do well when there is a <br> in the text.

For example:

<div class="Lines">
  <span> First Line <br> Second Line <br> Third Line </span>
</div>

If we do soup.find("div", attrs={"class": "Lines"}).span.string we get None.

But soup.find("div", attrs={"class": "Lines"}).span.text returns the text of all three lines as a single string (the <br> tags themselves add no characters):

 First Line  Second Line  Third Line 

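I think .string gives a NavigableString object and .text gives a unicode object (str in Python 3). .string is None here because the span has more than one child (three strings and two <br> tags), so it is not clear which one .string should refer to.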
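
A small sketch of the behaviour described above, using the same markup. `stripped_strings` and the `separator` argument of `get_text()` are one possible way to recover the individual lines, not the only one:

```python
from bs4 import BeautifulSoup

html = """
<div class="Lines">
  <span> First Line <br> Second Line <br> Third Line </span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
span = soup.find("div", attrs={"class": "Lines"}).span

# .string is None because the span has several children
print(span.string)

# Each text fragment, with surrounding whitespace removed
lines = list(span.stripped_strings)
print(lines)

# The same fragments joined with newlines
joined = span.get_text(separator="\n", strip=True)
print(joined)
```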

Answered By: Sanjay

A tag's children are available as a list in tag.contents, so the first child (here, the text) is tag.contents[0].

Try this:

volumes = soup('span')
for volume in volumes:
    print(volume.contents[0])
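
One caveat, sketched below with made-up markup: `contents[0]` is only the first child, so it may be a nested tag rather than text, and it raises IndexError on an empty tag. The guard and the `results` dict are illustrative additions:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a plain span, a span with a nested tag, and an empty span
html = (
    '<span id="a">16,103.3</span>'
    '<span id="b"><b>bold</b> tail</span>'
    '<span id="c"></span>'
)
soup = BeautifulSoup(html, "html.parser")

results = {}
for span in soup("span"):  # soup("span") is shorthand for soup.find_all("span")
    # Guard against empty tags, where contents is an empty list
    first = span.contents[0] if span.contents else None
    results[span["id"]] = (first, span.get_text())
    print(span["id"], repr(first), repr(span.get_text()))
```

For span "b" the first child is the `<b>` tag itself, while `get_text()` still returns all of the text inside the span.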
Answered By: Rohit Soni