Python: How to get the tag value without the tag in XML

Question:

  • I am using the findAll method of BeautifulSoup to fetch all the values of a particular tag, DocumentIndex.
  • When I do, I get this output:
[<DocumentIndex>3646</DocumentIndex>, <DocumentIndex>3650</DocumentIndex>, <DocumentIndex>3649</DocumentIndex>]
  • Code (gstr_xml is available here):
lstr_soup = BeautifulSoup(gstr_xml, features="xml")
lstr_folder_index = lstr_soup.findAll('DocumentIndex')
print(lstr_folder_index)
  • How can I get just the values, like this?
[3646, 3650, 3649]
Asked By: donny


Answers:

Each item in the list is a bs4.element.Tag object. Read its .text attribute to get just the text content, without the surrounding tag. Note that the values come back as strings, not integers.

print([x.text for x in lstr_folder_index])

# Output:
['3646', '3650', '3649']
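
The question asks for integers ([3646, 3650, 3649]) rather than strings, so a conversion step may be wanted. The following is a minimal sketch, assuming every DocumentIndex element contains only digits. The sample XML (including the Documents wrapper element) is a hypothetical stand-in for gstr_xml, which the question does not show, and document_indexes is a name introduced here for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample standing in for gstr_xml, which is not shown in the question.
gstr_xml = """<Documents>
  <DocumentIndex>3646</DocumentIndex>
  <DocumentIndex>3650</DocumentIndex>
  <DocumentIndex>3649</DocumentIndex>
</Documents>"""

# The "xml" feature needs the lxml package to be installed.
lstr_soup = BeautifulSoup(gstr_xml, features="xml")
lstr_folder_index = lstr_soup.find_all("DocumentIndex")

# get_text(strip=True) returns the tag's text without surrounding whitespace;
# int() then converts each string to an integer (assumes digits only).
document_indexes = [int(tag.get_text(strip=True)) for tag in lstr_folder_index]
print(document_indexes)
```

If a tag could be empty or hold non-numeric text, int() raises ValueError, so real data may need a guard around the conversion.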
Answered By: BeRT2me