Parse XML with specific declaration

Question:

Hi, I am attempting to parse XML by following this documentation:
https://docs.python.org/3/library/xml.etree.elementtree.html

However, when I try to follow it, I get this error:

>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('sitemap.xml')
>>> root = tree.getroot()
>>> print(root['loc'])
element indices must be integers

I am attempting to parse the loc value from this sitemap.xml declaration:

<root>
  <url>
    <loc>HTTPS://website.com/</loc>
    <lastmod>2022-10-10</lastmod>
  </url>
  <url>
    <loc>https://website.com/search/</loc>
    <lastmod>2022-10-10</lastmod>
  </url>
  <url>
    <loc>https://website.com/auth/user/</loc>
  </url>
</root>

UPDATE
I can get it to print out a single loc value via:
print(root[0][0].text)

However, I want to loop through all of these loc fields and print them out. How can I do so?

Asked By: Jshee


Answers:

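import xml.etree.ElementTree as ET

tree = ET.parse('sitemap.xml')
root = tree.getroot()
# Each direct child of the root is a <url> element.
for url in root:
    # Print the text of every <loc> directly inside this <url>.
    for child in url:
        if child.tag == 'loc':
            print(child.text)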

Note that this only works if the XML has the exact structure you provided, i.e. every loc element is a direct child of a url element, and every url element is a direct child of the root.
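
If the nesting might differ from that, one possible alternative is Element.iter(), which searches the whole tree for a tag name instead of only one level. The following is a minimal sketch rather than a definitive solution. It assumes the file has no XML namespace, and it uses an inline string (the SITEMAP name is just for illustration) in place of sitemap.xml so that it runs on its own:

```python
import xml.etree.ElementTree as ET

# Inline stand-in for sitemap.xml, using the three entries from the question.
SITEMAP = """\
<root>
  <url>
    <loc>HTTPS://website.com/</loc>
    <lastmod>2022-10-10</lastmod>
  </url>
  <url>
    <loc>https://website.com/search/</loc>
    <lastmod>2022-10-10</lastmod>
  </url>
  <url>
    <loc>https://website.com/auth/user/</loc>
  </url>
</root>
"""

root = ET.fromstring(SITEMAP)

# iter('loc') walks the entire tree, so <loc> is found at any nesting depth.
locs = [loc.text for loc in root.iter('loc')]

for value in locs:
    print(value)
```

With a real file, ET.parse('sitemap.xml').getroot() would replace ET.fromstring(SITEMAP). If the file declares a default namespace, as published sitemaps often do, the tag names become '{namespace}loc' and a plain 'loc' will not match, so the tag passed to iter() would need the namespace prefix as well.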

Answered By: Spartan299