Collect data from an XML file with ElementTree

Question:

I’m developing a script in Python that should read some data from an XML file and store it in a list of objects so it can be processed later. When executed, the script should collect the data and print the clients' names and their respective IDs on the screen, but it prints nothing. Before you ask: yes, the XML file is being accessed correctly, all modules are working as they should, and the code does not display any warnings or errors when executed. Below are the script and the initial snippet of the parsed XML file. Thank you to everyone who can contribute in some way.

import queryServer as QS
import xml.etree.ElementTree as ET


class Client:
   def __init__(self, name, id):
      self.name = name
      self.id = id


def clientData(): 
   queryServerURL, apiKey = QS.getQueryServerAccessCredentials()

   url = f'{queryServerURL}api/?apikey={apiKey}&service=list_clients'
   root = QS.connectToQueryServer(url)

   clientsList = []

   for clientName in root.iterfind(f'./items/client/name'):
      for clientID in root.iterfind(f'./items/client[name="{clientName}"]/clientid'):
         clientsList.append(Client(clientName.text, clientID.text))

   for obj in clientsList: 
      print(obj.name, obj.id, sep=' ')
      # --> Here it should print all customers name and their respective IDs, but it prints nothing

   
if __name__ == '__main__':
   clientData()

Initial snippet of the parsed XML file:

<result created="2022-09-29T08:50:26-05:00" host="www.systemmonitor.us" status="OK">
   <items>
      <client>
         <name>
            <![CDATA[ Censored ]]>
         </name>
         <clientid> Censored </clientid>
         <view_dashboard>0</view_dashboard>
         <view_wkstsn_assets>0</view_wkstsn_assets>
         <dashboard_username>
            <![CDATA[ Censored ]]>
         </dashboard_username>
         <timezone/>
         <creation_date>2015-10-20</creation_date>
         <server_count>1</server_count>
         <workstation_count>2</workstation_count>
         <mobile_device_count>0</mobile_device_count>
         <device_count>3</device_count>
      </client>
      <client>
         <name>
            <![CDATA[ Censored ]]>
         </name>
         <clientid> Censored </clientid>
         <view_dashboard>0</view_dashboard>
         <view_wkstsn_assets>0</view_wkstsn_assets>
         <dashboard_username>
            <![CDATA[ Censored ]]>
         </dashboard_username>
         <timezone/>
         <creation_date>2019-11-21</creation_date>
         <server_count>1</server_count>
         <workstation_count>0</workstation_count>
         <mobile_device_count>0</mobile_device_count>
         <device_count>1</device_count>
      </client>
Asked By: Hugo Marotta

Answers:

Try changing this

for clientName in root.iterfind(f'./items/client/name'):
  for clientID in root.iterfind(f'./items/client[name="{clientName}"]/clientid'):
     clientsList.append(Client(clientName.text, clientID.text))

to:

for client in root.findall('.//items//client'):
    clientsList.append(Client(client.find('name').text, client.find('clientid').text))

and see if it works with your actual XML.

Answered By: Jack Fleeting

Simply add the .text attribute to the string formatting in the second loop, since the predicate needs the element's underlying text and not the Element object itself. You may also want to remove surrounding whitespace with str.strip():

def clientData():
   # xml is assumed to be a string holding the XML document from the question
   root = ET.fromstring(xml)

   clientsList = []

   for clientName in root.iterfind('./items/client/name'):
      for clientID in root.iterfind(f'./items/client[name="{clientName.text}"]/clientid'):
         clientsList.append(Client(clientName.text.strip(), clientID.text.strip()))

   for obj in clientsList: 
      print(obj.name, obj.id, sep=' ')

clientData()
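
To see why the original query printed nothing without raising an error, here is a minimal sketch. The sample XML and the names SAMPLE, name_el, bad_path, and good_path are made up for illustration and are not part of the original script. It assumes the server response has the same shape as the snippet in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the XML in the question.
SAMPLE = """<result status="OK">
  <items>
    <client><name><![CDATA[ Acme ]]></name><clientid> 101 </clientid></client>
  </items>
</result>"""

root = ET.fromstring(SAMPLE)
name_el = root.find('./items/client/name')

# Formatting the Element itself embeds its repr ("<Element 'name' at 0x...>")
# in the predicate, so the path is valid but matches no client.
bad_path = f'./items/client[name="{name_el}"]/clientid'

# Formatting the element's text compares against the actual content of <name>.
good_path = f'./items/client[name="{name_el.text}"]/clientid'

bad_matches = root.findall(bad_path)
good_matches = root.findall(good_path)

print(bad_path)
print(len(bad_matches), len(good_matches))
```

Because the predicate compares the complete text of the child element, whitespace included, the unstripped name_el.text has to be used inside the path, and strip() applied only to the values that are stored.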

Also, consider a list comprehension to avoid the bookkeeping of initializing a list and appending to it iteratively:

clientsList = [
    Client(clientName.text.strip(), clientID.text.strip())
    for clientName in root.iterfind('./items/client/name')
    for clientID in root.iterfind(f'./items/client[name="{clientName.text}"]/clientid')
]

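Actually, you only need one for loop, which matters especially when duplicate names appear, as in the posted example, since the nested loops pair each name with the clientid of every client that shares it: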

clientsList = [
    Client(client.findtext('name').strip(), client.findtext('clientid').strip())
    for client in root.iterfind('./items/client')
]
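
As a rough, self-contained illustration of the duplicate-name point, the sketch below runs both approaches on a made-up document in which two clients share a name. The sample data, the dataclass version of Client, and the use of default='' for missing children are assumptions for the example, not part of the original script.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    id: str


# Hypothetical sample: two clients that share the same name.
SAMPLE_XML = """<result status="OK">
  <items>
    <client><name><![CDATA[ Acme ]]></name><clientid> 101 </clientid></client>
    <client><name><![CDATA[ Acme ]]></name><clientid> 102 </clientid></client>
  </items>
</result>"""

root = ET.fromstring(SAMPLE_XML)

# Nested loops: each name is paired with every client carrying that name,
# so duplicate names produce duplicate (name, id) pairs.
nested = [
    (name.text.strip(), cid.text.strip())
    for name in root.iterfind('./items/client/name')
    for cid in root.iterfind(f'./items/client[name="{name.text}"]/clientid')
]

# Single pass: exactly one Client per <client> element.
# default='' keeps strip() from failing if a child element is missing.
clients = [
    Client(
        client.findtext('name', default='').strip(),
        client.findtext('clientid', default='').strip(),
    )
    for client in root.iterfind('./items/client')
]

print(len(nested), len(clients))
```

With this sample, the nested version yields four pairs while the single pass yields two, one per client.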
Answered By: Parfait