xml.parsers.expat.ExpatError on parsing XML
Question:
I am trying to parse XML with Python but not getting very far. I think the problem is the malformed XML that this API returns.
So this is what is returned by the GET request:
<codigo>3</codigo><valor></valor><operador>Dummy</operador>
The GET request goes here:
http://69.36.9.147:8090/clientes/SMS_API_OUT.jsp?codigo=ABCDEFGH&cliente=XX
This is the Python code I am using without any luck:
import urllib
from xml.dom import minidom
url = urllib.urlopen('http://69.36.9.147:8090/clientes/SMS_API_OUT.jsp?codigo=ABCDEFGH&cliente=XX')
xml = minidom.parse(url)
code = doc.getElementsByTagName('codigo')
print code[0].data
And this is the response I get:
xml.parsers.expat.ExpatError: junk after document element: line 1, column 18
What I need to do is retrieve the value inside the <codigo> element and place it in a variable (same for the others).
Answers:
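An XML document must have exactly one top-level document element, which may contain any number of subelements. Your XML fragment contains three top-level elements, which the XML standard does not permit. The parser treats <codigo>3</codigo> as the complete document and reports everything after it, starting at column 18, as junk.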
Try returning something like:
<result><codigo>3</codigo><valor></valor><operador>Dummy</operador></result>
I have wrapped the entire response in a <result> tag.
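To see the rule in action, here is a minimal Python 3 sketch that parses the fragment from the question with and without a wrapper element. The sample string is copied from the question; the variable names are only illustrative.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

fragment = '<codigo>3</codigo><valor></valor><operador>Dummy</operador>'

# Three top-level elements: the parser rejects everything after the first one.
try:
    minidom.parseString(fragment)
    fragment_ok = True
    error_text = ''
except ExpatError as exc:
    fragment_ok = False
    error_text = str(exc)

# The same content inside a single top-level element parses cleanly.
wrapped = '<result>' + fragment + '</result>'
doc = minidom.parseString(wrapped)
root_name = doc.documentElement.tagName
```

If changing the service is not an option, the same wrapper can be added on the client side before parsing.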
The main problem here is that the XML code being returned by that service doesn’t include a root node, which is invalid. I fixed this by simply wrapping the output in a <root> node.
import urllib
from xml.etree import ElementTree
url = 'http://69.36.9.147:8090/clientes/SMS_API_OUT.jsp?codigo=ABCDEFGH&cliente=XX'
xmldata = '<root>' + urllib.urlopen(url).read() + '</root>'
tree = ElementTree.fromstring(xmldata)
codigo = tree.find('codigo').text
print codigo
You can use whatever parser you wish, but here I used ElementTree to get the value.
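As a sketch of the same wrapping trick with minidom, the parser used in the question, the following Python 3 example pulls all three values into variables. The response string is hard-coded from the question rather than fetched, and `text_of` is a hypothetical helper, not part of minidom.

```python
from xml.dom import minidom

# Response body as shown in the question (hard-coded here instead of fetched).
response = '<codigo>3</codigo><valor></valor><operador>Dummy</operador>'

# Wrap the fragment in a single root element so it is well-formed.
doc = minidom.parseString('<root>' + response + '</root>')

def text_of(document, tag):
    """Return the text of the first <tag> element, or '' if it is empty."""
    node = document.getElementsByTagName(tag)[0]
    # An empty element such as <valor></valor> has no child text node.
    return node.firstChild.data if node.firstChild is not None else ''

codigo = text_of(doc, 'codigo')
valor = text_of(doc, 'valor')
operador = text_of(doc, 'operador')
```

Note that the text lives on the element's child text node, not on the element itself, so an empty element needs the explicit check shown in the helper.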
This error occurs when the XML data does not arrive in the expected format.
In my case, it happened because the API’s token had expired, so the service was returning data that could not be parsed as XML.
So I suggest checking the raw response first to confirm that it is well-formed XML.
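One way to follow this advice is to catch the parse failure and surface the raw payload, so that an error page or expired-token message becomes visible immediately. This is a minimal Python 3 sketch: `parse_or_report` is a hypothetical helper, the sample payloads are made up for illustration, and the <root> wrapper assumes the service returns a rootless fragment as in the question.

```python
from xml.etree import ElementTree

def parse_or_report(payload):
    """Wrap payload in <root> and parse it.

    Returns (tree, None) on success or (None, message) on failure.
    """
    try:
        return ElementTree.fromstring('<root>' + payload + '</root>'), None
    except ElementTree.ParseError as exc:
        # Include the start of the raw payload so the real cause is visible.
        return None, 'not well-formed XML (%s): %r' % (exc, payload[:80])

# A well-formed fragment, and an illustrative non-XML error message.
good, good_err = parse_or_report('<codigo>3</codigo><operador>Dummy</operador>')
bad, bad_err = parse_or_report('Token expired & access denied <br>')
```

Printing or logging the returned message is usually enough to tell a malformed response apart from a bug in the parsing code.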
import urllib.request
from xml.etree import ElementTree

with urllib.request.urlopen("<your URL>") as url:
    # In Python 3, read() returns bytes, so decode it before adding the wrapper
    xmldata = '<root>' + url.read().decode() + '</root>'
tree = ElementTree.fromstring(xmldata)
codigo = tree.find('codigo').text
For an explanation, see the original Python 2 version of this answer:
https://stackoverflow.com/a/1140753/2745495