Emitting namespace specifications with ElementTree in Python

Question:

I am trying to emit an XML file with element-tree that contains an XML declaration and namespaces. Here is my sample code:

from xml.etree import ElementTree as ET
ET.register_namespace('com',"http://www.company.com") #some name

# build a tree structure
root = ET.Element("STUFF")
body = ET.SubElement(root, "MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)

tree.write("page.xml",
           xml_declaration=True,
           method="xml" )

However, neither the <?xml tag comes out nor any namespace/prefix information. I’m more than a little confused here.

Asked By: Paul Nathan

||

Answers:

I’ve never been able to get the <?xml tag out of the element tree libraries programatically so I’d suggest you try something like this.

from xml.etree import ElementTree as ET
root = ET.Element("STUFF")
root.set('com','http://www.company.com')
body = ET.SubElement(root, "MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

f = open('page.xml', 'w')
f.write('<?xml version="1.0" encoding="UTF-8"?>' + ET.tostring(root))
f.close()

Non std lib python ElementTree implementations may have different ways to specify namespaces, so if you decide to move to lxml, the way you declare those will be different.

Answered By: Philip Southam

Although the docs say otherwise, I only was able to get an <?xml> declaration by specifying both the xml_declaration and the encoding.

You have to declare nodes in the namespace you’ve registered to get the namespace on the nodes in the file. Here’s a fixed version of your code:

from xml.etree import ElementTree as ET
ET.register_namespace('com',"http://www.company.com") #some name

# build a tree structure
root = ET.Element("{http://www.company.com}STUFF")
body = ET.SubElement(root, "{http://www.company.com}MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)

tree.write("page.xml",
           xml_declaration=True,encoding='utf-8',
           method="xml")

Output (page.xml)

<?xml version='1.0' encoding='utf-8'?><com:STUFF ><com:MORE_STUFF>STUFF EVERYWHERE!</com:MORE_STUFF></com:STUFF>

ElementTree doesn’t pretty-print either. Here’s pretty-printed output:

<?xml version='1.0' encoding='utf-8'?>
<com:STUFF >
    <com:MORE_STUFF>STUFF EVERYWHERE!</com:MORE_STUFF>
</com:STUFF>

You can also declare a default namespace and don’t need to register one:

from xml.etree import ElementTree as ET

# build a tree structure
root = ET.Element("{http://www.company.com}STUFF")
body = ET.SubElement(root, "{http://www.company.com}MORE_STUFF")
body.text = "STUFF EVERYWHERE!"

# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)

tree.write("page.xml",
           xml_declaration=True,encoding='utf-8',
           method="xml",default_namespace='http://www.company.com')

Output (pretty-print spacing is mine)

<?xml version='1.0' encoding='utf-8'?>
<STUFF >
    <MORE_STUFF>STUFF EVERYWHERE!</MORE_STUFF>
</STUFF>
Answered By: Mark Tolonen
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.