Failing at inserting a new line in existing xml file
Question:
I am currently writing Python code to manage an existing XML file.
I have this XML structure in a model file:
<MYPROJECT>
  <VERSION>2</VERSION>
  <LANGUAGE>english</LANGUAGE>
  <FILE>
    <FILENAME>Z</FILENAME>
  </FILE>
  <PANEL1>
    <SOURCE>
      <LAYER>MISSING</LAYER>
      <NAME>MISSING</NAME>
      <ID>MISSING</ID>
    </SOURCE>
  </PANEL1>
</MYPROJECT>
I need to insert a new tag with its value into the SOURCE subelement. This is the result I want:
<MYPROJECT>
  <VERSION>2</VERSION>
  <LANGUAGE>english</LANGUAGE>
  <FILE>
    <FILENAME>toto</FILENAME>
  </FILE>
  <PANEL1>
    <SOURCE>
      <LAYER>A</LAYER>
      <NAME>B</NAME>
      <ID>C</ID>
      <NEW_TAG>XX</NEW_TAG>
    </SOURCE>
  </PANEL1>
</MYPROJECT>
To do this, I use xml.etree.ElementTree and the insert function. Here is my code:
import xml.etree.ElementTree as ET

cheminXML = r"mypath"

def majXML(cheminXML):
    tree = ET.parse(cheminXML)
    tagProject = ET.Element("MYPROJECT")
    tagPanel1 = ET.SubElement(tagProject,'PANEL1')
    tagSource = ET.SubElement(tagPanel1,'SOURCE')
    tagNewTag = ET.SubElement(tagSource, 'NEW_TAG')
    tagNewTag.text = "apple"
    print(tagNewTag.text)
    tagSource.insert(0,tagNewTag)
    tagFilename = tree.find(".//FILENAME")
    tagFilename.text = "toto"
    tree.write(cheminXML)

if __name__ == "__main__":
    majXML(cheminXML)
The new tag value is printed as expected.
When I open my XML file, the filename value is "toto".
However, the new tag and its value are not written to the file.
Why is that?
Answers:
The following part of your code is completely independent of the tree you loaded:
tagProject = ET.Element("MYPROJECT")
tagPanel1 = ET.SubElement(tagProject,'PANEL1')
tagSource = ET.SubElement(tagPanel1,'SOURCE')
tagNewTag = ET.SubElement(tagSource, 'NEW_TAG')
tagNewTag.text = "apple"
print(tagNewTag.text)
tagSource.insert(0,tagNewTag)
Basically, you just create a new ET.Element from scratch, which you could write to a file with ET.ElementTree(tagProject).write('tagproject.xml').
On the other hand, you get tagFilename from the actual tree that you loaded, which means that you modify tree.
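A small sketch makes the difference visible. It parses an inline, shortened stand-in for the model file (the MODEL string and the detached_* names are assumptions for illustration) and compares the SOURCE element built from scratch with the one found in the parsed tree:

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the model file from the question (assumption).
MODEL = "<MYPROJECT><PANEL1><SOURCE><ID>MISSING</ID></SOURCE></PANEL1></MYPROJECT>"

tree = ET.ElementTree(ET.fromstring(MODEL))

# Built from scratch: a second, unrelated MYPROJECT element.
detached_project = ET.Element("MYPROJECT")
detached_panel = ET.SubElement(detached_project, "PANEL1")
detached_source = ET.SubElement(detached_panel, "SOURCE")
ET.SubElement(detached_source, "NEW_TAG").text = "apple"

# Looked up in the parsed tree: this is the node that tree.write() serializes.
loaded_source = tree.find(".//SOURCE")

same_object = detached_source is loaded_source
tag_in_loaded_tree = tree.find(".//NEW_TAG") is not None
tag_in_detached = detached_project.find(".//NEW_TAG") is not None
print(same_object, tag_in_loaded_tree, tag_in_detached)  # False False True
```

The two SOURCE elements only share a tag name; nothing links the detached one to the tree that gets written.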
The solution is to make the other edits also by taking the nodes from tree rather than making new ones from scratch:
tagSource = tree.find('.//SOURCE') # tagSource is tree node, not a new one
tagNewTag = ET.SubElement(tagSource, 'NEW_TAG')
tagNewTag.text = "apple"
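Put together, the corrected function might look like the sketch below. The name maj_xml, the new_value parameter, the None check, and the temporary demo file are assumptions made for illustration; the rest follows the code in the question.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def maj_xml(chemin_xml, new_value="XX"):
    """Update FILENAME and add NEW_TAG inside the SOURCE node of the parsed tree."""
    tree = ET.parse(chemin_xml)

    # Take SOURCE from the loaded tree so the change is part of what gets written.
    tag_source = tree.find(".//SOURCE")
    if tag_source is None:
        raise ValueError("no SOURCE element found in " + chemin_xml)

    # SubElement creates the child and appends it after the existing children.
    tag_new = ET.SubElement(tag_source, "NEW_TAG")
    tag_new.text = new_value

    tree.find(".//FILENAME").text = "toto"
    tree.write(chemin_xml)


# Demo on a temporary file holding a shortened model (assumption).
model = (
    "<MYPROJECT><FILE><FILENAME>Z</FILENAME></FILE>"
    "<PANEL1><SOURCE><ID>MISSING</ID></SOURCE></PANEL1></MYPROJECT>"
)
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write(model)
    demo_path = handle.name

maj_xml(demo_path)
with open(demo_path) as handle:
    result = handle.read()
os.remove(demo_path)
print(result)
```

Because SubElement already appends the new child to its parent, the extra insert call from the question is not needed. To place the tag at a specific position instead of last, create it with ET.Element and pass it to tag_source.insert(index, element).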
As an alternative, you can build the new element separately and append it to a node taken from the tree:
import xml.etree.ElementTree as ET
tree = ET.parse("unknown.xml")
root = tree.getroot()
# redefine content
root.find(".//FILENAME").text = 'todo'
root.find(".//LAYER").text = 'A'
root.find(".//NAME").text = 'B'
root.find(".//ID").text = 'C'
# define and add a new element
new = ET.Element('NEW_TAG')
new.text = 'XX'
# PANEL1 has a single child here (SOURCE), so new is appended exactly once
for elem in root.find('.//PANEL1'):
    elem.append(new)
ET.dump(root)
tree.write("New_unknown.xml", encoding='utf-8', xml_declaration=True)
Output file:
<?xml version='1.0' encoding='utf-8'?>
<MYPROJECT>
<VERSION>2</VERSION>
<LANGUAGE>english</LANGUAGE>
<FILE>
<FILENAME>todo</FILENAME>
</FILE>
<PANEL1>
<SOURCE>
<LAYER>A</LAYER>
<NAME>B</NAME>
<ID>C</ID>
<NEW_TAG>XX</NEW_TAG>
</SOURCE>
</PANEL1>
</MYPROJECT>
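If a file held several SOURCE elements (an assumption, since the model above has only one), each of them would need its own NEW_TAG element, because a single Element object should not be shared between parents. A minimal sketch with hypothetical input:

```python
import xml.etree.ElementTree as ET

# Hypothetical content with two SOURCE blocks (not from the question).
root = ET.fromstring(
    "<MYPROJECT><PANEL1>"
    "<SOURCE><ID>C</ID></SOURCE>"
    "<SOURCE><ID>D</ID></SOURCE>"
    "</PANEL1></MYPROJECT>"
)

# findall returns a list, so the tree can be modified safely inside the loop.
for source in root.findall(".//SOURCE"):
    # SubElement builds a fresh NEW_TAG for each SOURCE.
    ET.SubElement(source, "NEW_TAG").text = "XX"

new_tags = root.findall(".//NEW_TAG")
print(len(new_tags), new_tags[0] is new_tags[1])  # 2 False
```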