Extracting data from an XLIFF file and creating a data frame
Question:
I have an XLIFF file with the following structure.
<?xml version="1.0" encoding="UTF-8"?>
<xliff >
TAG             | SOURCE      | TARGET
Title           | Source text | Target text
Description     | Source text | Target text
Summary         | Source text | Target text
Relevant        | Source text | Target text
From area code  | Source text | Target text
I tried building a data frame with all tags and text using the following code, so that I could then filter the rows that contain the data I need.
import xml.etree.ElementTree as ET

import pandas as pd

tree = ET.parse('583197.xliff')
root = tree.getroot()
# print(root)
all_items = []
for elem in tree.iter():
    # tag, attrib, and text are attributes, not methods, so no parentheses
    tag = elem.tag
    attri = elem.attrib
    text = elem.text
    all_items.append([tag, attri, text])
# One row per element: tag name, attribute dict, and text
xmlToDf = pd.DataFrame(all_items, columns=['Tag', 'Attri', 'Text'])
print(xmlToDf.to_string(index=False))
How can I extract specific tags, attributes, and text from an XLIFF file so I can build a data frame?
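One possible approach, sketched under assumptions because the element names of the real file are not visible above: if the file follows the common XLIFF 1.2 layout, each row lives in a trans-unit element with source and target children, and the label (Title, Description, ...) sits in an attribute such as resname. The sample document, the resname attribute, and the helper names local_name and xliff_rows are all hypothetical and would need adjusting to the actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real file's elements and attributes may differ.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="example" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1" resname="Title">
        <source>Source text</source>
        <target>Target text</target>
      </trans-unit>
      <trans-unit id="2" resname="Description">
        <source>Source text</source>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def xliff_rows(xml_text):
    """Return one dict per trans-unit with its id, label, source, and target."""
    root = ET.fromstring(xml_text)
    rows = []
    for unit in root.iter():
        if local_name(unit.tag) != "trans-unit":
            continue
        row = {
            "id": unit.get("id"),
            "resname": unit.get("resname"),  # assumed label attribute
            "source": None,
            "target": None,
        }
        for child in unit:
            name = local_name(child.tag)
            if name in ("source", "target"):
                # itertext() also collects text nested inside inline tags
                row[name] = "".join(child.itertext()).strip()
        rows.append(row)
    return rows


rows = xliff_rows(SAMPLE)
```

Passing the resulting list to pd.DataFrame(rows) should give one row per unit with the columns id, resname, source, and target, which can then be filtered as usual. For a file on disk, ET.parse('583197.xliff').getroot() could replace the ET.fromstring call.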