ElementTree XPath – Select Element based on attribute
Question:
I am having trouble using the attribute XPath selector in ElementTree, which I should be able to do according to the documentation.
Here’s some sample code
XML
<root>
  <target name="1">
    <a></a>
    <b></b>
  </target>
  <target name="2">
    <a></a>
    <b></b>
  </target>
</root>
Python
def parse(document):
    root = et.parse(document)
    for target in root.findall("//target[@name='a']"):
        print target._children
I am receiving the following Exception:
expected path separator ([)
Answers:
The syntax you’re trying to use is new in ElementTree 1.3.
That version ships with Python 2.7 and later.
If you are on Python 2.6 or earlier, you still have ElementTree 1.2.6 or older.
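To see which ElementTree you actually have, a quick check along these lines may help. It is only a sketch: it assumes the standard-library module exposes its version as the `VERSION` string, and the name `supports_predicates` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# VERSION is a dotted string such as "1.3.0"; compare major and minor only.
major, minor = (int(part) for part in ET.VERSION.split(".")[:2])

# Attribute predicates like [@name='1'] are the 1.3 feature discussed above.
supports_predicates = (major, minor) >= (1, 3)

print("ElementTree", ET.VERSION, "- predicates supported:", supports_predicates)
```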
There are several problems in this code.
- Python’s built-in ElementTree (ET for short) has no real XPath support, only a limited subset. For example, it doesn’t support find-from-root expressions like //target. Note that the documentation mentions “//”, but only for elements beneath the current one: an expression such as .//target is valid; //... is not! There is an alternative implementation, lxml, which is richer. It seems that its documentation was used for the built-in code, and the two do not match.
- The @name notation selects XML attributes, that is, the key=value pairs inside an XML tag. So the name value has to be 1 or 2 to select anything in the given document. Alternatively, one can search for targets with a child element ‘a’: target[a] (no @).
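The two points above can be checked with a short script. This is a sketch against the standard-library xml.etree.ElementTree in a recent Python, where an absolute path on an Element raises a SyntaxError; the variable names are illustrative, and the sample document is inlined so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<root>'
    '<target name="1"><a/><b/></target>'
    '<target name="2"><a/><b/></target>'
    '</root>'
)

# An absolute path on an Element is rejected outright.
try:
    root.findall("//target")
    absolute_path_error = None
except SyntaxError as exc:
    absolute_path_error = str(exc)

# The relative form searches all levels beneath the current element.
relative_matches = root.findall(".//target")

# No target has name="a", so the question's predicate matches nothing.
attribute_a = root.findall(".//target[@name='a']")

# Without the @, the predicate tests for a child element named a.
child_a = root.findall(".//target[a]")

print(absolute_path_error)
print(len(relative_matches), len(attribute_a), len(child_a))
```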
For the given document, parsed with the built-in ElementTree (v1.3) into root, the following code is correct and working:
root.findall(".//target")
Finds both targets.
root.findall(".//target/a")
Finds two a elements.
root.findall(".//target[a]")
Finds both target elements again, as both have an a element.
root.findall(".//target[@name='1']")
Finds only the first target. Notice that the quotes around 1 are needed; otherwise a SyntaxError is raised.
root.findall(".//target[a][@name='1']")
Also valid; finds that same target.
root.findall(".//target[@name='1']/a")
Finds only one a element; …
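As a rough end-to-end check, the expressions above can be run against the sample document in one go. The sketch below uses Python 3 syntax and parses from a string rather than a file; `SAMPLE`, `queries`, and `counts` are names chosen for this example only.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<root>
  <target name="1"><a></a><b></b></target>
  <target name="2"><a></a><b></b></target>
</root>"""

root = ET.fromstring(SAMPLE)

queries = [
    ".//target",
    ".//target/a",
    ".//target[a]",
    ".//target[@name='1']",
    ".//target[a][@name='1']",
    ".//target[@name='1']/a",
]

# Count how many elements each expression selects.
counts = {query: len(root.findall(query)) for query in queries}
for query in queries:
    print(query, "->", counts[query])

# Iterating an element yields its children; no private attribute is needed.
first = root.find(".//target[@name='1']")
children = [child.tag for child in first]
print(children)
```

Iterating over the element (or calling `list(target)`) is the public way to get at the children that the question reads through `_children`.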