Python XPath SyntaxError: invalid predicate


I am trying to parse an XML file like this:







and here is my code:

import xml.etree.ElementTree as ET

tree = ET.parse("../../xml/test.xml")

root = tree.getroot()

path = "./pages/page/paragraph[text()='GHF']"

print root.findall(path)

but I get an error:

print root.findall(path)
  File "", line 390, in findall
    return ElementPath.findall(self, path, namespaces)
  File "", line 293, in findall
    return list(iterfind(elem, path, namespaces))
  File "", line 263, in iterfind
    selector.append(ops[token[0]](next, token))
  File "", line 224, in prepare_predicate
    raise SyntaxError("invalid predicate")
SyntaxError: invalid predicate

What is wrong with my XPath?

Follow-up

Thanks falsetru, your solution worked. I have a follow-up question. Now I want to get all the paragraph elements that come before the paragraph with the text GHF. So in this case I only need the XBV element; I want to ignore ash and lplp. I guess one way to do this would be:

result = []
for para in root.findall('./pages/page/'):
    t = para.text.encode("utf-8", "ignore")
    if t == "GHF":
        break  # stop at the GHF paragraph
    result.append(para)  # keep the paragraphs that come before it

But is there a better way to do this?

Asked By: AbtPst



ElementTree’s XPath support is limited. Use another library such as lxml:

import lxml.etree
root = lxml.etree.parse('test.xml')

path = "./pages/page/paragraph[text()='GHF']"
print(root.xpath(path))
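If adding lxml is not an option, the standard library may be enough for this particular query. The following is a minimal sketch, assuming Python 3.7 or later (where ElementTree gained the [.='text'] predicate) and a hypothetical sample document standing in for test.xml, whose real layout may differ:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real test.xml may be laid out differently.
SAMPLE_XML = """
<root>
  <pages>
    <page>
      <paragraph>XBV</paragraph>
      <paragraph>GHF</paragraph>
    </page>
    <page>
      <paragraph>ash</paragraph>
      <paragraph>lplp</paragraph>
    </page>
  </pages>
</root>
"""

root = ET.fromstring(SAMPLE_XML)

# text() is not part of ElementTree's XPath subset, so this raises SyntaxError.
try:
    root.findall("./pages/page/paragraph[text()='GHF']")
    text_predicate_supported = True
except SyntaxError:
    text_predicate_supported = False

# [.='GHF'] compares the element's complete text content (Python 3.7+ only).
matches = root.findall("./pages/page/paragraph[.='GHF']")
print(text_predicate_supported, [p.text for p in matches])
```

On older Python versions the [.='text'] form is not available, so lxml remains the more general choice.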
Answered By: falsetru

As @falsetru mentioned, ElementTree doesn’t support the text() predicate, but it does support matching a child element by its text. So in this example it is possible to search for a page that has a paragraph with specific text, using the path ./pages/page[paragraph='GHF']. The problem here is that a page contains multiple paragraph tags, so one would still have to iterate over them to find the specific paragraph. In my case, I needed to find the version of a dependency in a Maven pom.xml, and there is only a single version child, so the following worked:

In [1]: import xml.etree.ElementTree as ET

In [2]: ns = {"pom": ""}

In [3]: print ET.parse("pom.xml").findall(".//pom:dependencies/pom:dependency[pom:artifactId='some-artifact-with-hardcoded-version']/pom:version", ns)[0].text
Out[1]: '1.2.3'
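Applied to the follow-up in the question, the same child-text predicate can narrow the search to the right page, leaving only a short loop over its paragraphs. This is a rough sketch rather than a definitive solution: the sample XML and the helper name paragraphs_before are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real file may be laid out differently.
SAMPLE_XML = """
<root>
  <pages>
    <page>
      <paragraph>XBV</paragraph>
      <paragraph>GHF</paragraph>
      <paragraph>ash</paragraph>
    </page>
    <page>
      <paragraph>lplp</paragraph>
    </page>
  </pages>
</root>
"""


def paragraphs_before(root, marker):
    """Collect paragraphs that precede the marker paragraph in each matching page."""
    result = []
    # [paragraph='...'] selects only pages with a paragraph child holding that text.
    for page in root.findall(f"./pages/page[paragraph='{marker}']"):
        for para in page.findall("paragraph"):
            if para.text == marker:
                break  # stop at the marker; later paragraphs are ignored
            result.append(para)
    return result


root = ET.fromstring(SAMPLE_XML)
print([p.text for p in paragraphs_before(root, "GHF")])  # prints ['XBV']
```

With lxml, the XPath preceding-sibling axis could probably express the same selection in a single expression, at the cost of the extra dependency.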
Answered By: haridsv