Python not able to save some CLOB xml into xml file

Question:

I have this xml saved as CLOB in my oracle database:

<?xml version="1.0" encoding="UTF-8"?>
<DCResponse>
...
</DCResponse>

and with this Python code I am able to save the content into an XML file:

sql = "select extract(xmltype.createxml(xml), '//DCResponse').getStringVal() from table t where id = 2"
for row in cursor.execute(sql):
    print(row[0])
with open("output.xml", "w") as f:
    f.write(row[0])

Instead with this xml

<?xml version="1.0" encoding="ISO-8859-1"?>
<PIPEDocument  xsi_schemaLocation="urn:XML-PIPE 
PIPEDocument.xsd" ReferenceNumber="567862650" CreationDate="20200115155255" Version="1.0"  
>
...
</PIPEDocument>

I’m not able to extract the content with Python. Running the code below gives the error "write() argument must be str, not None" in the Python console:

sql = "select extract(xmltype.createxml(xml), '//PIPEDocument').getStringVal() from table t where id = 7"
for row in cursor.execute(sql):
    print(row[0])
with open("output.xml", "w") as f:
    f.write(row[0])

In my oracle client the output of the below sql query, used in python, is null:

select extract(xmltype.createxml(xml), '//PIPEDocument').getStringVal() from table t where id = 7;

while the xml content is present in my DB:

select xml from table where id =7

I’m not sure what the issue is. Maybe it is the ‘//PIPEDocument’ expression in the select query, or the different encoding of the two XML files, but I have no idea how to fix this.

Please Help
Best Regards
Giancarlo

Answers:

The problem is that your second XML document uses namespaces. The element <PIPEDocument> is in a namespace (judging by the schemaLocation attribute in your question, urn:XML-PIPE), and your XPath expression '//PIPEDocument' only matches <PIPEDocument> elements that are not in any namespace.

If you want to use namespaces with the extract function you have to add:

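  • a namespace mapping of a prefix to the namespace URI, using the optional third argument to extract. This third argument is a string formatted in the same way as an XML namespace declaration, for example xmlns:p="...". A default namespace declaration (xmlns="...", with no prefix) won’t work here. Instead, use a prefix, such as p, or perhaps pipe.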

  • add the prefix to the XPath expression in the second argument to extract.
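The same matching rule can be reproduced locally, without a database, using Python's standard xml.etree.ElementTree module. This is only a sketch: the document below is a hypothetical stand-in for the stored CLOB, the <Body> child element is invented for illustration, and urn:XML-PIPE is an assumed namespace URI taken from the schemaLocation attribute in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the stored document; the namespace URI is an assumption
# and <Body> is an invented child element used only for the demonstration.
doc = '<PIPEDocument xmlns="urn:XML-PIPE" Version="1.0"><Body>payload</Body></PIPEDocument>'
root = ET.fromstring(doc)

# The parsed tag carries the namespace, so it is not simply "PIPEDocument".
print(root.tag)

# An unprefixed path only matches elements that are in no namespace: nothing is found.
unprefixed = root.findall(".//Body")

# Mapping a prefix to the namespace URI and using it in the path finds the element.
prefixed = root.findall(".//p:Body", {"p": "urn:XML-PIPE"})

print(len(unprefixed), len(prefixed))
```

The prefix name itself (p here) is arbitrary; only the URI it maps to has to match the one declared in the document.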

I made these changes to your code, choosing to use the namespace prefix p. I also wrapped the entire SQL string in triple-quotes to avoid me having to escape the quotes inside it. This gave me the following, which returned the desired XML output:

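# The namespace URI is inferred from the schemaLocation attribute in the question; adjust it if yours differs.
sql = """select extract(xmltype.createxml(xml), '//p:PIPEDocument', 'xmlns:p="urn:XML-PIPE"').getStringVal() from table t where id = 7"""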
for row in cursor.execute(sql):
    print(row[0])
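
The snippet above only prints the result. For writing it to a file, a small guard avoids the "write() argument must be str, not None" error whenever the query returns NULL. The helper below is a minimal sketch, not part of cx_Oracle: save_first_xml is a hypothetical name, and it assumes the first column already arrives as a Python str (as it does with getStringVal()).

```python
from pathlib import Path


def save_first_xml(rows, path):
    """Write the first column of the first row to `path`.

    `rows` is any iterable of row tuples, such as the result of cursor.execute(sql).
    Returns True if a file was written, False if there was no row or the value was NULL.
    """
    for row in rows:
        value = row[0]
        if value is None:
            # Oracle returns NULL when the XPath expression matches nothing.
            return False
        Path(path).write_text(value, encoding="utf-8")
        return True
    # The query returned no rows at all.
    return False
```

With a real connection this might be called as save_first_xml(cursor.execute(sql), "output.xml"), checking the return value to report a query that matched nothing.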
Answered By: Luke Woodward

For query performance, you might want to fetch the LOB as a String, see the doc.

For general reference, if the database type had been XMLType, cx_Oracle’s recommendation is to use xmltype.getclobval() to avoid limited XML lengths.

Update: The latest version of cx_Oracle, now called python-oracledb, has a Thin mode that doesn’t have the XML query length limitation.

Answered By: Christopher Jones