How to print the line number of an element in one XML file if it matches a format given in another XML file

Question:

My XML1 defines the accepted formats of the Equal element:

<Operator>
    <Equal>
        <Data>OPERATING_MODE</Data>
        <Data>2</Data>
    </Equal>
    <Equal>
        <Integer>Remote_Request</Integer>
        <Data>2</Data>
    </Equal>
    <Equal>
        <Integer>Real_area</Integer>
        <Integer>2</Integer>
    </Equal>
</Operator>

The following code works for finding elements only in System1 (for a small XML) and prints the label:

from lxml import etree

doc = etree.parse('C:/Python/Sample.xml')
doc2 = etree.parse('C:/Python/Project.xml')
values = [e.xpath('.//*[2]')[0].text for e in doc.xpath('.//Equal')]

for service in doc2.xpath('.//System1//Data[Label]'):
    value = service.xpath('.//Equal//*[2]/text()')[0]
    if value in values:
        #get the value in the corresponding Label
        print(service.xpath('.//Label/text()')[0])

But my XML has over 100,000 lines and is fairly complex, with many nested elements. A small example of XML2 follows:

<File>
    <System1>
        <Messages>
            <Setting_Report>
                <Data>
                    <Label>A1</Label>
                    <Bit_count>1</Bit_count>
                    <Value>
                        <Equal>
                            <Data>Data</Data>
                            <Data>2</Data>
                        </Equal>
                    </Value>
                </Data>
                <Data>
                    <Label>A2</Label>
                    <Bit_count>1</Bit_count>
                    <Value>
                        <Value>
                            <Equal>
                                <Data>Data</Data>
                                <Data>2</Data>
                            </Equal>
                        </Value>
                    </Value>
                </Data>
                <Data>
                    <Label>A3</Label>
                    <Bit_count>1</Bit_count>
                    <Value>
                        <Data>Data</Data>
                    </Value>
                </Data>
                <Data>
                    <Label>A4</Label>
                    <Bit_count>1</Bit_count>
                </Data>
                <Data>
                    <Label>A5</Label>
                    <Bit_count>1</Bit_count>
                </Data>
                <Data>
                    <Label>A35</Label>
                    <Bit_count>1</Bit_count>
                    <Value>
                        <Value>
                            <Equal>
                                <Data>Data</Data>
                                <Data>2</Data>
                            </Equal>
                        </Value>
                    </Value>
                </Data>
            </Setting_Report>
            <Status_Report>
                <Data>
                    <Label>Real_area_1</Label>
                    <Bit_count>8</Bit_count>
                    <Value>
                        <Equal>
                            <Data>Yes</Data>
                            <Data>2</Data>
                        </Equal>
                    </Value>
                </Data>
                <Data>
                    <Label>Real_area_2</Label>
                    <Bit_count>8</Bit_count>
                    <Value>
                        <Value_on_condition>
                            <Case>
                                <Value>
                                    <Integer>1</Integer>
                                </Value>
                                <Condition>
                                    <Equal>
                                        <Integer>No</Integer>
                                        <Integer>2</Integer>
                                    </Equal>
                                </Condition>
                            </Case>
                            <Case>
                                <Value>
                                    <Data>Order</Data>
                                </Value>
                                <Condition/>
                            </Case>
                        </Value_on_condition>
                    </Value>
                </Data>
            </Status_Report>
        </Messages>
    </System1>
    <System2>
        <Basic>
            <Data>
                <!--A1-->
                <Label>Area_1</Label>
                <Direction>Out</Direction>
                <Bit_count>1</Bit_count>
                <Value>
                    <Default_value_if_undefined>
                        <Value>
                            <!---->
                            <Value_on_boolean>
                                <And>
                                    <Equal>
                                        <Data>area_id</Data>
                                        <Data>2</Data>
                                    </Equal>
                                    <Data>Redundant</Data>
                                </And>
                                <Value_if_true>
                                    <Integer>1</Integer>
                                </Value_if_true>
                                <Value_if_false>
                                    <Integer>0</Integer>
                                </Value_if_false>
                            </Value_on_boolean>
                        </Value>
                        <Default_value>
                            <Integer>0</Integer>
                        </Default_value>
                    </Default_value_if_undefined>
                </Value>
            </Data>
        </Basic>
    </System2>
</File>

But running the above code on it prints only the first two labels and then throws the following error:

A1
A2
Traceback (most recent call last):
  File "C:SyntaxSyntax.py", line 11, in <module>
    value = service.xpath('.//Equal//*[2]/text()')[0]
IndexError: list index out of range

In any case, I only want to print the line number of each 'Equal' element in XML2 whose format matches one given in XML1 (its child elements are two Data, two Integer, or an Integer followed by a Data).

That way I can find every place in XML2 where the format of 'Equal' is the same.
For the XML2 here, I want to print the following lines because the formats match: 10, 21, 48, 61, 77, 108, 137, 171

Always grateful for your help.

PS: Sorry for the long XML, but otherwise it would have been difficult to explain my question.

Asked By: Anonymous

Answers:

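The XPath in the for loop can be changed to select only Data elements that have a Label child and a descendant Equal containing a second child element. Data elements without such an Equal are then skipped, which avoids the IndexError:
//System1//Data[Label and .//Equal//*[2]]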

for service in doc2.xpath('//System1//Data[Label and .//Equal//*[2]]'):
    value = service.xpath('.//Equal//*[2]')
    if value[0].text in values:
        # print Label and Equal's second child line number
        print(service.xpath('.//Label/text()')[0], f"lineno: {value[0].sourceline}")

Result

A1 lineno: 11
A2 lineno: 22
A35 lineno: 49
Real_area_1 lineno: 62
Real_area_2 lineno: 78
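
To illustrate why the original loop stops after A2, here is a minimal sketch, assuming lxml is installed. The inline XML is a shortened stand-in for XML2, and the variable names (loose, strict, empty_hits) are only illustrative.

```python
from lxml import etree

# Shortened stand-in for XML2: A1 contains an Equal, A3 does not.
xml = b"""<File>
  <System1>
    <Data>
      <Label>A1</Label>
      <Value>
        <Equal>
          <Data>Data</Data>
          <Data>2</Data>
        </Equal>
      </Value>
    </Data>
    <Data>
      <Label>A3</Label>
      <Value>
        <Data>Data</Data>
      </Value>
    </Data>
  </System1>
</File>"""
root = etree.fromstring(xml)

# Data[Label] also matches A3. Its Equal lookup returns an empty list,
# so indexing that list with [0] is what raises IndexError.
loose = root.xpath('.//System1//Data[Label]')
empty_hits = [d.xpath('Label/text()')[0] for d in loose
              if not d.xpath('.//Equal//*[2]')]

# Moving the Equal condition into the predicate skips such elements up front.
strict = root.xpath('.//System1//Data[Label and .//Equal//*[2]]')
strict_labels = [d.xpath('Label/text()')[0] for d in strict]

print(empty_hits)     # labels whose Equal lookup is empty
print(strict_labels)  # labels kept by the stricter predicate
```

With this input, empty_hits contains only A3 and strict_labels contains only A1.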

This variant also works:

>>> for service in doc2.xpath('.//System1//Equal[ancestor::Data[Label]]/*[2]'):
...     if service.text in values:
...         print(service.xpath('.//ancestor::Data/Label/text()'), f"lineno: {service.sourceline}")
... 
['A1'] lineno: 11
['A2'] lineno: 22
['A35'] lineno: 49
['Real_area_1'] lineno: 62
['Real_area_2'] lineno: 78
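
Both snippets above compare the text of the second child. If the goal is instead to match on the format of Equal (the tag names of its children, as described in the question), one possible approach is to compare tag sequences. This is a sketch under assumptions, not a tested solution for the full file: the helper name signature and the shortened inline XML are hypothetical, and lxml is assumed to be installed.

```python
from lxml import etree

# Shortened stand-ins for XML1 (the allowed formats) and XML2 (the file to scan).
formats_xml = b"""<Operator>
    <Equal>
        <Data>OPERATING_MODE</Data>
        <Data>2</Data>
    </Equal>
    <Equal>
        <Integer>Remote_Request</Integer>
        <Data>2</Data>
    </Equal>
    <Equal>
        <Integer>Real_area</Integer>
        <Integer>2</Integer>
    </Equal>
</Operator>"""

project_xml = b"""<File>
    <System1>
        <Equal>
            <Data>Data</Data>
            <Data>2</Data>
        </Equal>
        <Equal>
            <Data>Data</Data>
            <Integer>2</Integer>
        </Equal>
        <Equal>
            <Integer>No</Integer>
            <Integer>2</Integer>
        </Equal>
    </System1>
</File>"""


def signature(equal):
    """Return the tag names of an element's direct children, in order."""
    # Comments and processing instructions have a non-string tag; skip them.
    return tuple(child.tag for child in equal if isinstance(child.tag, str))


# Every child-tag sequence that appears under an Equal in the formats file.
allowed = {signature(e) for e in etree.fromstring(formats_xml).iter('Equal')}

# Line numbers of the Equal elements in the scanned file with an allowed format.
root = etree.fromstring(project_xml)
matches = [e.sourceline for e in root.iter('Equal') if signature(e) in allowed]
print(matches)
```

For the real files, etree.parse('C:/Python/Sample.xml') and etree.parse('C:/Python/Project.xml') from the question can replace the two fromstring calls. Here the second Equal (Data followed by Integer) is not reported because that order does not occur in the formats file.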
Answered By: LMC