(Python) AttributeError: 'NoneType' object has no attribute 'text'

Question:

I’m getting the error below when parsing the XML from the URL in my code. I won’t post the XML because it’s huge; the link is in the code below.

ERROR:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-70-77e5e1b79ccc> in <module>()
     11 
     12 for child in root.iter('Materia'):
---> 13     if not child.find('EmentaMateria').text is None:
     14             ementa = child.find('EmentaMateria').text
     15 

AttributeError: 'NoneType' object has no attribute 'text'

MY CODE:

url = 'http://legis.senado.leg.br/dadosabertos/senador/4988/autorias'
import requests
from xml.etree import ElementTree

response = requests.get(url, stream=True)
response.raw.decode_content = True

tree = ElementTree.parse(response.raw)

root = tree.getroot()

for child in root.iter('Materia'):
    if child.find('EmentaMateria').text is not None:
            ementa = child.find('EmentaMateria').text

    for child_IdMateria in child.findall('IdentificacaoMateria'):
        anoMateria = child_IdMateria.find('AnoMateria').text
        materia = child_IdMateria.find('NumeroMateria').text
        siglaMateria = child_IdMateria.find('SiglaSubtipoMateria').text



    print('Ano = '+anoMateria+' | Numero Materia = '+materia+' | tipo = '+siglaMateria+' | '+ementa)

What am I overlooking here?
Thanks

Asked By: grc

||

Answers:

Instead of checking if child.find('EmentaMateria').text is not None, you should make sure that child.find('EmentaMateria') is not None first.
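
To see why the check itself fails, here is a minimal sketch using a made-up one-element document (the sample XML is an assumption, not the real feed). find() returns None when no matching child exists, so accessing .text on that result raises the error before the "is not None" comparison is ever evaluated:

```python
from xml.etree import ElementTree

# Hypothetical sample: a Materia element with no EmentaMateria child.
materia = ElementTree.fromstring('<Materia><Outro>x</Outro></Materia>')

# find() returns None when no matching child exists.
missing = materia.find('EmentaMateria')

# Accessing .text on None is what raises the AttributeError.
try:
    missing.text
    message = None
except AttributeError as exc:
    message = str(exc)

print(missing)
print(message)
```

Comparing against None explicitly (rather than writing "if node:") is the safer habit here, since an Element with no children has historically evaluated as false.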

Also, you should store the return value of child.find('EmentaMateria') to avoid calling it twice.

Lastly, you should assign ementa a default value when child.find('EmentaMateria') is None; otherwise the print call below will either raise a NameError (if no earlier Materia set ementa) or silently reuse the value left over from a previous iteration.

Change:

if child.find('EmentaMateria').text is not None:
    ementa = child.find('EmentaMateria').text

to:

node = child.find('EmentaMateria')
if node is not None:
    ementa = node.text
else:
    ementa = None

Alternatively, you can use the built-in function getattr to do the same without a temporary variable:

ementa = getattr(child.find('EmentaMateria'), 'text', None)
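
Putting it together, a rough sketch of the whole loop might look like the following. It swaps in Element.findtext with a default, which is a different technique from the getattr call above but has the same effect for a missing child. The inline SAMPLE document and the summarize helper are assumptions made up for illustration; the real feed is larger and may be shaped differently. An empty string is used as the default because concatenating a str with None in the print call would raise a TypeError.

```python
from xml.etree import ElementTree

# Hypothetical sample that only imitates the tag names used in the question.
SAMPLE = """
<Root>
  <Materia>
    <IdentificacaoMateria>
      <AnoMateria>2015</AnoMateria>
      <NumeroMateria>00123</NumeroMateria>
      <SiglaSubtipoMateria>PLS</SiglaSubtipoMateria>
    </IdentificacaoMateria>
    <EmentaMateria>Some summary text.</EmentaMateria>
  </Materia>
  <Materia>
    <IdentificacaoMateria>
      <AnoMateria>2016</AnoMateria>
      <NumeroMateria>00456</NumeroMateria>
      <SiglaSubtipoMateria>PEC</SiglaSubtipoMateria>
    </IdentificacaoMateria>
  </Materia>
</Root>
""".strip()


def summarize(root):
    """Return one formatted line per Materia, tolerating missing children."""
    lines = []
    for materia in root.iter('Materia'):
        # findtext returns the default when the child element is absent.
        ementa = materia.findtext('EmentaMateria', default='')
        ident = materia.find('IdentificacaoMateria')
        if ident is None:
            # Skip entries without identification data.
            continue
        ano = ident.findtext('AnoMateria', default='')
        numero = ident.findtext('NumeroMateria', default='')
        sigla = ident.findtext('SiglaSubtipoMateria', default='')
        lines.append(
            f'Ano = {ano} | Numero Materia = {numero} | tipo = {sigla} | {ementa}'
        )
    return lines


lines = summarize(ElementTree.fromstring(SAMPLE))
for line in lines:
    print(line)
```

The second Materia in the sample has no EmentaMateria, so its line simply ends with an empty summary instead of raising.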

Answered By: blhsing

If you are using the code to parse an XML file, open the file in a text editor and inspect the tags. In my case there were some rogue tags at the end. Once I removed those, the code worked as expected.
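
As an alternative to inspecting the file by eye, a small sketch like this can point at the first place the parser gives up. The check_well_formed helper and the two sample strings are made up for illustration; ParseError and its position attribute come from the standard library's xml.etree.ElementTree.

```python
from xml.etree import ElementTree


def check_well_formed(xml_text):
    """Return None if xml_text parses, else a short description of the failure."""
    try:
        ElementTree.fromstring(xml_text)
    except ElementTree.ParseError as exc:
        # position is a (line, column) tuple reported by the parser.
        line, column = exc.position
        return f'not well-formed at line {line}, column {column}: {exc}'
    return None


good = '<a><b>ok</b></a>'
bad = '<a><b>ok</b></a></c>'  # hypothetical stray closing tag after the root

print(check_well_formed(good))
print(check_well_formed(bad))
```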

Answered By: creisdorf