Python error "not well-formed (invalid token)"

Question:

I have some software that outputs an XML file, which I am trying to read with Python so I can get the results and add them to my database.

import xml.etree.ElementTree as etree
with open('E:/uk_bets_history.xml', 'r') as xml_file:
    xml_tree = etree.parse(xml_file)

I am getting the error “xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 1”, but I am unsure why the file is not formatted correctly. I am not in control of how the file is created, as this is done by some other software I own.

The example xml is here: http://jarrattperkins.com/uk_bets_history

Asked By: Jarratt Perkins

||

Answers:

The file you’ve provided as an example is encoded as UTF-8 with a BOM, so you need to pass the encoding argument to open():

open("FILE_PATH", encoding="utf-8-sig")
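
As a minimal sketch of why this helps: the sample XML, the temporary file, and the choice of cp1252 as the "wrong" encoding are all assumptions made for illustration (cp1252 is a common default on Windows, but the asker's actual default is not stated). Decoding the BOM bytes with a legacy code page leaves stray characters in front of the XML declaration, while "utf-8-sig" drops the BOM during decoding.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample data: a tiny XML document saved as UTF-8 with a BOM.
xml_bytes = b"\xef\xbb\xbf" + b'<?xml version="1.0" encoding="utf-8"?><bets><bet id="1"/></bets>'

with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(xml_bytes)
    path = tmp.name

try:
    # Decoding with a legacy code page (assumed here: cp1252) turns the BOM
    # into stray characters ahead of the XML declaration, so parsing fails.
    try:
        with open(path, "r", encoding="cp1252") as xml_file:
            ET.parse(xml_file)
        failed_with_cp1252 = False
    except ET.ParseError:
        failed_with_cp1252 = True

    # "utf-8-sig" decodes UTF-8 and drops the leading BOM.
    with open(path, "r", encoding="utf-8-sig") as xml_file:
        root = ET.parse(xml_file).getroot()
finally:
    os.remove(path)

print(failed_with_cp1252, root.tag)
```
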
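Answered By: Olvin Roght

import xml.etree.ElementTree as ET
# 'utf-8-sig' strips the leading BOM while decoding; file_path is the path to your XML file
with open(file=file_path, mode='r', encoding='utf-8-sig') as xml_txt:
    root = ET.fromstring(xml_txt.read().encode('utf-8'), ET.XMLParser(encoding='utf-8'))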

This works for me. Thanks Jules_96
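
For illustration only, here is a sketch of what the decode and re-encode round trip in the snippet above amounts to, next to a variant that hands the raw bytes to the parser. The sample bytes are hypothetical, and the second variant is an assumption on my part rather than something suggested in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample bytes standing in for the file contents (UTF-8 with a BOM).
raw = b"\xef\xbb\xbf" + b'<?xml version="1.0" encoding="utf-8"?><bets><bet id="1"/></bets>'

# What the snippet above does: decode with utf-8-sig (drops the BOM),
# then re-encode as plain UTF-8 before parsing.
roundtrip = raw.decode("utf-8-sig").encode("utf-8")
root_a = ET.fromstring(roundtrip, ET.XMLParser(encoding="utf-8"))

# Variant: pass the original bytes directly and let the parser handle the BOM.
root_b = ET.fromstring(raw)

print(root_a.tag, root_b.tag)
```

Either way, the parser never receives the BOM as text that has already been decoded with the wrong encoding.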

Answered By: Onur E.A