[bug] XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1 #2419

fhjgch · 2024-10-15T15:30:53Z

Bug report checklist

Searched the issues page for similar reports
Read the relevant sections of the documentation
Browsed the tutorials and tests for useful code snippets and examples of use
Reproduced the issue after updating with pip install --upgrade pandapower (or git pull)
Tried basic troubleshooting (if a bug/error) like restarting the interpreter and checking the pythonpath

Reproducible Example

See `cim2pp` notebook in tutorials:

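import os
from pandapower.converter.cim import cim2pp  # import path assumed from the tutorial notebook

# folder_path points to the directory where the CIM .zip files are stored: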
folder_path = os.path.join(os.getcwd(), 'example_cim')

# cgmes_files is a list containing paths to both files needed for the CIM converter:
cgmes_files = [os.path.join(folder_path, 'CGMES_v2.4.15_SmallGridTestConfiguration_Boundary_v3.0.0.zip'),
               os.path.join(folder_path, 'CGMES_v2.4.15_SmallGridTestConfiguration_BaseCase_Complete_v3.0.0.zip')]

for f in cgmes_files:
    if not os.path.exists(f):
        raise UserWarning(f"Wrong path specified for the CGMES file {f}")

net = cim2pp.from_cim(file_list=cgmes_files, use_GL_or_DL_profile='DL')

print('Conversion successful')

Issue Description and Traceback

Reading the XML files fails with the following error: XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1.
This only happens to some of the files stored in example_cim, not all of them.

The file encoding seems to be the problem: some files are encoded as plain 'utf-8', while others are 'utf-8-bom' (UTF-8 with a byte order mark). After converting all files to plain 'utf-8', the import succeeded.
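
To check which files are affected without unpacking the archives by hand, something like the following could be used. It is only a sketch: `find_bom_members` is a hypothetical helper, not part of pandapower, and it only looks for the three-byte UTF-8 BOM at the start of each archive member.

```python
import codecs
import zipfile


def find_bom_members(zip_source):
    """Return the names of archive members that start with a UTF-8 BOM.

    zip_source can be a path or a binary file-like object.
    """
    flagged = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            with archive.open(name) as member:
                # The UTF-8 BOM is the byte sequence EF BB BF.
                if member.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8:
                    flagged.append(name)
    return flagged
```

Calling `find_bom_members(f)` for each entry of `cgmes_files` should list the XML files that would need re-encoding.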

Expected Behavior

Message: "Conversion successful"

Installed Versions

INSTALLED VERSIONS

commit : 0691c5cf90477d3503834d983f69350f250a6ff7
python : 3.11.8
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 186 Stepping 3, GenuineIntel
byteorder : little
LOCALE : English_United Kingdom.1252

pandas : 2.2.3
numpy : 2.0.2
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 23.2.1
IPython : 8.28.0
bs4 : 4.12.3
jinja2 : 3.1.4
lxml.etree : 5.3.0
matplotlib : 3.9.2
numba : 0.60.0
scipy : 1.13.1
tzdata : 2024.2

Label

Relevant labels are selected


KS-HTK · 2024-10-31T11:46:51Z

@heckstrahler @mrifraunhofer I had the same issue.

KS-HTK · 2024-11-19T10:59:30Z

The issue seems to be the encoding passed to the XMLParser object. According to help(etree.XMLParser), this should be a libiconv encoding name, which suggests that 'UTF-8' is a valid name. But if I omit the encoding keyword, the error no longer occurs.

What is the reason for overriding the encoding?

Relevant code section: cim_classes.py, line 488

from lxml import etree

# Leads to error
parser = etree.XMLParser(encoding='UTF-8', resolve_entities=False)
xml_tree = etree.parse(file, parser)

# No error
parser = etree.XMLParser(encoding=None, resolve_entities=False)
xml_tree = etree.parse(file, parser)

print(xml_tree.docinfo.encoding)
# prints: 'UTF-8'
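
If the explicit encoding has to stay for some reason, one possible workaround is to strip a leading BOM before the bytes reach the parser. This is a sketch only: `strip_utf8_bom` and `parse_xml_bytes` are hypothetical helpers, not pandapower code, and it rests on the assumption that the BOM at the start of the file is what triggers the error when the encoding is overridden.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def parse_xml_bytes(data: bytes):
    """Parse XML bytes with the same parser settings as above; returns the root element."""
    # lxml is imported here so that strip_utf8_bom stays usable without it.
    from lxml import etree

    parser = etree.XMLParser(encoding='UTF-8', resolve_entities=False)
    return etree.fromstring(strip_utf8_bom(data), parser)
```

Dropping the encoding argument, as in the second snippet above, is still the smaller change; this variant is only relevant if the override is there on purpose.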

fhjgch added the bug label Oct 15, 2024
