Bug: publishing run using documentation example throws Arff error #1366

Open
josvandervelde opened this issue Oct 15, 2024 · 2 comments

josvandervelde commented Oct 15, 2024

Description

Running the example code found at https://openml.github.io/openml-python/main/ throws an error.

Steps/Code to Reproduce

import openml
from sklearn import impute, tree, pipeline

clf = pipeline.Pipeline(
    steps=[
        ('imputer', impute.SimpleImputer()),
        ('estimator', tree.DecisionTreeClassifier())
    ]
)
task = openml.tasks.get_task(32)
run = openml.runs.run_model_on_task(clf, task)
run.publish()

I'm running this using https://github.com/openml/services/.

Expected Results

No error is thrown, run is uploaded.

Actual Results

openml.exceptions.OpenMLServerException: http://nginx:80/api/v1/xml/run/ returned code 209: Error parsing uploaded file. - Arff error in predictions file: invalid value for nominal attribute: 1 (l.19) 

If you look at the predictions.arff, it's indeed wrong:

[...]
@ATTRIBUTE repeat NUMERIC
@ATTRIBUTE fold NUMERIC
@ATTRIBUTE sample NUMERIC
@ATTRIBUTE row_id NUMERIC
@ATTRIBUTE prediction {tested_negative, tested_positive}
@ATTRIBUTE correct {tested_negative, tested_positive}
@ATTRIBUTE confidence.tested_negative NUMERIC
@ATTRIBUTE confidence.tested_positive NUMERIC

@DATA
0,0,0,53,1,1,0,0
[...]
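
To make the mismatch concrete, here is a minimal sketch that checks the data row against the nominal attributes declared in the header. It is plain Python and uses neither openml nor an ARFF library. The helper names `parse_attributes` and `find_invalid_values` are hypothetical, and the parser only handles the simple `@ATTRIBUTE name {a, b}` form shown in the excerpt.

```python
import re

# Header and data row copied from the predictions.arff excerpt above.
HEADER = """\
@ATTRIBUTE repeat NUMERIC
@ATTRIBUTE fold NUMERIC
@ATTRIBUTE sample NUMERIC
@ATTRIBUTE row_id NUMERIC
@ATTRIBUTE prediction {tested_negative, tested_positive}
@ATTRIBUTE correct {tested_negative, tested_positive}
@ATTRIBUTE confidence.tested_negative NUMERIC
@ATTRIBUTE confidence.tested_positive NUMERIC
"""
DATA_ROW = "0,0,0,53,1,1,0,0"


def parse_attributes(header):
    """Return (name, allowed_values) pairs; allowed_values is None if not nominal."""
    attributes = []
    for line in header.splitlines():
        match = re.match(r"@ATTRIBUTE\s+(\S+)\s+(.+)", line.strip(), re.IGNORECASE)
        if not match:
            continue
        name, type_spec = match.group(1), match.group(2).strip()
        if type_spec.startswith("{") and type_spec.endswith("}"):
            allowed = [value.strip() for value in type_spec[1:-1].split(",")]
        else:
            allowed = None
        attributes.append((name, allowed))
    return attributes


def find_invalid_values(attributes, row):
    """Return (attribute_name, value) pairs where a nominal value is not declared."""
    values = [value.strip() for value in row.split(",")]
    return [
        (name, value)
        for (name, allowed), value in zip(attributes, values)
        if allowed is not None and value not in allowed
    ]


attributes = parse_attributes(HEADER)
problems = find_invalid_values(attributes, DATA_ROW)
print(problems)  # [('prediction', '1'), ('correct', '1')]
```

Both nominal columns fail the check, which matches the server's "invalid value for nominal attribute: 1" message.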

Versions

Linux-6.8.0-45-generic-x86_64-with-glibc2.36
Python 3.10.15 (main, Sep 27 2024, 06:07:24) [GCC 12.2.0]
NumPy 2.1.1
SciPy 1.14.1
Scikit-Learn 1.5.2
OpenML 0.15.0
@LennartPurucker
Contributor

I cannot reproduce this issue. It might have been caused by the last update to xmltodict, which has since been reverted (see #1125).

Can you try again? If it still does not work, ensure that xmltodict != 0.14.1
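
One way to check the installed version, as a sketch using only the standard library (`importlib.metadata`, available since Python 3.8). The helper name `is_suspect_xmltodict` is hypothetical; 0.14.1 is the version named above.

```python
from importlib import metadata

SUSPECT_VERSION = "0.14.1"  # the release named in the comment above


def is_suspect_xmltodict(version):
    """Return True if the given version string is the suspect release."""
    return version.strip() == SUSPECT_VERSION


try:
    installed = metadata.version("xmltodict")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("xmltodict is not installed")
elif is_suspect_xmltodict(installed):
    print(f"xmltodict {installed} is installed; try a different version")
else:
    print(f"xmltodict {installed} is installed")
```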

@PGijsbers
Collaborator

I am not sure what caused it, but it shouldn't have anything to do with xmltodict: the problem is a mismatch between data and header in the ARFF file (the data has encoded labels, while the header has the original string labels).
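
As an illustration of that mismatch, here is a small sketch of what a consistent data row would look like. It assumes the integers in the data section are indices into the class list from the header, so 1 would stand for tested_positive. That ordering is an assumption and is not confirmed in this thread. `decode_labels` is a hypothetical helper, not part of openml-python.

```python
# Class list as declared in the @ATTRIBUTE lines of the reported header.
CLASS_LABELS = ["tested_negative", "tested_positive"]


def decode_labels(encoded, class_labels):
    """Map integer-encoded labels back to their string labels.

    Assumes each value is an index into class_labels (an assumption,
    not confirmed behaviour of openml-python).
    """
    decoded = []
    for value in encoded:
        index = int(value)
        if not 0 <= index < len(class_labels):
            raise ValueError(f"no class label for encoded value {value!r}")
        decoded.append(class_labels[index])
    return decoded


# Data row from the reported predictions.arff; columns 4 and 5 are
# "prediction" and "correct".
row = "0,0,0,53,1,1,0,0".split(",")
row[4:6] = decode_labels(row[4:6], CLASS_LABELS)
print(",".join(row))  # 0,0,0,53,tested_positive,tested_positive,0,0
```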
