Problem with datasets with categorical attributes #1228
Comments
Could you please provide us with the output of
so we know the versions of scikit-learn and OpenML-Python?
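One way to gather that information is sketched below. This is not necessarily the exact snippet the maintainers asked for; the helper name `collect_versions` is hypothetical, and the distribution names `scikit-learn` and `openml` are assumed to be the ones installed via pip.

```python
from importlib import metadata


def collect_versions(packages=("scikit-learn", "openml")):
    """Return a mapping of distribution name to installed version string."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of raising, so the output is always complete.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

Pasting the printed lines into the issue gives the maintainers both versions at once.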
Sidenote: I noticed that task 32 is not actually credit-g (opened as separate issue #1229).
Here is the version info:
This is indeed the error.
I know that one-hot encoding is preferred, but in the past it also worked without it. The error also occurs, for example, when imputation is applied first and one-hot encoding afterwards.
There are also examples which work with categorical data, e.g., this pipeline from the docs. Is it possible you mixed them up? As far as I am aware,
Example for running a pipeline on kr-vs-kp:

```python
import openml
from sklearn import pipeline, compose, preprocessing, impute, ensemble, tree

# OpenML helper functions for sklearn can be plugged in directly for complicated pipelines
from openml.extensions.sklearn import cat, cont

openml.config.start_using_configuration_for_example()
task = openml.tasks.get_task(7)

pipe = pipeline.Pipeline(
    steps=[
        (
            "Preprocessing",
            compose.ColumnTransformer(
                [
                    (
                        "categorical",
                        # On scikit-learn >= 1.2 the `sparse` argument is named `sparse_output`
                        preprocessing.OneHotEncoder(sparse=False, handle_unknown="ignore"),
                        cat,  # returns the categorical feature indices
                    ),
                    (
                        "continuous",
                        impute.SimpleImputer(strategy="median"),
                        cont,  # returns the numeric feature indices
                    ),
                ]
            ),
        ),
        ("Classifier", tree.DecisionTreeClassifier()),
    ]
)

run = openml.runs.run_model_on_task(pipe, task, avoid_duplicate_runs=False)
```
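To make the categorical branch of that pipeline more concrete, here is a dependency-free sketch of what one-hot encoding with `handle_unknown="ignore"` does to a single column: categories are learned from the training values, and a value not seen during training is encoded as an all-zero row. This is not scikit-learn's implementation; the function names and the color values are made up for illustration.

```python
def fit_categories(column):
    """Learn the sorted list of distinct categories from training values."""
    return sorted(set(column))


def one_hot_ignore_unknown(column, categories):
    """Encode each value as a 0/1 row; values outside `categories` become all zeros."""
    return [
        [1 if value == category else 0 for category in categories]
        for value in column
    ]


train_column = ["red", "blue", "red"]
test_column = ["blue", "green"]  # "green" never appeared during fitting

categories = fit_categories(train_column)
encoded_test = one_hot_ignore_unknown(test_column, categories)

print(categories)    # ['blue', 'red']
print(encoded_test)  # [[1, 0], [0, 0]]
```

After this step every column is numeric, which is what the decision tree at the end of the pipeline expects as input.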
I think @PGijsbers's statement and code are a potential solution to this issue.
The following code, taken from the examples, crashes when applied to datasets with categorical attributes:
@mfeurer @prabhant @PGijsbers