Handle missing in one hot encoder #400

Open
PaulWestenthanner opened this issue Mar 12, 2023 · 3 comments

@PaulWestenthanner
Collaborator

Expected Behavior

Currently, handle_missing="value" adds a new column, although the documentation says that 'value' will encode a new value as 0 in every dummy column.
We also need a test for this.

Actual Behavior

The encoder adds an extra column for the missing value instead of encoding it as 0 in every dummy column.

Steps to Reproduce the Problem

from category_encoders import OneHotEncoder
import pandas as pd

he = OneHotEncoder(handle_missing="value")

data = [("foo", 1), ("bar", 2), (None, 6)]
data = pd.DataFrame(data, columns=["c1", "c2"])
print(he.fit_transform(data))
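
For reference, the encoding that the documentation describes for column c1 of the data above can be sketched in plain Python. This is only an illustration of the expected result, not the library's implementation: the helper one_hot_missing_as_zeros and the column names it produces are hypothetical, and the sketch treats only None as missing.

```python
def one_hot_missing_as_zeros(values, prefix="c1"):
    """Hypothetical reference encoding, not part of category_encoders.

    Creates one dummy column per observed category and encodes a
    missing value (None) as 0 in every dummy column.
    """
    # Collect the observed categories in order of first appearance.
    categories = []
    for value in values:
        if value is not None and value not in categories:
            categories.append(value)

    # Illustrative column names; the library's own naming may differ.
    columns = [f"{prefix}_{category}" for category in categories]

    # A missing value matches no category, so its row is all zeros.
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return columns, rows


columns, rows = one_hot_missing_as_zeros(["foo", "bar", None])
print(columns)
for row in rows:
    print(row)
```

A test for the fix could compare the encoder's dummy columns against rows like these, in particular checking that no extra column appears for the missing value.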

Specifications

  • Version: 2.6
  • Platform: linux
@bmreiniger
Contributor

Would this replace the new "ignore" from #396?

I would expect this to be the correct behavior; is the added column a longstanding behavior, or perhaps a regression that wasn't caught in testing?

@PaulWestenthanner
Collaborator Author

PaulWestenthanner commented Mar 24, 2023

Oh, you're right. I missed this when adding the ignore option. Thanks for pointing it out.
I'm not sure about the naming, though. Most encoders have the option value, meaning "put in some value that makes sense", so that name makes sense to people familiar with the library. On the other hand, ignore is more telling.

@lazarust

@PaulWestenthanner I can take this if no one else has!
