Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

edit _transform_count_encode #261

Merged
merged 2 commits into from
Oct 6, 2021

Conversation

pyrrhull
Copy link
Contributor

@pyrrhull pyrrhull commented Jul 20, 2020

moved X.fillna... inside for loop so that it handles only columns
present in self.cols individually and not the whole dataframe
addition to #260

moved X.fillna... inside for loop so that it handles only columns
present in self.cols individually and not the whole dataframe
@pyrrhull pyrrhull mentioned this pull request Jul 20, 2020
@janmotl
Copy link
Collaborator

janmotl commented Jul 22, 2020

Looks good to me. Consider adding a unit test to make sure that the issue does not return (personally, I would have preffered a parametrized test in test_encoders to cover all the encoders).

@pyrrhull
Copy link
Contributor Author

I am considering the tests but don't know how to proceed. What should the tests be against and which parameters to use? Isn't test_metamorphic already doing the job here?

@janmotl
Copy link
Collaborator

janmotl commented Jul 23, 2020

Metamorphic tests encoded features. But we also have to test non-encoded features. The test to add:

    def test_ignored_columns_are_untouched(self):
        # Make sure None values in ignored columns are preserved.
        # See: https://github.com/scikit-learn-contrib/category_encoders/pull/261
        X = pd.DataFrame({'col1': ['A', 'B', None], 'col2': ['C', 'D', None]})
        y = [1, 0, 1]

        for encoder_name in (set(encoders.__all__)):
            with self.subTest(encoder_name=encoder_name):
                enc = getattr(encoders, encoder_name)(cols=['col1'])
                out = enc.fit_transform(X, y)
                self.assertTrue(out.col2[2] is None)

@pyrrhull
Copy link
Contributor Author

Great, so I added your test to test_encoders.py.

@wdm0006
Copy link
Collaborator

wdm0006 commented Oct 6, 2021

Working through some old PRs, this looks good. If you'd be interested in working to maintain the project more actively please check out #248, I'm (as you probably noticed) not able to consistently.

@wdm0006 wdm0006 merged commit d8df67c into scikit-learn-contrib:master Oct 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants