What is the meaning of respect.unordered.factors = "order"? #54
Comments
Although we would never want to treat unordered categorical variables as ordered for linear models, tree-based models are typically undeterred by this.
Hi @A-Pai. Finding the optimal split in a decision tree for a categorical variable with J categories would require searching through 2^(J−1) − 1 potential splits. Fortunately, for binary classification and regression (at least when using the standard split rules, like Gini, entropy, or SSE), a shortcut exists that reduces the search to J − 1 possibilities (a massive reduction for large J). The shortcut essentially requires mean/target encoding the categorical variable in question prior to each split in each tree, which is what's described here for
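To make the counting argument concrete, here is a minimal Python sketch of a single regression node using SSE as the split rule. It is only an illustration of the shortcut, not how ranger implements it: the function names and the toy data are made up for this example. It compares a brute-force search over every two-way partition of the levels with a search over the J − 1 cut points obtained after sorting the levels by their mean response.

```python
from itertools import combinations


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_cost(data, left_levels):
    """Total SSE after sending `left_levels` left and all other levels right."""
    left = [y for x, y in data if x in left_levels]
    right = [y for x, y in data if x not in left_levels]
    return sse(left) + sse(right)


def exhaustive_best_split(data):
    """Try every two-way partition of the levels: 2^(J-1) - 1 candidates."""
    levels = sorted({x for x, _ in data})
    first, rest = levels[0], levels[1:]
    candidates = []
    # Pinning the first level to the left side avoids counting each
    # partition and its mirror image separately.
    for k in range(len(rest)):  # k < len(rest) keeps the right side non-empty
        for extra in combinations(rest, k):
            candidates.append({first, *extra})
    best_cost = min(split_cost(data, left) for left in candidates)
    return best_cost, len(candidates)


def ordered_best_split(data):
    """Sort levels by mean response, then try only the J - 1 cut points."""
    levels = sorted({x for x, _ in data})
    means = {}
    for lvl in levels:
        ys = [y for x, y in data if x == lvl]
        means[lvl] = sum(ys) / len(ys)
    ordered = sorted(levels, key=means.get)
    candidates = [set(ordered[:i]) for i in range(1, len(ordered))]
    best_cost = min(split_cost(data, left) for left in candidates)
    return best_cost, len(candidates)


# Hypothetical toy data: (category, response) pairs with J = 5 levels.
toy_data = [
    ("a", 1.0), ("a", 2.0),
    ("b", 9.0), ("b", 10.0),
    ("c", 4.0), ("c", 5.0),
    ("d", 9.5), ("d", 8.5),
    ("e", 1.5), ("e", 2.5),
]

if __name__ == "__main__":
    cost_all, n_all = exhaustive_best_split(toy_data)
    cost_ord, n_ord = ordered_best_split(toy_data)
    print("exhaustive:", n_all, "candidates, best SSE", cost_all)
    print("ordered:   ", n_ord, "candidates, best SSE", cost_ord)
```

With five levels, the brute-force search evaluates 15 partitions and the ordered search evaluates 4. Both arrive at the same minimum SSE on this toy data, which is what the shortcut promises for regression with SSE.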
@A-Pai I'll add that for binary classification and regression, |
https://bradleyboehmke.github.io/HOML/random-forest.html