Early stopping / validation set #57
Currently …

Note also that the parameter …

Two points: …

I guess using the … The mlr3pipelines issue is now mlr-org/mlr3pipelines#698
We have already discussed this multiple times with Michel and Marc.
The question was which data to use for early stopping / validation when conducting a resampling. There were two options:

1. Split the validation data away from the training set.
2. Use the test set of the current resampling iteration for validation.

About 1. …

About 2. This was made possible in mlr-org/mlr3@435c9d1, i.e. that PR ensures that the learners have access to the test set.
An additional complication is what happens in an `AutoTuner`.

In case 1 we now have to decide whether we still want to do early stopping when fitting the final model (probably not), or estimate the number of rounds from the training iterations obtained across all folds (e.g. take the maximum).

In case 2 we are forced to estimate the number of rounds, because no test set is available for the final model.
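The "estimate the rounds from the folds" idea can be sketched in language-agnostic terms. The following Python is purely illustrative — `early_stop`, `estimate_final_rounds`, and the toy loss curves are made-up names for this sketch, not mlr3 or xgboost API:

```python
def early_stop(val_losses, patience=5):
    """Return the 1-based index of the best iteration, stopping once the
    validation loss has not improved for `patience` consecutive rounds."""
    best, best_iter = float("inf"), 0
    for i, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_iter = loss, i
        elif i - best_iter >= patience:
            break
    return best_iter

def estimate_final_rounds(best_iters):
    """Aggregate per-fold early-stopped iteration counts into a round
    count for the final model; taking the maximum (as suggested above)
    errs on the side of training slightly longer."""
    return max(best_iters)

# toy validation-loss curves, one per resampling fold
folds = [
    [5.0, 4.0, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6],
    [6.0, 4.5, 3.5, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5],
]
best = [early_stop(curve) for curve in folds]
print(best)                         # [3, 4]
print(estimate_final_rounds(best))  # 4
```

The final model would then be refit on the full training data for the aggregated number of rounds, with early stopping disabled.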
In principle we could let the user select between 1 and 2 by providing this as a parameter, i.e. when `use_test_set = TRUE` we use the test set for validation, and when `use_test_set = FALSE` we split the validation data away from the training set.
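The proposed switch could look roughly like the following sketch. This is plain Python for illustration; `validation_split` and its arguments are hypothetical names mirroring the suggested `use_test_set` parameter, not an existing mlr3 interface:

```python
import random

def validation_split(train_idx, test_idx, use_test_set, val_fraction=0.2, seed=0):
    """Pick the rows used for early-stopping validation.

    use_test_set=True  -> option 2: validate on the resampling test set
                          (no such set exists for the final model).
    use_test_set=False -> option 1: split val_fraction of the training
                          rows away; the test set stays untouched.
    Returns (fit_idx, val_idx).
    """
    if use_test_set:
        return list(train_idx), list(test_idx)
    rng = random.Random(seed)
    idx = list(train_idx)
    rng.shuffle(idx)
    n_val = max(1, int(len(idx) * val_fraction))
    return idx[n_val:], idx[:n_val]

fit, val = validation_split(range(10), range(10, 12), use_test_set=True)
print(val)  # [10, 11]
```

Either way, the final-model question from above remains: with `use_test_set = TRUE` the rounds must be estimated from the folds, while with `use_test_set = FALSE` the same split could in principle still be performed when fitting the final model.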