Early stopping / validation set #57

Closed
sebffischer opened this issue Sep 20, 2022 · 4 comments
@sebffischer
Member

We have already discussed this multiple times with Michel and Marc.
The question was which data to use for early stopping / validation when conducting a resampling.

There were two options:

  1. The learner further splits the training data it receives into "actual training" data and validation data.
  2. The learner uses the test set of the task for early stopping.

About 1.

  • (-) This can significantly reduce the size of the training data just for early stopping.
  • (+) We still get an unbiased performance estimate.

About 2.
This was made possible in mlr-org/mlr3@435c9d1, i.e. this change ensures that the learners have access to the test set.

  • (+) More data is used for actually training the model.
  • (-) The performance estimate might be slightly biased.
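
A minimal sketch of the row bookkeeping behind option 1, using only plain mlr3; the actual split would live inside the learner's train step, and the task, the 3-fold CV, and the 80/20 ratio are just placeholders for illustration:

```r
library(mlr3)

task = tsk("spam")
resampling = rsmp("cv", folds = 3)
resampling$instantiate(task)

# rows the learner receives for training in the first fold
train_ids = resampling$train_set(1)

# option 1: split these rows again into "actual training" rows and
# validation rows that are only used for early stopping
set.seed(1)
valid_ids = sample(train_ids, size = round(0.2 * length(train_ids)))
actual_train_ids = setdiff(train_ids, valid_ids)

# option 2 would instead validate on resampling$test_set(1)
```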

An additional complication is what happens in an AutoTuner.

In case 1 we have to decide whether we still want to do early stopping when fitting the final model (probably not) or whether we estimate the number of rounds from the training iterations obtained over all folds (e.g. taking the maximum).

In case 2 we are forced to estimate the number of rounds, because no test set is available when fitting the final model.
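
For illustration only (the numbers are invented), aggregating the per-fold stopping iterations could look like this:

```r
# early-stopped iteration counts observed in the individual folds (invented)
best_iters = c(87, 101, 95)

# possible aggregations for refitting the final model
max(best_iters)           # conservative: train at least as long as any fold
round(mean(best_iters))   # alternative: average number of rounds
```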

In principle we could let the user choose between 1 and 2 via a parameter, i.e. with use_test_set = TRUE we use the test set for validation, and with use_test_set = FALSE we split data away from the training set.
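
A hedged sketch of how the proposed switch could look; use_test_set, the function name, and the 20% validation fraction are all hypothetical:

```r
# Hypothetical sketch of the proposed parameter, not an existing API.
# train_ids are the resampling train rows, test_ids the resampling test rows
# (assuming the learner has access to them, cf. mlr-org/mlr3@435c9d1).
split_for_early_stopping = function(train_ids, test_ids, use_test_set = FALSE,
                                    valid_ratio = 0.2) {
  if (use_test_set) {
    # option 2: validate on the test set (more training data, slight bias)
    list(train = train_ids, valid = test_ids)
  } else {
    # option 1: carve a validation set out of the training rows
    valid = sample(train_ids, size = round(valid_ratio * length(train_ids)))
    list(train = setdiff(train_ids, valid), valid = valid)
  }
}
```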

@sebffischer sebffischer added this to the 0.1 milestone Sep 20, 2022
@sebffischer
Member Author

Currently learner_torch_train still assumes that the row_role early_stopping exists.

@sebffischer
Member Author

Note also that the parameter keep_last_prediction still has to be implemented, i.e. we (optionally) store the predictions from the last evaluation round (if done on the test set) so that we don't have to recompute them.
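
A rough illustration of the caching idea; keep_last_prediction and all names below are hypothetical:

```r
# Hypothetical sketch: optionally keep the predictions of the last evaluation
# round in the learner state so a later predict step can reuse them instead
# of recomputing them.
store_last_prediction = function(state, last_round_prediction,
                                 keep_last_prediction = TRUE) {
  if (keep_last_prediction) {
    state$last_prediction = last_round_prediction
  }
  state
}
```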

@mb706
Contributor

mb706 commented Sep 26, 2022

Two points:

  • Using the test set for early stopping leaks information from the test set into the training process and gives a biased resampling result. It would probably be good to make it possible somehow (e.g. by allowing rows to have the "testset" and "early stopping" roles simultaneously), but it should probably not be the common case? An idea would be a pipeop that does the splitting of train and test data (so that early stopping also works inside resampling folds, for example): convert the task into X% train and (1-X)% early-stopping data, add the early stopping role to all test set rows, maybe other things... (a rough sketch of the splitting idea follows below)
  • It would be good to consider the interaction of mlr3pipelines with the concept of early-stopping rows. How should e.g. imputation be handled? Should PipeOpImpute treat early-stopping rows as training data or as prediction data? What should class-balancing oversampling or SMOTE do?
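
A runnable approximation of the splitting idea with stock mlr3; the "early_stopping" row role does not exist yet, so the held-out ids are only kept in a plain variable here, and the task and ratio are placeholders:

```r
library(mlr3)

task = tsk("spam")

# convert the task into 80% actual-training rows and 20% early-stopping rows
split = partition(task, ratio = 0.8)
early_stopping_ids = split$test

# restrict the task to the actual-training rows; a real pipeop would instead
# attach early_stopping_ids to the task (e.g. via a dedicated row role) so the
# downstream learner can use them, and preprocessing steps such as
# PipeOpImpute would have to decide whether to treat these rows as training
# or as prediction data
task$filter(split$train)
```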

@mb706
Contributor

mb706 commented Sep 26, 2022

I guess using the "test" split works fine with the proposed hyperparameters.

The mlr3pipelines issue is now mlr-org/mlr3pipelines#698.
