Reproducibility of old runs (Legacy Server?) #12
Comments
At the moment, I think we can only really promise option 3 (which seems fair). Option 1, running many legacy servers, won't be maintainable. For option 2, simple Dockerfiles won't work: we'd need to containerize the entire environment (using Docker or Singularity), including compiled binaries. That would make sense if we had a regular release cycle for the entire server, and it would probably make more sense once we've moved everything to Python. Then we could auto-generate these containers with every major release and link runs to an OpenML release, but this is not low-hanging fruit. Thoughts?
As I understand it, aren't we very close to option 2? @josvandervelde dockerized all server services, so all we would need to do is tag the current versions. Combined with older openml-python Docker images, we would mostly make good on the promise, at least as far as scikit-learn based runs go.
Maybe. We would also need to store compiled containers, right? With wheels breaking and all that.
The images (should) come with fully pre-installed environments. I am not entirely sure what you are referring to.
Then that should be fine. If there are multiple versions of dependent libraries (e.g. scikit-learn, torch, ...) between OpenML releases, would that still work?
Gotcha. That's indeed an issue. The packages will likely stay available on PyPI, but we can build images for (sklearn x openml) tuples (or other dependencies) to cover it a bit better. I am not sure where we should draw the line in that case.
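To make the (sklearn x openml) tuple idea concrete, here is a rough sketch of how such a build matrix could be enumerated. Everything specific in it is an assumption for illustration: the repository name `example/openml-sklearn`, the build-arg names `OPENML_VERSION` and `SKLEARN_VERSION` (which a Dockerfile would have to declare via `ARG`), and the version lists, which are placeholders and not a tested compatibility list.

```python
from itertools import product

# Placeholder version lists (assumption): not a tested compatibility matrix.
OPENML_VERSIONS = ["0.14.2", "0.15.0"]
SKLEARN_VERSIONS = ["1.3.2", "1.4.2"]


def build_matrix(openml_versions, sklearn_versions, repo="example/openml-sklearn"):
    """Return one entry per (openml, sklearn) pair with an image tag and a build command.

    The repo name and the build-arg names are hypothetical; the Dockerfile
    is assumed to declare matching ARG lines.
    """
    entries = []
    for openml_v, sklearn_v in product(openml_versions, sklearn_versions):
        tag = f"{repo}:openml{openml_v}-sklearn{sklearn_v}"
        command = [
            "docker", "build",
            "--build-arg", f"OPENML_VERSION={openml_v}",
            "--build-arg", f"SKLEARN_VERSION={sklearn_v}",
            "-t", tag,
            ".",
        ]
        entries.append({"tag": tag, "command": command})
    return entries


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for entry in build_matrix(OPENML_VERSIONS, SKLEARN_VERSIONS):
        print(" ".join(entry["command"]))
```

The matrix grows multiplicatively with every extra dependency axis (torch, Python version, ...), which is exactly the "where do we draw the line" question; restricting each list to the versions that actually appear in stored runs would be one way to bound it.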
Goal:
For reproducibility, it would be great if we could recompute any run. For instance, a run with an old scikit-learn version.
Challenges
This is a complex problem. It might, for instance, be the case that the scikit-learn wheel is not available anymore, or not for more modern Python versions. Old versions might also have serious security vulnerabilities.
Options