Splittable random numbers for reproducible training #259

Open
bfolie opened this issue Dec 17, 2021 · 3 comments

bfolie commented Dec 17, 2021

Bagger and MultiTaskBagger both train their individual models in parallel. Because the order of training is uncontrolled, Lolo random forests are inherently non-reproducible, even if the bagging and the rngs for the base learners are identical.

There are ways of guaranteeing reproducibility across multiple threads, and we should make use of them.
- SplittableRandom in Java
- A discussion in the context of numpy
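To make the idea concrete, here is a minimal sketch of the pattern that `java.util.SplittableRandom` enables. The object and method names below are hypothetical illustrations, not Lolo's actual `Bagger` API: split one child generator per bag from a single seeded master before any parallel work begins, then hand each bag its own generator, so the result no longer depends on thread scheduling.

```scala
import java.util.SplittableRandom

// Hypothetical sketch, not Lolo's actual API: give each bag its own
// split-off generator so the parallel schedule cannot change the result.
object SplittableRngSketch {

  /** Draw bootstrap row indices for each bag from its own generator. */
  def bootstrapIndices(numBags: Int, numRows: Int, seed: Long): Vector[Vector[Int]] = {
    val master = new SplittableRandom(seed)
    // Splitting happens sequentially, in a fixed order, on a single thread.
    val perBagRngs = Vector.fill(numBags)(master.split())
    // The per-bag sampling below could run in parallel and still produce
    // identical output, because each bag only touches its own generator.
    perBagRngs.map { rng => Vector.fill(numRows)(rng.nextInt(numRows)) }
  }

  def main(args: Array[String]): Unit = {
    // Same seed => same bootstrap samples, independent of thread scheduling.
    println(bootstrapIndices(numBags = 3, numRows = 5, seed = 42L))
  }
}
```

The same pattern should carry over to training the base learners themselves: as long as each learner receives a pre-split generator, the order in which threads pick up the work cannot change the resulting ensemble.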

@iterateccvoelker

Hi, how is it going? Is there any update on this issue? Thanks in advance for a brief message! Best, Christoph

bfolie commented Sep 16, 2022

Thanks for asking @BAMcvoelker. To be honest, we hadn't thought about it in a while, but after seeing your comment we realized we have all of the tools and just need to thread them through.

We open-sourced our splittable random number library, so it's now available to pull into Lolo. I will pull it in soon and use it to make bagged training reproducible.

@iterateccvoelker

Thank you so much @bfolie for the update and for picking up the topic again. I look forward to it!
