monitoring training progress #6

Open
jelmervdl opened this issue Mar 2, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@jelmervdl
Contributor

jelmervdl commented Mar 2, 2023

There's already a marian-tensorboard connector. We can either plug into that or write our own version of it. Because we launch marian ourselves, we have direct access to its stdout and stderr and can read the training log straight from there.

The regular expressions marian-tensorboard uses to parse the log: https://github.com/marian-nmt/marian-tensorboard/blob/b9867c43472a27783611accba93adebda60ba462/src/marian_tensorboard/marian_tensorboard.py#L107-L125

Added benefit of doing the integration ourselves: we can also push dataset events to tensorboard, like epoch events and training stages.
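A minimal sketch of the "read marian's output ourselves" option, assuming training lines look roughly like `Ep. 1 : Up. 1000 : ... : Cost 3.45 ...`. The regular expression here is my own guess at that shape, not a copy of the ones in marian-tensorboard, and `forward_training_metrics` and the `report` callback are hypothetical names. The callback takes `(tag, value, step)` so it could be backed by something like `SummaryWriter.add_scalar` from `torch.utils.tensorboard`.

```python
import re
from typing import Callable, Iterable

# Assumed shape of a marian training log line (not taken from marian-tensorboard):
#   [date time] Ep. 1 : Up. 1000 : Sen. 123,456 : Cost 3.45678 * 1,234 ...
TRAIN_RE = re.compile(
    r"Ep\.\s*(?P<epoch>\d+)\s*:\s*Up\.\s*(?P<update>\d+)\s*:"
    r".*?Cost\s+(?P<cost>-?\d+(?:\.\d+)?)"
)


def forward_training_metrics(
    lines: Iterable[str],
    report: Callable[[str, float, int], None],
) -> None:
    """Parse training lines and hand (tag, value, step) triples to `report`."""
    for line in lines:
        if "[valid]" in line:
            # Validation lines are a separate concern; skip them here.
            continue
        match = TRAIN_RE.search(line)
        if match is None:
            continue  # Not a training progress line.
        step = int(match["update"])
        report("train/cost", float(match["cost"]), step)
        report("train/epoch", float(match["epoch"]), step)
```

In the trainer, `lines` could simply be the pipe we already hold for marian's stderr. Dataset events such as a stage change could go through the same `report` callback under their own tag.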

Slightly related to #3.

jelmervdl added the enhancement label Mar 2, 2023
@XapaJIaMnu
Contributor

Add to this issue:

Advance to a new stage when marian reports a stall on a validation set.

This can be used to automatically find the optimal point to transition between stages. It should be combined with resetting the optimizer inside marian, so that our new dataset mixture doesn't get its gradients penalised too hard by the change of data.
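A rough sketch of the stall check, assuming marian's validation lines look something like `[valid] Ep. 2 : Up. 15000 : chrf : 45.1 : stalled 3 times (last best: 45.5)`. That format, the regular expression, and the names `first_stall_update`, `metric` and `patience` are all assumptions for illustration. Resetting the optimizer is not covered here.

```python
import re
from typing import Iterable, Optional

# Assumed shape of a marian validation line that reports a stall:
#   [valid] Ep. 2 : Up. 15000 : chrf : 45.1 : stalled 3 times (last best: 45.5)
STALL_RE = re.compile(
    r"\[valid\].*?Up\.\s*(?P<update>\d+)\s*:\s*(?P<metric>[\w-]+)\s*:"
    r".*?stalled\s+(?P<count>\d+)\s+times"
)


def first_stall_update(
    lines: Iterable[str],
    metric: str = "chrf",
    patience: int = 3,
) -> Optional[int]:
    """Return the update at which `metric` first stalled `patience` times.

    Returns None if the log ends before that happens.
    """
    for line in lines:
        match = STALL_RE.search(line)
        if match is None or match["metric"] != metric:
            continue
        if int(match["count"]) >= patience:
            return int(match["update"])
    return None
```

The trainer could call this on marian's output as it streams in and move to the next stage as soon as it returns an update number.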
