Adding Language specific validation sets to deepspeed #1
Comments
I can review/implement this part.
My current understanding is that in Evaluation is then done here, both for We could set and call
Some modification to |
Currently the code base yields a single validation set and a single test set. There is no support for arguments that specify multiple validation datasets. My ad hoc solution is to add an extra argument:
The idea here is to allow mixing different validation sets on the fly.
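To make the proposal concrete, here is a minimal sketch of what such an argument could look like. It is not the code from the repository or the PR: the flag name `--extra-valid-group` and the helpers `parse_valid_group` and `parse_valid_groups` are hypothetical, and the format (a group name followed by WEIGHT PATH pairs) is an assumption chosen only to illustrate mixing several datasets into one named validation set.

```python
import argparse


def parse_valid_group(tokens):
    """Turn ['fr', '1', 'data/fr_a', '3', 'data/fr_b'] into
    ('fr', [(0.25, 'data/fr_a'), (0.75, 'data/fr_b')])."""
    name, rest = tokens[0], tokens[1:]
    if not rest or len(rest) % 2 != 0:
        raise ValueError(f"group {name!r} needs one or more WEIGHT PATH pairs")
    pairs = [(float(w), p) for w, p in zip(rest[0::2], rest[1::2])]
    total = sum(w for w, _ in pairs)
    if total <= 0:
        raise ValueError(f"group {name!r} needs a positive total weight")
    # Normalise so the weights of each group sum to 1.
    return name, [(w / total, p) for w, p in pairs]


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag: repeat it once per named validation set.
    parser.add_argument(
        "--extra-valid-group",
        nargs="+",
        action="append",
        default=[],
        metavar="TOKEN",
        help="NAME followed by WEIGHT PATH pairs; may be given several times",
    )
    return parser


def parse_valid_groups(argv):
    """Return {group_name: [(normalised_weight, path), ...]}."""
    args = build_parser().parse_args(argv)
    return dict(parse_valid_group(tokens) for tokens in args.extra_valid_group)
```

With this shape, `--extra-valid-group en 1 data/en --extra-valid-group fr 1 data/fr_a 3 data/fr_b` would describe two validation sets, the second being a 25/75 mix of two datasets.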
Any thoughts about a better design?
Work-in-progress PR sent here: bigscience-workshop/Megatron-DeepSpeed#97
The idea of this issue is to modify the Megatron-DeepSpeed repository code that we use for training all models, so that validation loss is tracked on several validation sets separately. This would allow us to follow training progress independently for each language.
Currently, the validation loss is calculated on a single validation set that includes the same language combination as the training data (see here: 13B param model training on TensorBoard).
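As a rough illustration of the goal, the sketch below computes one mean loss per named validation set instead of a single aggregate value. It is framework-independent and not the repository's evaluation code: `mean_loss`, `evaluate_per_set`, the `loss_fn` callable, and the `valid-loss/<name>` tag format are all hypothetical names chosen for this example.

```python
def mean_loss(loss_fn, batches):
    """Average loss_fn over an iterable of batches."""
    total, count = 0.0, 0
    for batch in batches:
        total += float(loss_fn(batch))
        count += 1
    if count == 0:
        raise ValueError("validation set yielded no batches")
    return total / count


def evaluate_per_set(loss_fn, valid_sets):
    """Return one scalar per validation set, keyed by a logging tag.

    valid_sets maps a set name (for example a language code) to an
    iterable of batches; loss_fn maps a batch to a scalar loss.
    """
    return {
        f"valid-loss/{name}": mean_loss(loss_fn, batches)
        for name, batches in valid_sets.items()
    }
```

Each resulting tag and value could then be logged as its own scalar, for example with `SummaryWriter.add_scalar(tag, value, step)` from `torch.utils.tensorboard`, so that every language gets a separate curve.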
Useful pointers
Progress