How to add unsupported language? (nob) #18
Comments
Hi, unfortunately I don't think the authors are planning to release other MMS models.
Hi! I have tried running your scripts and converting a checkpoint from https://huggingface.co/facebook/mms-tts-swe, on the premise that Swedish and Norwegian are quite similar. I have played around with different learning rates and other parameters, but I consistently get an infinite KL loss, and a NaN loss after 100 steps or so. If you could give pointers for training a model from scratch, I could give it a shot. 😊
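To pin down which loss term diverges first, and at which step, a small guard along these lines can help. This is only a minimal sketch: the loss names (`loss_mel`, `loss_kl`, `loss_duration`) and the values are hypothetical placeholders, not taken from the training scripts, and it assumes the per-step losses are available as plain floats.

```python
import math


def first_non_finite(losses):
    """Return the name of the first loss that is NaN or +/-inf, or None if all are finite."""
    for name, value in losses.items():
        if not math.isfinite(value):
            return name
    return None


# Hypothetical per-step loss values (assumption: already converted to plain floats).
step_losses = {"loss_mel": 23.1, "loss_kl": float("inf"), "loss_duration": 1.7}

bad = first_non_finite(step_losses)
if bad is not None:
    # Stopping here keeps the last healthy checkpoint from being overwritten.
    print(f"non-finite loss detected: {bad}")
```

Calling the guard once per step and stopping on the first hit makes it easier to compare runs with different learning rates, since the step at which the KL term blows up is recorded rather than inferred from the later NaN.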
How did you initialize the model? This might play an important role. Also, which hyperparameters did you use? I'd recommend using the default ones from the original VITS training.
No, I generated that one from the Swedish model. I used the hyperparameters you provided in https://github.com/ylacombe/finetune-hf-vits/tree/main/training_config_examples as a basis, but did a "random manual search" from there. Where can I find the default ones from the original training?
In that case, here is a snippet that you can modify to initialize from scratch:
from utils.configuration_vits import VitsConfig
from utils.modeling_vits_training import VitsModelForPreTraining
from utils.feature_extraction_vits import VitsFeatureExtractor
from transformers import AutoTokenizer
# Set this to the Hub repository ID that should receive the new checkpoint.
NEW_REPO_ID = ...
# Load only the configuration, so the model below starts from freshly initialized weights.
config = VitsConfig.from_pretrained("thomasht86/mms-tts-nob")
VitsModelForPreTraining(config).push_to_hub(NEW_REPO_ID)
# Copy the existing feature extractor and tokenizer to the new repository unchanged.
VitsFeatureExtractor.from_pretrained("thomasht86/mms-tts-nob").push_to_hub(NEW_REPO_ID)
AutoTokenizer.from_pretrained("thomasht86/mms-tts-nob").push_to_hub(NEW_REPO_ID)
In terms of training, I'd advise:
I am in sort of the same situation, but I am looking to fine-tune MMS for Danish (which is very similar to Norwegian). I am having trouble understanding where the above code snippet fits into the training pipeline. Should it be executed after converting a checkpoint using the
Strangely enough, I can see from MMS coverage that nob (Norwegian) is not supported for TTS. What must be done in order to support it?