Start en-pl and en-fr training #62
Comments
Is it? Doesn't forward translation include the parallel data, while backtranslation is typically only performed on monolingual data in the target language? In practice it may not make much of a difference, but I'm not sure they're exactly the same.
Hi, what's the status of the en-fr model training? At what stage in the pipeline is it now?
Currently training the first teacher, fr-en; en-fr has not started yet (hoping to use the first teacher for backtranslation). There were numerous issues with the pipeline (partly due to things getting out of sync between different branches and repositories), which sent me to jail without passing Go or collecting $200. We have over 600M sentence pairs for teacher training, so I expect things to be slow (the first attempt went haywire, probably because the learning rate was too high). I've always been sceptical, and am increasingly so, about the feasibility of a one-size-fits-all training procedure that works like a breadmaker (throw in the ingredients, press a button, and wake up the next morning to freshly baked bread / MT models). It's more like intensive care: it needs constant monitoring of the patient and adjustments in response to signal readings.
These can get a leg up by using the existing pl-en (#61) and fr-en (#60) models, whose teacher systems have already been built and should then be used for backtranslation. Note that forward translation for making a pl-en student is the same thing as backtranslation for making an en-pl model.
See mozilla/translations#46 on how to short-circuit this and talk to @eu9ene as necessary.
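The forward-translation versus backtranslation question raised in this thread can be made concrete with a toy sketch. Everything here is hypothetical: `toy_pl_en` stands in for a real pl-en teacher, the sentences are placeholders, and the assumption that forward translation also covers the source side of the parallel data comes from the comment above, not from the pipeline's code.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def forward_translate(teacher: Callable[[str], str], sources: List[str]) -> List[Pair]:
    """Student distillation data: real source, synthetic target."""
    return [(src, teacher(src)) for src in sources]


def back_translate(reverse_model: Callable[[str], str], target_mono: List[str]) -> List[Pair]:
    """Backtranslation data: synthetic source, real target."""
    return [(reverse_model(tgt), tgt) for tgt in target_mono]


def toy_pl_en(sentence: str) -> str:
    """Hypothetical stand-in for a pl-en teacher model."""
    return f"<en:{sentence}>"


# Placeholder corpora (assumptions for illustration only).
pl_side_of_parallel = ["zdanie a"]
pl_mono = ["zdanie b"]

# pl-en student: the teacher translates Polish from BOTH corpora (assumed).
student_pairs = forward_translate(toy_pl_en, pl_side_of_parallel + pl_mono)

# en-pl backtranslation: the same pl-en model translates Polish mono only.
bt_pairs = back_translate(toy_pl_en, pl_mono)

# Flip the student pairs to (en, pl) so the two sets are comparable.
flipped = {(en, pl) for pl, en in student_pairs}
extra = flipped - set(bt_pairs)  # pairs that exist only in the student data
```

Under these assumptions the backtranslated pairs are a subset of the flipped student pairs: the model calls are identical, and the only difference is which Polish corpus is fed in.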