
Start en-pl and en-fr training #62

Open · kpu opened this issue Mar 15, 2022 · 3 comments

@kpu (Member) commented Mar 15, 2022

These can get a leg up from the existing pl-en (#61) and fr-en (#60) efforts, which have already built their teacher systems; those teachers should then be used for backtranslation. Note that forward translation for making a pl-en student is the same thing as backtranslation for making an en-pl model.

See mozilla/translations#46 on how to short-circuit this and talk to @eu9ene as necessary.
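
To make the equivalence concrete: decoding Polish monolingual data once with the pl-en teacher yields synthetic (pl, en) pairs that serve both purposes, differing only in which side is treated as the source. A minimal Python sketch, where `fake_teacher` is a hypothetical stand-in for a real decoder run (e.g. marian-decoder with the pl-en teacher):

```python
# Sketch only: `fake_teacher` stands in for decoding with the pl-en
# teacher (in the real pipeline, a marian-decoder run).

def synthetic_pairs(pl_mono, translate_pl_to_en):
    """Translate Polish monolingual text once with the pl-en teacher."""
    return list(zip(pl_mono, translate_pl_to_en(pl_mono)))

def fake_teacher(sentences):
    # Placeholder for the pl-en teacher model.
    return [f"<en translation of {s!r}>" for s in sentences]

pairs = synthetic_pairs(["Ala ma kota.", "To jest test."], fake_teacher)

# Forward-translation data for the pl-en student: (pl, synthetic en).
pl_en_student = pairs

# Backtranslation data for the en-pl model: the same pairs with source and
# target swapped, so the authentic Polish lands on the target side.
en_pl_backtranslation = [(en, pl) for pl, en in pairs]
```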

@ugermann (Member) commented:

> Note that forward translation for making a pl-en student is the same thing as backtranslation for making an en-pl model.

Is it? Doesn't forward translation include the parallel data, while backtranslation is typically performed only on monolingual data in the target language? In practice it may not make much of a difference, but I'm not sure they're exactly the same.

@eu9ene commented Apr 1, 2022

Hi, what's the status of the en-fr model training? At what stage in the pipeline is it now?

@ugermann (Member) commented Apr 2, 2022

Currently training the first fr-en teacher; en-fr has not started yet (hoping to use the first teacher for backtranslation). There were numerous issues with the pipeline (partly due to things getting out of sync between different branches and repositories) which sent me to jail without passing Go or collecting $200. We have over 600M sentence pairs for teacher training, so I expect things to be slow (the first attempt went haywire, probably because the learning rate was too high). I've always been sceptical, and am now even more so, about the feasibility of a one-size-fits-all training procedure that works like a breadmaker: throw in the ingredients, press a button, and wake up the next morning to freshly baked bread / MT models. It's more like intensive care: it needs constant monitoring of the patient and adjustments in response to the signal readings.

```
[2022-04-01 12:36:42] [valid] Ep. 1 : Up. 3000 : bleu-detok : 0.726982 : new best
[2022-04-01 14:29:42] [valid] Ep. 1 : Up. 6000 : bleu-detok : 0 : stalled 1 times (last best: 0.726982)
[2022-04-01 16:22:25] [valid] Ep. 1 : Up. 9000 : bleu-detok : 0 : stalled 2 times (last best: 0.726982)
[2022-04-01 18:13:56] [valid] Ep. 1 : Up. 12000 : bleu-detok : 1.3157 : new best
[2022-04-01 20:06:32] [valid] Ep. 1 : Up. 15000 : bleu-detok : 1.00074 : stalled 1 times (last best: 1.3157)
[2022-04-01 21:58:15] [valid] Ep. 1 : Up. 18000 : bleu-detok : 2.54976 : new best
[2022-04-01 23:47:29] [valid] Ep. 1 : Up. 21000 : bleu-detok : 3.16497 : new best
```