
Start en-pl and en-fr training #62

Open · kpu opened this issue Mar 15, 2022 · 3 comments

@kpu (Member) commented Mar 15, 2022

These can get a leg up from the existing pl-en (#61) and fr-en (#60) efforts, which have already built their teacher systems; those teachers should then be used for backtranslation. Note that forward translation for making a pl-en student is the same thing as backtranslation for making an en-pl model.

See mozilla/translations#46 on how to short-circuit this and talk to @eu9ene as necessary.
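
To make the equivalence concrete: decoding Polish monolingual data once with the pl-en teacher yields synthetic (pl, en) pairs that serve both purposes, differing only in which side is treated as the source. A minimal Python sketch, where `fake_teacher` is a hypothetical stand-in for a real decoder run (e.g. marian-decoder with the pl-en teacher):

```python
# Sketch only: `fake_teacher` stands in for decoding with the pl-en
# teacher (in the real pipeline, a marian-decoder run).

def synthetic_pairs(pl_mono, translate_pl_to_en):
    """Translate Polish monolingual text once with the pl-en teacher."""
    return list(zip(pl_mono, translate_pl_to_en(pl_mono)))

def fake_teacher(sentences):
    # Placeholder for the pl-en teacher model.
    return [f"<en translation of {s!r}>" for s in sentences]

pairs = synthetic_pairs(["Ala ma kota.", "To jest test."], fake_teacher)

# Forward-translation data for the pl-en student: (pl, synthetic en).
pl_en_student = pairs

# Backtranslation data for the en-pl model: the same pairs with source and
# target swapped, so the authentic Polish lands on the target side.
en_pl_backtranslation = [(en, pl) for pl, en in pairs]
```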

@ugermann (Member) commented:

> Note that forward translation for making a pl-en student is the same thing as backtranslation for making an en-pl model.

Is it? Doesn't forward translation include the parallel data, while backtranslation is typically performed only on monolingual data in the target language? In practice it may not make much of a difference, but I'm not sure they're exactly the same.

@eu9ene commented Apr 1, 2022

Hi, what's the status of the en-fr model training? At what stage in the pipeline is it now?

@ugermann (Member) commented Apr 2, 2022

Currently training the first fr-en teacher; en-fr has not started yet (hoping to use the first teacher for backtranslation). There were numerous issues with the pipeline (partly due to things getting out of sync between different branches and repositories) which sent me to jail without passing Go or collecting $200. We have over 600M sentence pairs for teacher training, so I expect things to be slow (the first attempt went haywire, probably because the learning rate was too high). I've always been sceptical, and am now even more so, about the feasibility of a one-size-fits-all training procedure that works like a breadmaker: throw in the ingredients, press a button, and wake up the next morning to freshly baked bread / MT models. It's more like intensive care: it needs constant monitoring of the patient and adjustments in response to the signal readings.

```
[2022-04-01 12:36:42] [valid] Ep. 1 : Up. 3000 : bleu-detok : 0.726982 : new best
[2022-04-01 14:29:42] [valid] Ep. 1 : Up. 6000 : bleu-detok : 0 : stalled 1 times (last best: 0.726982)
[2022-04-01 16:22:25] [valid] Ep. 1 : Up. 9000 : bleu-detok : 0 : stalled 2 times (last best: 0.726982)
[2022-04-01 18:13:56] [valid] Ep. 1 : Up. 12000 : bleu-detok : 1.3157 : new best
[2022-04-01 20:06:32] [valid] Ep. 1 : Up. 15000 : bleu-detok : 1.00074 : stalled 1 times (last best: 1.3157)
[2022-04-01 21:58:15] [valid] Ep. 1 : Up. 18000 : bleu-detok : 2.54976 : new best
[2022-04-01 23:47:29] [valid] Ep. 1 : Up. 21000 : bleu-detok : 3.16497 : new best
```