- Create the conda environment: `conda create --name curriculum_nmt python=3.7` (then activate it with `conda activate curriculum_nmt`)
- Install the requirements listed in `requirements.txt`, e.g. `pip install -r requirements.txt`
- Run `bash run_iwslt.sh download` to download the IWSLT dataset
- Run `bash run_iwslt.sh vocab` to generate the vocab files. This produces `iwslt_vocab.json` and `iwslt_word_freq.json`
- Train the model locally on IWSLT with `bash run_iwslt.sh train_local` (uses the "none" ordering)
- Train the model locally on IWSLT with a chosen scoring and pacing function, e.g. `bash run_iwslt.sh train_local rarity linear` for "rarity" ordering with "linear" pacing (see `scoring.py` and `pacing.py` for more options, and the sketch after this list for the general idea)
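
The actual scoring and pacing implementations live in `scoring.py` and `pacing.py`; the sketch below is only an illustration of what a "rarity" scoring function and a "linear" pacing function typically look like in curriculum training. The function names, the assumed token-to-count format of `iwslt_word_freq.json`, and the exact formulas are assumptions for exposition, not the repository's API.

```python
# Hedged sketch: illustrative rarity scoring + linear pacing, not the repo's code.
import json
import math

def load_word_freq(path="iwslt_word_freq.json"):
    # Assumption: the file maps each token to its corpus count.
    with open(path) as f:
        return json.load(f)

def rarity_score(tokens, word_freq, total_count):
    # A common "rarity" heuristic: sum of negative log unigram probabilities,
    # so sentences built from rare words receive higher (harder) scores.
    return sum(-math.log(word_freq.get(tok, 1) / total_count) for tok in tokens)

def linear_pacing(step, total_steps, start_fraction=0.1):
    # "Linear" pacing: the fraction of the sorted training data exposed to the
    # model grows linearly from start_fraction to 1.0 over training.
    frac = start_fraction + (1.0 - start_fraction) * (step / total_steps)
    return min(1.0, frac)

def visible_examples(sorted_examples, step, total_steps):
    # sorted_examples is assumed to be ordered easiest-to-hardest by rarity_score.
    cutoff = int(linear_pacing(step, total_steps) * len(sorted_examples))
    return sorted_examples[:max(1, cutoff)]
```

At each training step the model would then sample batches only from `visible_examples(...)`, so easy (frequent-word) sentences dominate early training and harder ones are introduced gradually.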
- Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation (arXiv)
- On The Power of Curriculum Learning in Training Deep Networks (arXiv, code)
- Competence-based Curriculum Learning for Neural Machine Translation (arXiv)
- Improving Neural Machine Translation Models with Monolingual Data (arXiv)