The following instructions show how to train a Transformer model on the IWSLT'14 German-to-English dataset.
Step 1: Prepare the training data:
We provide the BPE codes for better reproducibility. The source and target vocabularies are shared and built with 10,000 BPE merge operations.
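If you want to re-apply the provided codes to new tokenized text, the `subword-nmt` package is one way to do it. Below is a minimal sketch, assuming the codes file is in `subword-nmt` format; the file names `bpe.codes`, `raw.de`, and `raw.bpe.de` are placeholders:

```python
# Sketch: applying existing BPE codes to tokenized text (pip install subword-nmt).
# Assumption: the provided codes are compatible with subword-nmt; adjust paths.
import codecs
from subword_nmt.apply_bpe import BPE

with codecs.open("bpe.codes", encoding="utf-8") as codes:
    bpe = BPE(codes)

with codecs.open("raw.de", encoding="utf-8") as fin, \
     codecs.open("raw.bpe.de", "w", encoding="utf-8") as fout:
    for line in fin:
        # process_line applies the learned merges to one tokenized sentence
        fout.write(bpe.process_line(line))
```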
```bash
# Extract the data
cd sample/train/
IWSLT_PATH=iwslt14.tokenized.de-en
tar -zxvf $IWSLT_PATH.tar.gz
IWSLT_PATH=sample/train/$IWSLT_PATH

# Binarize the data
cd ../..
python3 tools/GetVocab.py \
  -raw $IWSLT_PATH/bpevocab \
  -new $IWSLT_PATH/vocab.de
python3 tools/GetVocab.py \
  -raw $IWSLT_PATH/bpevocab \
  -new $IWSLT_PATH/vocab.en
python3 tools/PrepareParallelData.py \
  -src $IWSLT_PATH/train.de -tgt $IWSLT_PATH/train.en \
  -src_vocab $IWSLT_PATH/vocab.de -tgt_vocab $IWSLT_PATH/vocab.en \
  -output $IWSLT_PATH/train.data
python3 tools/PrepareParallelData.py \
  -src $IWSLT_PATH/valid.de -tgt $IWSLT_PATH/valid.en \
  -src_vocab $IWSLT_PATH/vocab.de -tgt_vocab $IWSLT_PATH/vocab.en \
  -output $IWSLT_PATH/valid.data
```
On Windows, you can extract the archive manually (e.g., with an archive tool) instead of using `tar`.
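Since the source and target vocabularies are shared and both files are generated from the same `bpevocab`, the two outputs should be identical. An optional sanity check (a sketch; the path assumes the layout above):

```python
# Optional sanity check: a shared vocabulary should yield identical files.
path = "sample/train/iwslt14.tokenized.de-en"
with open(f"{path}/vocab.de", encoding="utf-8") as f_de, \
     open(f"{path}/vocab.en", encoding="utf-8") as f_en:
    assert f_de.read() == f_en.read(), "vocab.de and vocab.en differ"
print("OK: source and target vocabularies match")
```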
Step 2: Train the model with the default configuration (6 encoder/decoder layers, model size 512, 50 epochs):
```bash
bin/NiuTrans.NMT \
  -dev 0 \
  -nepoch 50 \
  -model model.bin \
  -maxcheckpoint 10 \
  -train $IWSLT_PATH/train.data \
  -valid $IWSLT_PATH/valid.data
```
Step 3: Average the last ten checkpoints:
```bash
python tools/Ensemble.py -input 'model.bin.*' -output model.ensemble
```
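Checkpoint averaging takes the element-wise mean of the parameters saved at the last few checkpoints, which usually gives a small but consistent BLEU gain over any single checkpoint. A minimal sketch of the idea for PyTorch-style state dicts (NiuTrans's `Ensemble.py` operates on its own `model.bin` format; the file names below are hypothetical):

```python
# Sketch of checkpoint averaging: element-wise mean of saved parameters.
# Assumption: plain PyTorch state dicts; NiuTrans's model.bin is a different format.
import glob
import torch

paths = sorted(glob.glob("checkpoint.*.pt"))  # hypothetical checkpoint names
avg = None
for p in paths:
    state = torch.load(p, map_location="cpu")
    if avg is None:
        avg = {k: v.float().clone() for k, v in state.items()}
    else:
        for k, v in state.items():
            avg[k] += v.float()
for k in avg:
    avg[k] /= len(paths)  # mean over all checkpoints
torch.save(avg, "model.averaged.pt")
```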
Training takes about 310 seconds per epoch on a GTX 1080 Ti.
Expected BLEU scores (lenalpha=0.6, maxlenalpha=1.2):
| Model type     | Beam Search    | Greedy Search |
|----------------|----------------|---------------|
| Single model   | 34.05 (beam=4) | 33.35         |
| Ensemble model | 34.48 (beam=4) | 34.01         |
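To score your own outputs, one common choice is the `sacrebleu` package. This is a sketch only: the table above does not state which scorer produced those numbers, and tokenized multi-bleu scores are not directly comparable to detokenized sacrebleu scores; `hyp.en` and `ref.en` are hypothetical file names:

```python
# Sketch: scoring detokenized output with sacrebleu (pip install sacrebleu).
# Assumption: hyp.en / ref.en hold one detokenized sentence per line.
import sacrebleu

with open("hyp.en", encoding="utf-8") as f:
    hyps = [line.strip() for line in f]
with open("ref.en", encoding="utf-8") as f:
    refs = [line.strip() for line in f]

bleu = sacrebleu.corpus_bleu(hyps, [refs])  # single reference stream
print(bleu.score)
```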
We provide models trained with the default configuration:
Baidu Cloud (password: bdwp)