final push
thevasudevgupta committed Apr 9, 2021
1 parent 58b9e37 commit e1899b5
Showing 1 changed file, README.md, with 54 additions and 28 deletions.
# Optimizing Adapters for Neural Machine Translation

This repository hosts my experiments from the project I did with [OffNote Labs](https://github.com/ofnote). **A blog post summarizing the project can be found** [**here**](https://medium.com/offnote-labs/build-a-model-which-can-translate-multiple-indian-languages-to-english-very-efficiently-reduce-55375fb0e1ea).

## Details

The `mBART` code is taken from **🤗Transformers**, and I modified it to add adapters. My changes are limited to this [folder](src/transformers/models/bart/); the rest of the 🤗Transformers code is untouched.

The adapter-specific code can be found [here](src/transformers/models/bart/adapter_modeling.py), and the remaining boilerplate code is [here](src/).
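For intuition, each adapter is a small bottleneck module inserted into the otherwise frozen mBART layers, so only a small fraction of the parameters is trained per language pair. The snippet below is only a minimal sketch of the usual down-project, non-linearity, up-project, residual pattern; the project's real modules (and `AdapterConfig`) live in [adapter_modeling.py](src/transformers/models/bart/adapter_modeling.py).

```python
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Illustrative adapter: down-project, non-linearity, up-project, residual.
    Sizes are placeholders, not the ones used in this repo."""

    def __init__(self, input_size: int = 1024, hidden_size: int = 64):
        super().__init__()
        self.down = nn.Linear(input_size, hidden_size)
        self.activation = nn.ReLU()
        self.up = nn.Linear(hidden_size, input_size)

    def forward(self, hidden_states):
        # base mBART weights stay frozen; only these adapter weights are updated
        return hidden_states + self.up(self.activation(self.down(hidden_states)))
```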

**Checkpoints**

| Model details | Checkpoint | Size |
|-----------------------------|------------------------------------------------------------------------------------|--------|
| `hin->eng` adapters | [`adapters-hin-eng`](https://huggingface.co/vasudevgupta/offnote-mbart-adapters-bhasha/resolve/main/adapters-hin-eng.pt) | 104.6 MB |
| `hin->eng` Fine-tuned model | [`mbart-bhasha-hin-eng`](https://huggingface.co/vasudevgupta/mbart-bhasha-hin-eng) | 2.3 GB |
| `guj->eng` adapters | [`adapters-guj-eng`](https://huggingface.co/vasudevgupta/offnote-mbart-adapters-bhasha/resolve/main/adapters-guj-eng.pt) | 104.6 MB |
| `guj->eng` Fine-tuned model | [`mbart-bhasha-guj-eng`](https://huggingface.co/vasudevgupta/mbart-bhasha-guj-eng) | 2.3 GB |

*Note: The Bhasha **guj-eng** dataset has only 59K samples, while the **hin-eng** dataset has 260K samples. The `hin->eng` adapters (trained on top of `mbart-large-cc25`) score 25.11 BLEU.*

Other checkpoints can be found [here](https://drive.google.com/drive/folders/1NjzNfwYO4Hx2yNXyMLHVLgG4GKbHwTB4?usp=sharing).
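If you would rather fetch a single checkpoint than `git clone` the whole repository (see *Setting Up* below), the files can be downloaded straight from the `resolve` URLs in the table above. This is just an illustrative snippet, not part of the project's code.

```python
import requests

# direct link from the checkpoints table above
URL = (
    "https://huggingface.co/vasudevgupta/offnote-mbart-adapters-bhasha"
    "/resolve/main/adapters-hin-eng.pt"
)

# stream the ~105 MB file to disk
with requests.get(URL, stream=True) as response:
    response.raise_for_status()
    with open("adapters-hin-eng.pt", "wb") as f:
        for chunk in response.iter_content(chunk_size=1 << 20):
            f.write(chunk)
```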

**Setting Up**

```shell
# install requirements
pip3 install -r requirements.txt
cd src

# download & extract the dataset
wget http://preon.iiit.ac.in/~jerin/resources/datasets/pib_v1.3.tar.gz
tar -xzf pib_v1.3.tar.gz

# the adapter checkpoints (only ~200 MB in total) can be downloaded with:
git clone https://huggingface.co/vasudevgupta/offnote-mbart-adapters-bhasha
```
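As a quick sanity check after cloning, you can inspect one of the adapter checkpoints and confirm how small it is compared to the 2.3 GB fully fine-tuned models. This assumes the `.pt` files are plain PyTorch state dicts (tensors keyed by parameter name); if the checkpoint is structured differently, adjust the loop accordingly.

```python
import torch

# load the hin->eng adapter weights on CPU (assumed to be a flat state dict)
state_dict = torch.load(
    "offnote-mbart-adapters-bhasha/adapters-hin-eng.pt", map_location="cpu"
)

num_params = sum(t.numel() for t in state_dict.values())
size_mb = sum(t.numel() * t.element_size() for t in state_dict.values()) / 1e6
print(f"{len(state_dict)} tensors, {num_params / 1e6:.1f}M parameters, ~{size_mb:.0f} MB")
```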

**Using this project**

```python
# run this from src/ directory
from transformers import MBartForConditionalGeneration, MBartTokenizer
from transformers.models.bart.adapter_modeling import AdapterConfig

# initialize mBART from pre-trained weights
tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

# Now let's add adapters which will enable multilingual translation

# decide which adapters to add & how to configure them
adapter_args = {
    "dec_ffn_adapter": True,
    "enc_self_attn_adapter": True,
    # ... (remaining adapter flags & configs are collapsed in this diff view)
    "dec_tok_embed_adapter_config": AdapterConfig(input_size=1024),
}

# add randomly-initialized adapters
model.add_adapter_(**adapter_args)

# loading hin-eng adapters
model.load_adapter("offnote-mbart-adapters-bhasha/adapters-hin-eng.pt", map_location="cpu")

# or, loading guj-eng adapters
model.load_adapter("offnote-mbart-adapters-bhasha/adapters-guj-eng.pt", map_location="cpu")

# the model is now ready for inference, just like any other 🤗Transformers model
```
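As a hypothetical end-to-end example, here is how a Hindi sentence could be translated once the `hin->eng` adapters are loaded. The decoding setup (`src_lang`, `lang_code_to_id`, `decoder_start_token_id`) follows the standard 🤗Transformers mBART recipe and may differ from what `infer.py` actually uses.

```python
# recreate the tokenizer with explicit language codes for hin->eng translation
tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-cc25", src_lang="hi_IN", tgt_lang="en_XX"
)

inputs = tokenizer("संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है", return_tensors="pt")

# force the decoder to generate English
generated = model.generate(
    **inputs,
    decoder_start_token_id=tokenizer.lang_code_to_id["en_XX"],
    num_beams=5,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```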

**Reproducing our results**

```shell
# for training with best-adapters case on hin-eng
python3 train.py --config "best_adapters_hin"
# for normal-finetuning on hin-eng
python3 train.py --config "full_train_hin"

# for training with best-adapters case on guj-eng
python3 train.py --config "best_adapters_guj"
# for normal-finetuning on guj-eng
python3 train.py --config "full_train_guj"

# for inference & bleu score logging over test, run this (after changing config)
python3 infer.py
```
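`infer.py` handles inference and logs BLEU over the test split. For reference only, a corpus-level BLEU score can be computed with `sacrebleu` roughly as follows; this is a sketch, not the project's actual evaluation code.

```python
import sacrebleu

# hypotheses: model translations; references: gold English test sentences
hypotheses = ["the un chief says there is no military solution in syria"]
references = [["the united nations chief says there is no military solution in syria"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```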
