DistilBERT-SQuAD

DistilBERT Question Answering model using SQuAD, served with Flask and deployed on Google Cloud Run

Try the demo at qa.oliverproud.com

What is DistilBERT?

Thanks to the brilliant people at Hugging Face 🤗 we now have DistilBERT, short for Distilled BERT. DistilBERT is a small, fast, cheap and light Transformer model based on the BERT architecture. It has 40% fewer parameters than bert-base-uncased and runs 60% faster, while preserving 97% of BERT's performance as measured on the GLUE language understanding benchmark. DistilBERT is trained using knowledge distillation, a technique that compresses a large model, called the teacher, into a smaller model, called the student. By distilling BERT, we obtain a smaller Transformer model that bears a lot of similarities with the original BERT model while being lighter, smaller and faster to run. DistilBERT is thus an interesting option for putting a large-scale pre-trained Transformer model into production. (Adapted from the Hugging Face Transformers repository.)
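The distillation objective blends a "soft" loss, which pushes the student to match the teacher's softened output distribution, with the usual "hard" loss on the gold labels. A generic PyTorch sketch of that idea is below; it is illustrative only (the temperature and weighting values are made up, and this is not the exact Hugging Face training code):

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft loss: KL divergence between the student's and the teacher's
    # temperature-softened distributions (scaled by T^2, as is conventional).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard loss: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard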

Victor Sanh of Hugging Face wrote a great Medium post introducing DistilBERT and explaining parts of their newly released NeurIPS 2019 Workshop paper.

The Stanford Question Answering Dataset (SQuAD)

Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.

https://rajpurkar.github.io/SQuAD-explorer/
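For reference, each entry in the SQuAD v1.1 JSON pairs a context paragraph with questions and character-indexed answer spans, roughly in the shape below (the passage, question and ID are invented for illustration):

# Rough shape of a single SQuAD v1.1 entry (invented example text).
squad_example = {
    "title": "DistilBERT",
    "paragraphs": [
        {
            "context": "DistilBERT was introduced by Hugging Face in 2019.",
            "qas": [
                {
                    "id": "example-0001",
                    "question": "Who introduced DistilBERT?",
                    "answers": [
                        # answer_start is the character offset of the
                        # answer span inside the context string.
                        {"text": "Hugging Face", "answer_start": 29},
                    ],
                },
            ],
        },
    ],
}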

Installation

If you are testing this on your own machine, I would recommend running it in a virtual environment or in Docker, so as not to affect the rest of your files.

Docker

To pull the image from my Docker Hub repository:

docker pull oliverproud/distilbert-squad-flask

Run the container:

docker run -dp 8080:8080 oliverproud/distilbert-squad-flask

Alternatively, clone this repository, then:

Build the container:

docker build -t distilbert-squad-flask .

Run the container:

docker run -dp 8080:8080 distilbert-squad-flask

Python venv

In Python 3 you can set up a virtual environment with:

python3 -m venv /path/to/new/virtual/environment

Or install virtualenv with pip:

pip3 install virtualenv

then create the environment with:

virtualenv venv

and activate it with:

source venv/bin/activate

You must have Python 3 installed.

Install the requirements with:

pip3 install -r requirements.txt

Contact

If you have any questions, feedback or problems of any kind, get in touch by messaging me on Twitter - @oliverwproud or by submitting an issue.

SQuAD Fine-tuned model

The SQuAD fine-tuned model is available in my S3 bucket. Alternatively, inside model.py you can specify which model to use: the one I have provided, or Hugging Face's fine-tuned SQuAD model

distilbert-base-uncased-distilled-squad.

You can do this with:

model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-uncased-distilled-squad', config=config)
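For context, a minimal, self-contained inference sketch using that checkpoint might look like the following. The question and context strings are made up, and it assumes a reasonably recent transformers release where the model output exposes start_logits and end_logits:

import torch
from transformers import DistilBertForQuestionAnswering, DistilBertTokenizer

model_name = "distilbert-base-uncased-distilled-squad"
tokenizer = DistilBertTokenizer.from_pretrained(model_name)
model = DistilBertForQuestionAnswering.from_pretrained(model_name)
model.eval()

question = "What does DistilBERT preserve?"
context = "DistilBERT preserves 97% of BERT's performance on the GLUE benchmark."

inputs = tokenizer.encode_plus(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Take the most likely start and end token positions and decode that span.
start = int(torch.argmax(outputs.start_logits))
end = int(torch.argmax(outputs.end_logits)) + 1
answer = tokenizer.decode(inputs["input_ids"][0][start:end])
print(answer)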

Deploying to Google Cloud Run

Please see this quick start guide from Google on how to deploy to Google Cloud Run.

If you follow just part one and create the project, navigate to the Cloud Run console and click the SET UP CONTINUOUS DEPLOYMENT button; from there you can deploy directly from your GitHub repo using the Dockerfile.

Alternatively, follow the entire quick start guide.

Making predictions

You can test the model using test.py or using the provided Flask interface.
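If you would rather hit the Flask service programmatically than through the web page, a sketch with Python's requests library is below. The /predict route and the JSON field names are assumptions for illustration only; check the Flask app and test.py for the actual routes and payload format:

import requests

# Hypothetical payload; the real field names may differ.
payload = {
    "question": "Who maintains this repository?",
    "context": "This repository is maintained by Oliver Proud.",
}

# The Docker commands above publish the app on port 8080.
response = requests.post("http://localhost:8080/predict", json=payload)
print(response.json())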

[Screenshot: Flask question-answering interface]

If you would like to try out the demo, head to qa.oliverproud.com

[Screenshot: qa.oliverproud.com demo]

How to train (Distil)BERT

The SQuAD v1.1 data can be downloaded from the SQuAD explorer site (https://rajpurkar.github.io/SQuAD-explorer/) and should be saved in a $SQUAD_DIR directory.

Training on a single Tesla V100 16 GB GPU, each epoch took around 9 minutes to complete; in comparison, on a single Quadro M4000 each epoch took over 2 hours, so don't be alarmed if your training isn't lightning fast.

export SQUAD_DIR=/path/to/SQUAD

python run_squad.py \
  --model_type distilbert \
  --model_name_or_path distilbert-base-uncased \
  --do_train \
  --do_eval \
  --do_lower_case \
  --train_file $SQUAD_DIR/train-v1.1.json \
  --predict_file $SQUAD_DIR/dev-v1.1.json \
  --per_gpu_train_batch_size 12 \
  --learning_rate 3e-5 \
  --num_train_epochs 2.0 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /tmp/debug_squad/
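
Once training finishes, the weights written to --output_dir can be loaded just like a hub checkpoint. A minimal sketch, assuming run_squad.py saved both the model and the tokenizer there (as the Hugging Face example script normally does):

from transformers import DistilBertForQuestionAnswering, DistilBertTokenizer

# Load the fine-tuned weights written to --output_dir above.
model = DistilBertForQuestionAnswering.from_pretrained("/tmp/debug_squad/")
tokenizer = DistilBertTokenizer.from_pretrained("/tmp/debug_squad/")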
