Quantize Speech Recognition Models with OpenVINO™ Post-Training Optimization Tool

This tutorial demonstrates how to apply INT8 quantization to the Wav2Vec2 speech recognition model using the Post-Training Optimization Tool API (POT API), which is part of the OpenVINO Toolkit. It uses a fine-tuned Wav2Vec2-Base-960h PyTorch model trained on the LibriSpeech ASR corpus. The tutorial code is designed to be extensible to custom models and datasets.
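For readers new to the term, the following is a minimal sketch of what INT8 quantization does to a set of floating-point values. It assumes a simple symmetric, per-tensor scheme and uses only the Python standard library. The function names `quantize_int8` and `dequantize_int8` are hypothetical and are not part of the POT API, which chooses its own quantization parameters from calibration data.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to signed 8-bit integers with one shared scale (illustrative only)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 when every value is zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric INT8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


# Toy values standing in for model weights (not taken from Wav2Vec2).
weights = [0.4, -1.0, 0.25, 0.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize_int8(q_weights, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the rounding error that the accuracy validation step is meant to keep in check.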

Notebook Contents

The tutorial consists of the following steps:

Downloading and preparing the Wav2Vec2 model and LibriSpeech dataset.
Defining data loading and accuracy validation functionality.
Preparing the model for quantization.
Running the optimization pipeline.
Comparing the performance of the original and quantized models.
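As an illustration of the accuracy validation step, the sketch below computes word error rate (WER), a common metric for speech recognition. This is an assumption: the list above does not name the metric, and the function names `edit_distance` and `word_error_rate` as well as the sample transcripts are hypothetical. It relies only on the Python standard library and is independent of the POT API.

```python
from typing import List


def edit_distance(reference: List[str], hypothesis: List[str]) -> int:
    """Word-level Levenshtein distance computed one row at a time."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def word_error_rate(references: List[str], hypotheses: List[str]) -> float:
    """Total word-level edits divided by the total number of reference words."""
    total_errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_errors += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    return total_errors / total_words if total_words else 0.0


# Made-up transcripts for illustration; these are not outputs of any real model.
references = ["the cat sat on the mat"]
original_output = ["the cat sat on the mat"]
quantized_output = ["the cat sat on mat"]

wer_original = word_error_rate(references, original_output)
wer_quantized = word_error_rate(references, quantized_output)
```

Computing the same metric for the original and the quantized model on the same validation samples gives a direct measure of how much accuracy the quantization costs.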