Post-Training Quantization of PyTorch models with NNCF

This tutorial demonstrates how to use NNCF 8-bit quantization in post-training mode (without the fine-tuning pipeline) to optimize a PyTorch model for high-speed inference with the OpenVINO Toolkit. For more advanced NNCF usage, refer to these examples.

To speed up download and validation, this tutorial uses a pre-trained ResNet-50 model on the Tiny ImageNet dataset.

Notebook contents

The tutorial consists of the following steps:

Evaluating the original model.
Transforming the original FP32 model to INT8.
Exporting optimized and original models to ONNX and then to OpenVINO IR.
Comparing performance of the obtained FP32 and INT8 models.
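
To make the FP32-to-INT8 step more concrete, here is a minimal, library-free sketch of the arithmetic behind 8-bit quantization. It is not NNCF's implementation: NNCF derives quantization parameters from calibration data and applies them across the whole model, whereas this sketch assumes simple symmetric per-tensor quantization of a short list of values. The helper names `quantize_int8` and `dequantize_int8` are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of float values to signed 8-bit integers.

    Illustrative only; the scheme (symmetric, per-tensor, range [-127, 127])
    is an assumption of this sketch, not a description of NNCF internals.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # The scale maps the largest magnitude onto 127; use 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the INT8 range used here.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map INT8 values back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The round-trip error of each value is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The point of the sketch is that each value is stored as an 8-bit integer plus a shared scale, so the rounding error per value stays within half a quantization step. This is why the tutorial evaluates accuracy both before and after quantization.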

Installation Instructions

If you have not installed all required dependencies, follow the Installation Guide.
