In this demo, we will use the Hugging Face `transformers` and `datasets` libraries with Amazon SageMaker to fine-tune a pre-trained transformer on binary text classification. In particular, we will fine-tune the pre-trained DistilBERT model on the Amazon Reviews Polarity dataset.
We will then deploy the resulting model to a SageMaker endpoint for real-time inference.
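The fine-tune-then-deploy flow above can be sketched with the SageMaker Python SDK's Hugging Face estimator. The entry-point script name, framework versions, instance types, and S3 paths below are illustrative assumptions, not part of this repo; the AWS calls are wrapped in a function so the sketch can be read (and imported) without credentials.

```python
# Sketch of the training-and-deployment flow described above.
# All names marked "assumption" are placeholders, not values from this demo.

hyperparameters = {
    "model_name": "distilbert-base-uncased",  # pre-trained checkpoint
    "epochs": 1,
    "train_batch_size": 32,
}

def run_training_and_deploy(role, train_s3_uri, test_s3_uri):
    # Imported inside the function so the module loads without sagemaker installed.
    from sagemaker.huggingface import HuggingFace

    estimator = HuggingFace(
        entry_point="train.py",        # training script name (assumption)
        instance_type="ml.p3.2xlarge", # GPU training instance (assumption)
        instance_count=1,
        role=role,
        transformers_version="4.26",   # illustrative version pins
        pytorch_version="1.13",
        py_version="py39",
        hyperparameters=hyperparameters,
    )
    # Launch the training job on the S3 train/test channels.
    estimator.fit({"train": train_s3_uri, "test": test_s3_uri})

    # Deploy the trained model to a real-time SageMaker endpoint.
    predictor = estimator.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.xlarge",  # CPU inference instance (assumption)
    )
    return predictor.predict({"inputs": "Great product, works as advertised."})
```

In a real run you would pass an IAM role ARN and the S3 URIs produced by your data-preparation step, and delete the endpoint afterwards to stop incurring charges.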
We'll be using DistilBERT, a distilled variant of BERT that is smaller, and therefore faster and cheaper for both training and inference. A pre-trained model is available in the `transformers` library from Hugging Face.
The Amazon Reviews Polarity dataset consists of reviews from Amazon. The data span a period of 18 years, including ~35 million reviews up to March 2013. Reviews include product and user information, ratings, and a plaintext review. It's available as the `amazon_polarity` dataset on Hugging Face.
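For a sense of how the dataset is consumed, here is a minimal sketch that loads `amazon_polarity` and tokenizes it for DistilBERT. Loading requires network access (and the `datasets`/`transformers` libraries), so it is wrapped in a function; the checkpoint name is the standard `distilbert-base-uncased`.

```python
# Sketch: load the amazon_polarity dataset and tokenize it for DistilBERT.
# Requires network access, so the work is deferred into a function.

def load_and_tokenize(split="train"):
    from datasets import load_dataset
    from transformers import AutoTokenizer

    # Each example has `title`, `content`, and a binary `label`
    # (0 = negative, 1 = positive).
    dataset = load_dataset("amazon_polarity", split=split)
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

    # Tokenize the review body; truncation keeps inputs within model limits.
    return dataset.map(
        lambda batch: tokenizer(batch["content"], truncation=True),
        batched=True,
    )
```

The resulting dataset carries `input_ids` and `attention_mask` columns alongside the original fields, ready to feed to a `Trainer` or a SageMaker training script.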
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.