
github-actions[bot] edited this page Jun 16, 2023 · 26 revisions

# gpt2-medium

## Overview

**Description:** GPT-2 is a Transformer-based language model designed primarily for use by AI researchers and practitioners. Its intended uses include studying the behavior, capabilities, biases, and constraints of large-scale generative language models. Secondary use cases include writing assistance, creative writing and art, and entertainment. The model was trained on a large corpus of English data, but the training data has not been released for public browsing. The authors do not support using the model to generate text that must be factually accurate, nor do they recommend deploying it in systems that interact with humans, because of the biases inherent in the model.

The model was pretrained on raw text in a self-supervised manner, using an automatic process that generates inputs and labels from the words in the text. Inputs were tokenized with byte-level Byte Pair Encoding (BPE) using a vocabulary of 50,257 tokens and a sequence length of 1,024 tokens. The authors warn that predictions generated by the model may contain harmful stereotypes and offensive content.

> The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.
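The byte-pair encoding mentioned above builds its vocabulary by repeatedly merging the most frequent adjacent symbol pair. A toy sketch of one merge step follows; this is illustrative only, and GPT-2's actual byte-level BPE (with its 50,257-token vocabulary and learned merge table) differs in detail:

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent symbol pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# One training iteration on a tiny corpus of characters:
tokens = list("low lower lowest")
pair = most_frequent_pair(tokens)   # ('l', 'o') occurs three times
tokens = merge_pair(tokens, pair)   # 'l','o' pairs collapse into 'lo'
```

Real BPE repeats this merge loop until the vocabulary reaches the target size, then applies the learned merges in order at tokenization time.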
### Inference samples

Inference type|Python sample (Notebook)|CLI with YAML
--|--|--
Real time|text-generation-online-endpoint.ipynb|text-generation-online-endpoint.sh
Batch|text-generation-batch-endpoint.ipynb|coming soon

### Finetuning samples

Task|Use case|Dataset|Python sample (Notebook)|CLI with YAML
--|--|--|--|--
Text Classification|Emotion Detection|Emotion|emotion-detection.ipynb|emotion-detection.sh
Token Classification|Named Entity Recognition|Conll2003|named-entity-recognition.ipynb|named-entity-recognition.sh

### Model Evaluation

Task|Use case|Dataset|Python sample (Notebook)|CLI with YAML
--|--|--|--|--
Text generation|Text generation|cnn_dailymail|evaluate-model-text-generation.ipynb|evaluate-model-text-generation.yml

### Sample inputs and outputs (for real-time inference)

#### Sample input

```json
{
    "inputs": {
        "input_string": ["My name is John and I am", "Once upon a time,"]
    }
}
```

#### Sample output

```json
[
    {
        "0": "My name is John and I am part of the world's largest open computer lab, one of the largest academic computer labs in the world, with over"
    },
    {
        "0": "Once upon a time, when I was twenty myself, I read John Milton's Paradise Lost in a small magazine. I was struck by a passage:"
    }
]
```
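The real-time request body shown above can be assembled and sent programmatically. A minimal sketch using only the standard library, assuming a hypothetical scoring URI and API key for a deployed online endpoint (both are placeholders, not values from this model card):

```python
import json
import urllib.request

def build_request(prompts):
    """Build the JSON body in the shape the sample input above uses."""
    return {"inputs": {"input_string": list(prompts)}}

def score(scoring_uri, api_key, prompts, timeout=30):
    """POST the prompts to the scoring URI and return the parsed response.

    `scoring_uri` and `api_key` come from your own endpoint deployment.
    """
    body = json.dumps(build_request(prompts)).encode("utf-8")
    req = urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())

# Payload for the two sample prompts shown above:
payload = build_request(["My name is John and I am", "Once upon a time,"])
```

The response is expected to be a JSON list with one generated string per input prompt, as in the sample output above.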

Version: 6

## Tags

Preview · license: mit · task: text-generation

View in Studio: https://ml.azure.com/registries/azureml/models/gpt2-medium/version/6

License: mit

## Properties

SHA: 425b0cc90498ac177aa51ba07be26fc2fea6af9d

datasets:

evaluation-min-sku-spec: 2|0|14|28

evaluation-recommended-sku: Standard_DS3_v2

finetune-min-sku-spec: 4|1|28|176

finetune-recommended-sku: Standard_NC24rs_v3

finetuning-tasks: text-classification, token-classification

inference-min-sku-spec: 2|0|14|28

inference-recommended-sku: Standard_DS3_v2

languages: en
