Describe the bug
I trained a GPT-2 model from scratch using LanguageModelingModel and saved it to disk. When I then started a new process and tried to load it, it reported:
RuntimeError: Error(s) in loading state_dict for GPT2LMHeadModel:
size mismatch for transformer.wte.weight: copying a param with shape torch.Size([375, 768]) from checkpoint, the shape in current model is torch.Size([10000, 768]).
size mismatch for lm_head.weight: copying a param with shape torch.Size([375, 768]) from checkpoint, the shape in current model is torch.Size([10000, 768]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
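The error itself comes from PyTorch's `load_state_dict`: the checkpoint's embedding and LM-head weights were trained with a 375-token vocabulary, but the freshly constructed model was built with a 10000-token vocabulary. A minimal sketch of that mechanism, using plain `torch.nn.Embedding` with the same shapes (this is an illustration of the failure mode, not the actual simpletransformers code path):

```python
import tempfile

import torch
import torch.nn as nn

# Checkpoint saved from a model with a 375-token vocabulary.
small = nn.Embedding(375, 768)
with tempfile.NamedTemporaryFile(suffix=".pt") as f:
    torch.save(small.state_dict(), f.name)

    # Loading it into a model built with a 10000-token vocabulary
    # raises the same "size mismatch" RuntimeError as in the report.
    big = nn.Embedding(10000, 768)
    try:
        big.load_state_dict(torch.load(f.name))
        mismatch = False
    except RuntimeError as e:
        mismatch = "size mismatch" in str(e)

print(mismatch)
```

This suggests the loading path is constructing the model from a default or argument-supplied vocabulary size (10000) instead of the vocabulary size stored with the checkpoint (375).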
To Reproduce
Generate a model using the train_new_lm.py script shipped in the examples directory. Try to load the model with:
from simpletransformers.language_modeling import LanguageModelingModel

model = LanguageModelingModel(
    "gpt2",
    "./outputs/from_scratch/best_model",
)
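For contrast, loading the checkpoint directly with transformers' `from_pretrained` does not hit this, because it reads the `config.json` saved alongside the weights and builds the model with the trained vocabulary size. A hedged sketch using a tiny locally-created model (the small dimensions are illustrative, not the real GPT-2 sizes):

```python
import tempfile

from transformers import GPT2Config, GPT2LMHeadModel

with tempfile.TemporaryDirectory() as d:
    # Save a model whose config records a 375-token vocabulary,
    # mimicking the from-scratch checkpoint in the report.
    cfg = GPT2Config(vocab_size=375, n_embd=32, n_layer=1, n_head=2, n_positions=64)
    GPT2LMHeadModel(cfg).save_pretrained(d)

    # from_pretrained reconstructs the model from the saved config,
    # so the weight shapes agree and no size mismatch is raised.
    reloaded = GPT2LMHeadModel.from_pretrained(d)

print(reloaded.config.vocab_size)
```

This is why the exception points at the wrapper's model construction rather than the checkpoint itself.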
Expected behavior
No exception.
Desktop (please complete the following information):
Linux