Commit

Fix T5 and mistral model meta data error (#4958)
Fix 'NotImplementedError: Cannot copy out of meta tensor; no data!'
raised when loading T5 and Mistral models from the meta device.

Co-authored-by: Logan Adams <[email protected]>
Yejing-Lai and loadams authored Jan 19, 2024
1 parent 96c5a87 commit e62a47e
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion deepspeed/module_inject/auto_tp.py
@@ -132,7 +132,8 @@ class Loading():
     def is_load_module(module):
         load_layers = [nn.Linear, nn.Embedding, nn.LayerNorm]
         load_layer_names = [
-            "LPLayerNorm", "SharedEmbedding", "OPTLearnedPositionalEmbedding", "LlamaRMSNorm", "FalconLinear"
+            "LPLayerNorm", "SharedEmbedding", "OPTLearnedPositionalEmbedding", "LlamaRMSNorm", "FalconLinear",
+            "MistralRMSNorm", "T5LayerNorm"
         ]
         return module.__class__ in load_layers or module._get_name() in load_layer_names
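
The check above matches on the exact class or on the class name, so a model-specific norm layer is only treated as loadable once its name is listed. The following is a minimal, torch-free sketch of that behavior, not the DeepSpeed implementation: the `Module` base class and the empty layer classes are hypothetical stand-ins for the `torch.nn` and model classes, and `OLD_NAMES` / `NEW_NAMES` are illustrative names for the list before and after this commit.

```python
# Hypothetical stand-ins for torch.nn classes, so the sketch runs without torch.
class Module:
    def _get_name(self):
        # Assumption: mirrors torch.nn.Module._get_name(), which returns the class name.
        return self.__class__.__name__


class Linear(Module): pass
class Embedding(Module): pass
class LayerNorm(Module): pass
class MistralRMSNorm(Module): pass    # model-specific norm, not one of the generic classes
class T5LayerNorm(Module): pass       # model-specific norm, not one of the generic classes
class MistralAttention(Module): pass  # a container module that should not match


LOAD_LAYERS = [Linear, Embedding, LayerNorm]

# Name list before the commit.
OLD_NAMES = [
    "LPLayerNorm", "SharedEmbedding", "OPTLearnedPositionalEmbedding",
    "LlamaRMSNorm", "FalconLinear",
]
# Name list after the commit.
NEW_NAMES = OLD_NAMES + ["MistralRMSNorm", "T5LayerNorm"]


def is_load_module(module, load_layer_names):
    # Same predicate as in the diff: exact class match OR class-name match.
    return module.__class__ in LOAD_LAYERS or module._get_name() in load_layer_names


if __name__ == "__main__":
    for layer in (Linear(), MistralRMSNorm(), T5LayerNorm(), MistralAttention()):
        print(layer._get_name(),
              is_load_module(layer, OLD_NAMES),
              is_load_module(layer, NEW_NAMES))
```

Because `module.__class__ in load_layers` is an exact class comparison rather than an `isinstance` check, `MistralRMSNorm` and `T5LayerNorm` are not recognized by the old name list. Going by the commit message, the two added names are what resolves the meta tensor error for these models.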
