
[Bug]: Can not load model using embeddings with PEFT #3623

Open
MattGPT-ai opened this issue Feb 26, 2025 · 0 comments
Labels
bug Something isn't working

Comments

@MattGPT-ai (Contributor)

Describe the bug

When using a LoRA config with transformer embeddings, a model such as TextClassifier fails to load, because the PEFT wrapper changes the state dict keys.

To Reproduce

import flair.data
import flair.embeddings
import flair.models
from peft import LoraConfig, TaskType

peft_config = LoraConfig(
    task_type=TaskType.FEATURE_EXTRACTION,
    inference_mode=False,
)

peft_embeddings = flair.embeddings.TransformerEmbeddings(peft_config=peft_config)

model = flair.models.TextClassifier(embeddings=peft_embeddings, label_type='label', label_dictionary=flair.data.Dictionary())

model.save('/tmp/bert_perf.pt')

model = flair.models.TextClassifier.load('/tmp/bert_perf.pt')

Expected behavior

The model should load successfully.

Logs and Stack traces

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[3], line 1
----> 1 model = flair.models.TextClassifier.load('/tmp/bert_perf.pt')

File /pyzr/active_venv/lib/python3.12/site-packages/flair/models/text_classification_model.py:139, in TextClassifier.load(cls, model_path)
    135 @classmethod
    136 def load(cls, model_path: Union[str, Path, dict[str, Any]]) -> "TextClassifier":
    137     from typing import cast
--> 139     return cast("TextClassifier", super().load(model_path=model_path))

File /pyzr/active_venv/lib/python3.12/site-packages/flair/nn/model.py:1012, in DefaultClassifier.load(cls, model_path)
   1008 @classmethod
   1009 def load(cls, model_path: Union[str, Path, dict[str, Any]]) -> "DefaultClassifier":
   1010     from typing import cast
-> 1012     return cast("DefaultClassifier", super().load(model_path=model_path))

File /pyzr/active_venv/lib/python3.12/site-packages/flair/nn/model.py:583, in Classifier.load(cls, model_path)
    579 @classmethod
    580 def load(cls, model_path: Union[str, Path, dict[str, Any]]) -> "Classifier":
    581     from typing import cast
--> 583     return cast("Classifier", super().load(model_path=model_path))

File /pyzr/active_venv/lib/python3.12/site-packages/flair/nn/model.py:200, in Model.load(cls, model_path)
    197 if "__cls__" in state:
    198     state.pop("__cls__")
--> 200 model = cls._init_model_with_state_dict(state)
    202 if "model_card" in state:
    203     model.model_card = state["model_card"]

File /pyzr/active_venv/lib/python3.12/site-packages/flair/models/text_classification_model.py:83, in TextClassifier._init_model_with_state_dict(cls, state, **kwargs)
     80 for key in list(state_dict.keys()):
     81     state_dict[re.sub("^document_embeddings\\.", "embeddings.", key)] = state_dict.pop(key)
---> 83 return super()._init_model_with_state_dict(
     84     state,
     85     embeddings=state.get("document_embeddings"),
     86     label_dictionary=state.get("label_dictionary"),
     87     label_type=state.get("label_type"),
     88     multi_label=state.get("multi_label"),
     89     multi_label_threshold=state.get("multi_label_threshold", 0.5),
     90     loss_weights=state.get("weight_dict"),
     91     **kwargs,
     92 )

File /pyzr/active_venv/lib/python3.12/site-packages/flair/nn/model.py:989, in DefaultClassifier._init_model_with_state_dict(cls, state, **kwargs)
    986     if arg not in kwargs and arg in state:
    987         kwargs[arg] = state[arg]
--> 989 return super(Classifier, cls)._init_model_with_state_dict(state, **kwargs)

File /pyzr/active_venv/lib/python3.12/site-packages/flair/nn/model.py:106, in Model._init_model_with_state_dict(cls, state, **kwargs)
    102     kwargs["embeddings"] = embeddings
    104 model = cls(**kwargs)
--> 106 model.load_state_dict(state["state_dict"])
    108 return model

File /pyzr/active_venv/lib/python3.12/site-packages/torch/nn/modules/module.py:2581, in Module.load_state_dict(self, state_dict, strict, assign)
   2573         error_msgs.insert(
   2574             0,
   2575             "Missing key(s) in state_dict: {}. ".format(
   2576                 ", ".join(f'"{k}"' for k in missing_keys)
   2577             ),
   2578         )
   2580 if len(error_msgs) > 0:
-> 2581     raise RuntimeError(
   2582         "Error(s) in loading state_dict for {}:\n\t{}".format(
   2583             self.__class__.__name__, "\n\t".join(error_msgs)
   2584         )
   2585     )
   2586 return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for TextClassifier:
	Missing key(s) in state_dict: "embeddings.model.embeddings.word_embeddings.weight", "embeddings.model.embeddings.position_embeddings.weight", "embeddings.model.embeddings.token_type_embeddings.weight", "embeddings.model.embeddings.LayerNorm.weight", "embeddings.model.embeddings.LayerNorm.bias", "embeddings.model.encoder.layer.0.attention.self.query.weight", "embeddings.model.encoder.layer.0.attention.self.query.bias", "embeddings.model.encoder.layer.0.attention.self.key.weight", "embeddings.model.encoder.layer.0.attention.self.key.bias", "embeddings.model.encoder.layer.0.attention.self.value.weight", "embeddings.model.encoder.layer.0.attention.self.value.bias", "embeddings.model.encoder.layer.0.attention.output.dense.weight", "embeddings.model.encoder.layer.0.attention.output.dense.bias", "embeddings.model.encoder.layer.0.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.0.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.0.intermediate.dense.weight", "embeddings.model.encoder.layer.0.intermediate.dense.bias", "embeddings.model.encoder.layer.0.output.dense.weight", "embeddings.model.encoder.layer.0.output.dense.bias", "embeddings.model.encoder.layer.0.output.LayerNorm.weight", "embeddings.model.encoder.layer.0.output.LayerNorm.bias", "embeddings.model.encoder.layer.1.attention.self.query.weight", "embeddings.model.encoder.layer.1.attention.self.query.bias", "embeddings.model.encoder.layer.1.attention.self.key.weight", "embeddings.model.encoder.layer.1.attention.self.key.bias", "embeddings.model.encoder.layer.1.attention.self.value.weight", "embeddings.model.encoder.layer.1.attention.self.value.bias", "embeddings.model.encoder.layer.1.attention.output.dense.weight", "embeddings.model.encoder.layer.1.attention.output.dense.bias", "embeddings.model.encoder.layer.1.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.1.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.1.intermediate.dense.weight", 
"embeddings.model.encoder.layer.1.intermediate.dense.bias", "embeddings.model.encoder.layer.1.output.dense.weight", "embeddings.model.encoder.layer.1.output.dense.bias", "embeddings.model.encoder.layer.1.output.LayerNorm.weight", "embeddings.model.encoder.layer.1.output.LayerNorm.bias", "embeddings.model.encoder.layer.2.attention.self.query.weight", "embeddings.model.encoder.layer.2.attention.self.query.bias", "embeddings.model.encoder.layer.2.attention.self.key.weight", "embeddings.model.encoder.layer.2.attention.self.key.bias", "embeddings.model.encoder.layer.2.attention.self.value.weight", "embeddings.model.encoder.layer.2.attention.self.value.bias", "embeddings.model.encoder.layer.2.attention.output.dense.weight", "embeddings.model.encoder.layer.2.attention.output.dense.bias", "embeddings.model.encoder.layer.2.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.2.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.2.intermediate.dense.weight", "embeddings.model.encoder.layer.2.intermediate.dense.bias", "embeddings.model.encoder.layer.2.output.dense.weight", "embeddings.model.encoder.layer.2.output.dense.bias", "embeddings.model.encoder.layer.2.output.LayerNorm.weight", "embeddings.model.encoder.layer.2.output.LayerNorm.bias", "embeddings.model.encoder.layer.3.attention.self.query.weight", "embeddings.model.encoder.layer.3.attention.self.query.bias", "embeddings.model.encoder.layer.3.attention.self.key.weight", "embeddings.model.encoder.layer.3.attention.self.key.bias", "embeddings.model.encoder.layer.3.attention.self.value.weight", "embeddings.model.encoder.layer.3.attention.self.value.bias", "embeddings.model.encoder.layer.3.attention.output.dense.weight", "embeddings.model.encoder.layer.3.attention.output.dense.bias", "embeddings.model.encoder.layer.3.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.3.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.3.intermediate.dense.weight", 
"embeddings.model.encoder.layer.3.intermediate.dense.bias", "embeddings.model.encoder.layer.3.output.dense.weight", "embeddings.model.encoder.layer.3.output.dense.bias", "embeddings.model.encoder.layer.3.output.LayerNorm.weight", "embeddings.model.encoder.layer.3.output.LayerNorm.bias", "embeddings.model.encoder.layer.4.attention.self.query.weight", "embeddings.model.encoder.layer.4.attention.self.query.bias", "embeddings.model.encoder.layer.4.attention.self.key.weight", "embeddings.model.encoder.layer.4.attention.self.key.bias", "embeddings.model.encoder.layer.4.attention.self.value.weight", "embeddings.model.encoder.layer.4.attention.self.value.bias", "embeddings.model.encoder.layer.4.attention.output.dense.weight", "embeddings.model.encoder.layer.4.attention.output.dense.bias", "embeddings.model.encoder.layer.4.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.4.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.4.intermediate.dense.weight", "embeddings.model.encoder.layer.4.intermediate.dense.bias", "embeddings.model.encoder.layer.4.output.dense.weight", "embeddings.model.encoder.layer.4.output.dense.bias", "embeddings.model.encoder.layer.4.output.LayerNorm.weight", "embeddings.model.encoder.layer.4.output.LayerNorm.bias", "embeddings.model.encoder.layer.5.attention.self.query.weight", "embeddings.model.encoder.layer.5.attention.self.query.bias", "embeddings.model.encoder.layer.5.attention.self.key.weight", "embeddings.model.encoder.layer.5.attention.self.key.bias", "embeddings.model.encoder.layer.5.attention.self.value.weight", "embeddings.model.encoder.layer.5.attention.self.value.bias", "embeddings.model.encoder.layer.5.attention.output.dense.weight", "embeddings.model.encoder.layer.5.attention.output.dense.bias", "embeddings.model.encoder.layer.5.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.5.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.5.intermediate.dense.weight", 
"embeddings.model.encoder.layer.5.intermediate.dense.bias", "embeddings.model.encoder.layer.5.output.dense.weight", "embeddings.model.encoder.layer.5.output.dense.bias", "embeddings.model.encoder.layer.5.output.LayerNorm.weight", "embeddings.model.encoder.layer.5.output.LayerNorm.bias", "embeddings.model.encoder.layer.6.attention.self.query.weight", "embeddings.model.encoder.layer.6.attention.self.query.bias", "embeddings.model.encoder.layer.6.attention.self.key.weight", "embeddings.model.encoder.layer.6.attention.self.key.bias", "embeddings.model.encoder.layer.6.attention.self.value.weight", "embeddings.model.encoder.layer.6.attention.self.value.bias", "embeddings.model.encoder.layer.6.attention.output.dense.weight", "embeddings.model.encoder.layer.6.attention.output.dense.bias", "embeddings.model.encoder.layer.6.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.6.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.6.intermediate.dense.weight", "embeddings.model.encoder.layer.6.intermediate.dense.bias", "embeddings.model.encoder.layer.6.output.dense.weight", "embeddings.model.encoder.layer.6.output.dense.bias", "embeddings.model.encoder.layer.6.output.LayerNorm.weight", "embeddings.model.encoder.layer.6.output.LayerNorm.bias", "embeddings.model.encoder.layer.7.attention.self.query.weight", "embeddings.model.encoder.layer.7.attention.self.query.bias", "embeddings.model.encoder.layer.7.attention.self.key.weight", "embeddings.model.encoder.layer.7.attention.self.key.bias", "embeddings.model.encoder.layer.7.attention.self.value.weight", "embeddings.model.encoder.layer.7.attention.self.value.bias", "embeddings.model.encoder.layer.7.attention.output.dense.weight", "embeddings.model.encoder.layer.7.attention.output.dense.bias", "embeddings.model.encoder.layer.7.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.7.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.7.intermediate.dense.weight", 
"embeddings.model.encoder.layer.7.intermediate.dense.bias", "embeddings.model.encoder.layer.7.output.dense.weight", "embeddings.model.encoder.layer.7.output.dense.bias", "embeddings.model.encoder.layer.7.output.LayerNorm.weight", "embeddings.model.encoder.layer.7.output.LayerNorm.bias", "embeddings.model.encoder.layer.8.attention.self.query.weight", "embeddings.model.encoder.layer.8.attention.self.query.bias", "embeddings.model.encoder.layer.8.attention.self.key.weight", "embeddings.model.encoder.layer.8.attention.self.key.bias", "embeddings.model.encoder.layer.8.attention.self.value.weight", "embeddings.model.encoder.layer.8.attention.self.value.bias", "embeddings.model.encoder.layer.8.attention.output.dense.weight", "embeddings.model.encoder.layer.8.attention.output.dense.bias", "embeddings.model.encoder.layer.8.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.8.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.8.intermediate.dense.weight", "embeddings.model.encoder.layer.8.intermediate.dense.bias", "embeddings.model.encoder.layer.8.output.dense.weight", "embeddings.model.encoder.layer.8.output.dense.bias", "embeddings.model.encoder.layer.8.output.LayerNorm.weight", "embeddings.model.encoder.layer.8.output.LayerNorm.bias", "embeddings.model.encoder.layer.9.attention.self.query.weight", "embeddings.model.encoder.layer.9.attention.self.query.bias", "embeddings.model.encoder.layer.9.attention.self.key.weight", "embeddings.model.encoder.layer.9.attention.self.key.bias", "embeddings.model.encoder.layer.9.attention.self.value.weight", "embeddings.model.encoder.layer.9.attention.self.value.bias", "embeddings.model.encoder.layer.9.attention.output.dense.weight", "embeddings.model.encoder.layer.9.attention.output.dense.bias", "embeddings.model.encoder.layer.9.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.9.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.9.intermediate.dense.weight", 
"embeddings.model.encoder.layer.9.intermediate.dense.bias", "embeddings.model.encoder.layer.9.output.dense.weight", "embeddings.model.encoder.layer.9.output.dense.bias", "embeddings.model.encoder.layer.9.output.LayerNorm.weight", "embeddings.model.encoder.layer.9.output.LayerNorm.bias", "embeddings.model.encoder.layer.10.attention.self.query.weight", "embeddings.model.encoder.layer.10.attention.self.query.bias", "embeddings.model.encoder.layer.10.attention.self.key.weight", "embeddings.model.encoder.layer.10.attention.self.key.bias", "embeddings.model.encoder.layer.10.attention.self.value.weight", "embeddings.model.encoder.layer.10.attention.self.value.bias", "embeddings.model.encoder.layer.10.attention.output.dense.weight", "embeddings.model.encoder.layer.10.attention.output.dense.bias", "embeddings.model.encoder.layer.10.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.10.attention.output.LayerNorm.bias", "embeddings.model.encoder.layer.10.intermediate.dense.weight", "embeddings.model.encoder.layer.10.intermediate.dense.bias", "embeddings.model.encoder.layer.10.output.dense.weight", "embeddings.model.encoder.layer.10.output.dense.bias", "embeddings.model.encoder.layer.10.output.LayerNorm.weight", "embeddings.model.encoder.layer.10.output.LayerNorm.bias", "embeddings.model.encoder.layer.11.attention.self.query.weight", "embeddings.model.encoder.layer.11.attention.self.query.bias", "embeddings.model.encoder.layer.11.attention.self.key.weight", "embeddings.model.encoder.layer.11.attention.self.key.bias", "embeddings.model.encoder.layer.11.attention.self.value.weight", "embeddings.model.encoder.layer.11.attention.self.value.bias", "embeddings.model.encoder.layer.11.attention.output.dense.weight", "embeddings.model.encoder.layer.11.attention.output.dense.bias", "embeddings.model.encoder.layer.11.attention.output.LayerNorm.weight", "embeddings.model.encoder.layer.11.attention.output.LayerNorm.bias", 
"embeddings.model.encoder.layer.11.intermediate.dense.weight", "embeddings.model.encoder.layer.11.intermediate.dense.bias", "embeddings.model.encoder.layer.11.output.dense.weight", "embeddings.model.encoder.layer.11.output.dense.bias", "embeddings.model.encoder.layer.11.output.LayerNorm.weight", "embeddings.model.encoder.layer.11.output.LayerNorm.bias", "embeddings.model.pooler.dense.weight", "embeddings.model.pooler.dense.bias". 
	Unexpected key(s) in state_dict: "embeddings.model.base_model.model.embeddings.word_embeddings.weight", "embeddings.model.base_model.model.embeddings.position_embeddings.weight", "embeddings.model.base_model.model.embeddings.token_type_embeddings.weight", "embeddings.model.base_model.model.embeddings.LayerNorm.weight", "embeddings.model.base_model.model.embeddings.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.0.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.0.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.0.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.0.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.0.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.0.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.0.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.0.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.0.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.0.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.0.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.0.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.0.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.0.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.0.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.0.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.0.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.0.output.dense.bias", 
"embeddings.model.base_model.model.encoder.layer.0.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.0.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.1.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.1.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.1.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.1.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.1.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.1.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.1.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.1.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.1.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.1.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.1.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.1.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.1.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.1.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.1.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.1.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.1.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.1.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.1.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.1.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.2.attention.self.query.base_layer.weight", 
"embeddings.model.base_model.model.encoder.layer.2.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.2.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.2.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.2.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.2.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.2.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.2.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.2.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.2.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.2.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.2.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.2.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.2.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.2.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.2.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.2.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.2.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.2.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.2.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.3.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.3.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.3.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.3.attention.self.query.lora_B.default.weight", 
"embeddings.model.base_model.model.encoder.layer.3.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.3.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.3.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.3.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.3.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.3.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.3.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.3.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.3.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.3.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.3.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.3.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.3.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.3.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.3.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.3.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.4.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.4.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.4.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.4.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.4.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.4.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.4.attention.self.value.base_layer.weight", 
"embeddings.model.base_model.model.encoder.layer.4.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.4.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.4.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.4.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.4.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.4.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.4.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.4.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.4.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.4.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.4.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.4.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.4.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.5.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.5.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.5.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.5.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.5.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.5.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.5.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.5.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.5.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.5.attention.self.value.lora_B.default.weight", 
"embeddings.model.base_model.model.encoder.layer.5.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.5.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.5.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.5.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.5.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.5.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.5.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.5.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.5.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.5.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.6.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.6.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.6.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.6.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.6.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.6.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.6.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.6.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.6.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.6.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.6.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.6.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.6.attention.output.LayerNorm.weight", 
"embeddings.model.base_model.model.encoder.layer.6.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.6.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.6.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.6.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.6.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.6.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.6.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.7.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.7.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.7.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.7.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.7.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.7.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.7.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.7.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.7.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.7.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.7.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.7.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.7.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.7.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.7.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.7.intermediate.dense.bias", 
"embeddings.model.base_model.model.encoder.layer.7.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.7.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.7.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.7.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.8.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.8.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.8.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.8.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.8.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.8.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.8.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.8.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.8.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.8.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.8.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.8.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.8.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.8.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.8.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.8.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.8.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.8.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.8.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.8.output.LayerNorm.bias", 
"embeddings.model.base_model.model.encoder.layer.9.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.9.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.9.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.9.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.9.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.9.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.9.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.9.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.9.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.9.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.9.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.9.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.9.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.9.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.9.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.9.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.9.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.9.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.9.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.9.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.10.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.10.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.10.attention.self.query.lora_A.default.weight", 
"embeddings.model.base_model.model.encoder.layer.10.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.10.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.10.attention.self.key.bias", "embeddings.model.base_model.model.encoder.layer.10.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.10.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.10.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.10.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.10.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.10.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.10.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.10.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.10.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.10.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.10.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.10.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.10.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.10.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.11.attention.self.query.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.11.attention.self.query.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.11.attention.self.query.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.11.attention.self.query.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.11.attention.self.key.weight", "embeddings.model.base_model.model.encoder.layer.11.attention.self.key.bias", 
"embeddings.model.base_model.model.encoder.layer.11.attention.self.value.base_layer.weight", "embeddings.model.base_model.model.encoder.layer.11.attention.self.value.base_layer.bias", "embeddings.model.base_model.model.encoder.layer.11.attention.self.value.lora_A.default.weight", "embeddings.model.base_model.model.encoder.layer.11.attention.self.value.lora_B.default.weight", "embeddings.model.base_model.model.encoder.layer.11.attention.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.11.attention.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.11.attention.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.11.attention.output.LayerNorm.bias", "embeddings.model.base_model.model.encoder.layer.11.intermediate.dense.weight", "embeddings.model.base_model.model.encoder.layer.11.intermediate.dense.bias", "embeddings.model.base_model.model.encoder.layer.11.output.dense.weight", "embeddings.model.base_model.model.encoder.layer.11.output.dense.bias", "embeddings.model.base_model.model.encoder.layer.11.output.LayerNorm.weight", "embeddings.model.base_model.model.encoder.layer.11.output.LayerNorm.bias", "embeddings.model.base_model.model.pooler.dense.weight", "embeddings.model.base_model.model.pooler.dense.bias".
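For context on why the keys no longer match: PEFT wraps the transformer, so every parameter key gains a `base_model.model.` prefix (and LoRA-targeted layers additionally split into `base_layer`/`lora_A`/`lora_B` sub-keys). The sketch below is a hypothetical illustration of that prefix mismatch using plain dicts; `strip_peft_prefix` is an invented helper, not a Flair or PEFT API, and this is not a proposed fix.

```python
# Hypothetical illustration (not part of Flair/PEFT): a checkpoint saved
# from a PEFT-wrapped model carries a "base_model.model." prefix in each
# state-dict key, so it no longer matches the plain-transformer layout.
peft_keys = [
    "embeddings.model.base_model.model.pooler.dense.weight",
    "embeddings.model.base_model.model.pooler.dense.bias",
]

def strip_peft_prefix(key: str) -> str:
    """Remove the first PEFT wrapper prefix to recover the plain key."""
    return key.replace("base_model.model.", "", 1)

plain_keys = [strip_peft_prefix(k) for k in peft_keys]
# plain_keys == ["embeddings.model.pooler.dense.weight",
#                "embeddings.model.pooler.dense.bias"]
```

Note that LoRA adapter keys (`lora_A.default.weight`, etc.) have no counterpart in the plain model at all, so a real loader would also need to handle those, not just strip the prefix.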

Screenshots

No response

Additional Context

No response

Environment

Versions:
- Flair: 0.15.1
- Pytorch: 2.6.0+cu124
- Transformers: 4.49.0
- GPU: False
- PEFT: 0.14.0

@MattGPT-ai MattGPT-ai added the bug Something isn't working label Feb 26, 2025