DeepSeek 671B fine-tuning: how to merge the LoRA model #6222

Open
Roysky opened this issue Feb 26, 2025 · 2 comments

Comments


Roysky commented Feb 26, 2025

After fine-tuning DeepSeek 671B, how do I load the original model together with the LoRA adapter to test it?

Any help would be appreciated.
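To be concrete, roughly what I'm after is something like the sketch below (paths are placeholders and this is untested on the 671B checkpoint): load the original checkpoint, attach the LoRA adapter without merging, and run generation.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder paths -- adjust to the actual checkpoints
base_model_path = "./DeepSeek-R1-bf16"          # original bf16 base checkpoint
lora_path = "./DeepSeek-R1-bf16-lora/lora"      # LoRA adapter directory

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)

# Load the base model; device_map="auto" lets accelerate shard/offload it
base = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter without merging it into the base weights
model = PeftModel.from_pretrained(base, lora_path)
model.eval()

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))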


lean-wang commented Feb 26, 2025

We used the following code for the conversion:

from peft import PeftModel, PeftConfig
from transformers import AutoModel, AutoTokenizer
import torch

# Load configuration and model
peft_model_id = "./DeepSeek-R1-bf16-lora/lora"
peft_config = PeftConfig.from_pretrained(peft_model_id, trust_remote_code=True)
base_model = AutoModel.from_pretrained(peft_config.base_model_name_or_path, trust_remote_code=True)

# Load the model with LoRA parameters
model = PeftModel.from_pretrained(base_model, peft_model_id)

# Merge LoRA weights into the base model
model = model.merge_and_unload()

# Now, 'model' contains the merged weights and can be used for inference or saved as a new model
model.save_pretrained("./model_merge")

Our machine has 2 TB of memory and 2 TB of swap space. We attempted to merge the LoRA weights, but the result is missing weights: compared with the model before merging, the merged model is missing over 1000 layers of weights.
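One thing we still want to try is comparing the parameter names before and after merge_and_unload to see exactly which keys disappear. A rough diagnostic sketch (untested at this scale; it uses the same placeholder path as above and loads the model via AutoModelForCausalLM instead of AutoModel, which should also pull in the LM head):

from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

peft_model_id = "./DeepSeek-R1-bf16-lora/lora"
peft_config = PeftConfig.from_pretrained(peft_model_id, trust_remote_code=True)

# AutoModel loads only the transformer backbone; AutoModelForCausalLM should
# also include the language-model head, so the head's weights stay in the comparison.
base_model = AutoModelForCausalLM.from_pretrained(
    peft_config.base_model_name_or_path, trust_remote_code=True
)
base_keys = set(base_model.state_dict().keys())

merged = PeftModel.from_pretrained(base_model, peft_model_id).merge_and_unload()
merged_keys = set(merged.state_dict().keys())

missing = sorted(base_keys - merged_keys)
print(f"{len(missing)} parameter names missing after merge")
print(missing[:20])  # sample of the missing names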

@wotulong

What's your fine-tuning environment? Thanks.
