DeepSeek 671B fine-tuning: how to merge the LoRA model #6222

Open
Roysky opened this issue Feb 26, 2025 · 2 comments

Comments


Roysky commented Feb 26, 2025

After fine-tuning DeepSeek 671B, how do I load the original model together with the LoRA adapter to test it?

Any help would be appreciated.
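To be concrete, roughly what I'm after is something like the sketch below (paths are placeholders and this is untested on the 671B checkpoint): load the original checkpoint, attach the LoRA adapter without merging, and run generation.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder paths -- adjust to the actual checkpoints
base_model_path = "./DeepSeek-R1-bf16"          # original bf16 base checkpoint
lora_path = "./DeepSeek-R1-bf16-lora/lora"      # LoRA adapter directory

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)

# Load the base model; device_map="auto" lets accelerate shard/offload it
base = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter without merging it into the base weights
model = PeftModel.from_pretrained(base, lora_path)
model.eval()

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))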


lean-wang commented Feb 26, 2025

We used the following code for the conversion:

from peft import PeftModel, PeftConfig
from transformers import AutoModel, AutoTokenizer
import torch

# Load configuration and model
peft_model_id = "./DeepSeek-R1-bf16-lora/lora"
peft_config = PeftConfig.from_pretrained(peft_model_id, trust_remote_code=True)
base_model = AutoModel.from_pretrained(peft_config.base_model_name_or_path, trust_remote_code=True)

# Load the model with LoRA parameters
model = PeftModel.from_pretrained(base_model, peft_model_id)

# Merge LoRA weights into the base model
model = model.merge_and_unload()

# Now, 'model' contains the merged weights and can be used for inference or saved as a new model
model.save_pretrained("./model_merge")

Our machine has 2 TB of memory and 2 TB of swap space. We attempted to merge the LoRA weights, but the result is missing weights: compared with the model before merging, the merged model is missing over 1000 layers of weights.
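One thing we still want to try is comparing the parameter names before and after merge_and_unload to see exactly which keys disappear. A rough diagnostic sketch (untested at this scale; it uses the same placeholder path as above and loads the model via AutoModelForCausalLM instead of AutoModel, which should also pull in the LM head):

from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

peft_model_id = "./DeepSeek-R1-bf16-lora/lora"
peft_config = PeftConfig.from_pretrained(peft_model_id, trust_remote_code=True)

# AutoModel loads only the transformer backbone; AutoModelForCausalLM should
# also include the language-model head, so the head's weights stay in the comparison.
base_model = AutoModelForCausalLM.from_pretrained(
    peft_config.base_model_name_or_path, trust_remote_code=True
)
base_keys = set(base_model.state_dict().keys())

merged = PeftModel.from_pretrained(base_model, peft_model_id).merge_and_unload()
merged_keys = set(merged.state_dict().keys())

missing = sorted(base_keys - merged_keys)
print(f"{len(missing)} parameter names missing after merge")
print(missing[:20])  # sample of the missing names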

@wotulong

What's your fine-tuning environment? Thanks.
