Error when using the officially provided DPO dataset template #2968
Comments
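Have you solved this? I ran into the same problem.
Not yet. I'm waiting for someone from the ModelScope team to answer.
Sorry, my problem was probably different from yours. I just fixed it; my own data was wrong.
It works fine in my test. Could you try upgrading ms-swift?
OK, I'll try that.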
Hello, I reinstalled the latest version of ms-swift (3.0.3) and ran the following DPO command:
NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 swift rlhf --model_type telechat2 --rlhf_type dpo --model /data/Telechat/TeleChat2/TeleChat2-7B --dataset /data/Telechat/dpo_refusal_dataset_official.jsonl --num_train_epochs 10 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --gradient_accumulation_steps 8 --eval_steps 10 --save_steps 10 --save_total_limit 5 --logging_steps 5 --max_length 2048 --output_dir output --ddp_find_unused_parameters true --warmup_ratio 0.05 --dataloader_num_workers 4 --deepspeed zero2
This produces a new error:
It is probably a problem with the TeleChat2 7B model. I think I modified its parameters earlier, so I tried the original official version from ModelScope. The original version has the same problem. Could you please take a look and see whether this can be fixed?
Could you share your run parameters?
How much GPU memory is needed to run this?
Try --dtype float16 or float32.
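When I use the officially provided DPO dataset template:
I built the dataset /data/Telechat/dpo_refusal_dataset_official.jsonl from it and included it in the fine-tuning run below. If I leave this dataset out and use only hjh0119/shareAI-Llama3-DPO-zh-en-emoji, training runs normally.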
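One commenter traced a similar failure to a mistake in their own data, so a quick structural check of the JSONL file may help rule that out before rerunning. The following is a minimal sketch and not part of ms-swift: `check_jsonl` is a hypothetical helper that only verifies each line parses as a JSON object and tallies which sets of top-level keys occur. It makes no assumption about which field names the official DPO template requires.

```python
import json
from collections import Counter


def check_jsonl(lines):
    """Check an iterable of JSONL lines.

    Returns (key_sets, problems):
      key_sets - Counter mapping each sorted tuple of top-level keys
                 to the number of records that use it
      problems - list of (line_number, message) for lines that are
                 blank, not valid JSON, or not a JSON object
    """
    key_sets = Counter()
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            problems.append((lineno, "blank line"))
            continue
        try:
            record = json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((lineno, "not a JSON object"))
            continue
        # Records with the same fields share one key tuple.
        key_sets[tuple(sorted(record))] += 1
    return key_sets, problems
```

To use it, open the dataset file with `encoding="utf-8"` and pass the file object to `check_jsonl`. More than one entry in `key_sets`, or any entry in `problems`, suggests that some records differ in shape from the rest and are worth inspecting by hand.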
Dataset contents:
I ran the DPO fine-tuning command, based on the reference script:
NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 swift rlhf --model_type telechat2-115b --rlhf_type dpo --model_id_or_path /data/TeleChat2-7B --dataset hjh0119/shareAI-Llama3-DPO-zh-en-emoji#100 /data/Telechat/dpo_refusal_dataset_official.jsonl#70 --num_train_epochs 10 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --gradient_accumulation_steps 8 --eval_steps 10 --save_steps 10 --save_total_limit 5 --logging_steps 5 --max_length 2048 --output_dir output --warmup_ratio 0.05 --dataloader_num_workers 4
The error is as follows:
Your hardware and system info
ms-swift Version: 2.6.1