
Finetuning DPLM results in worse generation #11

Open
pengzhangzhi opened this issue Oct 8, 2024 · 2 comments

pengzhangzhi commented Oct 8, 2024

Hi,

I simply loaded the pretrained weights and fine-tuned the model on the same dataset, and the resulting checkpoint generates more repetitive sequences than I expected. This is quite bizarre to me. Is there something wrong with the current training code, or are the released checkpoints just too good?

cc @zhengzx-nlp @wxy-nlp @leiyu-bytedance @lark

wxy-nlp (Collaborator) commented Oct 10, 2024

Hello @pengzhangzhi,

Could you provide the generation results and describe how you load the checkpoint?

By the way, if you use the config yaml in config/experiment/lm and continue training from the pretrained weights, the learning rate is large, which may cause a large change to the pretrained weights and lead to bad performance. So if you want to continue training, starting the learning rate from the ending rate, i.e., 1e-5, may be better.
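
To make this concrete, here is a minimal, illustrative sketch (not the actual byprot/DPLM training code; the model, the peak LR value, and the optimizer setup are stand-ins) of the difference between re-running a pretraining-style schedule and continuing training at a small constant LR close to the ending rate:

```python
# Illustrative sketch only -- not the byprot/DPLM training code.
# The point: when continuing from converged pretrained weights, use a small
# constant LR near the rate the pretraining schedule ended at (~1e-5),
# rather than the large warmup/peak LR of a fresh pretraining config.
import torch

model = torch.nn.Linear(16, 16)  # stand-in for the pretrained DPLM-150M

# Fresh pretraining-style optimizer: large peak LR (value here is illustrative).
pretrain_style_opt = torch.optim.AdamW(model.parameters(), lr=4e-4)

# Continued training from the released weights: small constant LR.
finetune_opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
```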

pengzhangzhi (Author) commented Oct 10, 2024

Hi @wxy-nlp,

Thanks!!
I load the checkpoint from the following path and run generation with:

```bash
c=dplm/byprot-checkpoints/dplm_150m_finetune_lr_1e-8/checkpoints/last.ckpt
python generate.py --model_name "airkingbd/${model_name}" --seq_lens 100 --saveto ${output_dir} --num_seqs 100
```
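
For reference, here is a minimal inspection sketch (assuming a standard PyTorch Lightning checkpoint layout; the exact keys may differ for byprot/DPLM checkpoints) to confirm what the fine-tuned `last.ckpt` actually contains, e.g. the training step and the learning rate recorded in the optimizer state:

```python
# Minimal sketch -- assumes a standard PyTorch Lightning checkpoint layout;
# exact key names may differ for byprot/DPLM checkpoints.
import torch

ckpt_path = "byprot-checkpoints/dplm_150m_finetune_lr_1e-8/checkpoints/last.ckpt"
ckpt = torch.load(ckpt_path, map_location="cpu")

print("global_step:", ckpt.get("global_step"))
print("epoch:", ckpt.get("epoch"))

# Learning rate recorded in the optimizer state at the end of training.
for opt_state in ckpt.get("optimizer_states", []):
    print("lr:", [g["lr"] for g in opt_state["param_groups"]])

# A few state_dict keys, to confirm which module's weights are stored.
print("sample keys:", list(ckpt.get("state_dict", {}))[:5])
```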

I tried setting a smaller LR, even 1e-8, but fine-tuning still gradually degrades the pLDDT. Below is a comparison between the base DPLM-150M and DPLM-150M fine-tuned with LR 1e-8:

| Model | pLDDT |
| --- | --- |
| Base DPLM-150M | 69.44743 |
| Fine-tuned (LR 1e-8) | 66.5991 |

If I use LR 1e-5 or anything larger than 1e-8, the generation is completely broken... :(
If you want to verify this, you can simply set the LR to 1e-5, load the checkpoint, and fine-tune the model for a couple thousand steps.

Also, could you please share the training configs for DPLM-150M with us? I remember that in the paper you employ two-stage training; I wonder what the hyper-parameters and training steps are for each stage. I would love to reproduce your training.
