Fine-tuning on the LCQMC dataset degrades performance #19

elihuan1990 · 2021-08-24T11:39:00Z

After fine-tuning SimBERT on the LCQMC dataset, the Spearman score on the test set dropped by one point. How should SimBERT be fine-tuned?

bojone · 2021-12-29T04:08:22Z

You can fine-tune it the Sentence-BERT way.
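For readers unfamiliar with the term, the following is a minimal sketch of the pieces a Sentence-BERT style setup usually involves: each sentence is encoded separately into a vector, the pair is classified during fine-tuning from the features `[u, v, |u - v|]`, and at test time the cosine similarity of the two vectors is compared with the labels via the Spearman correlation. The encoder itself is left out; the function names and the toy numbers below are illustrative assumptions, not code from this repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sbert_features(u, v):
    """Pair features fed to the classifier during fine-tuning: [u, v, |u - v|]."""
    return list(u) + list(v) + [abs(a - b) for a, b in zip(u, v)]


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy sentence vectors (made up) standing in for encoder outputs.
pairs = [
    ([1.0, 0.0], [0.9, 0.1]),   # similar pair
    ([1.0, 0.0], [0.0, 1.0]),   # dissimilar pair
    ([0.5, 0.5], [0.6, 0.4]),   # similar pair
    ([0.2, 0.8], [0.9, 0.1]),   # dissimilar pair
]
labels = [1, 0, 1, 0]
scores = [cosine(u, v) for u, v in pairs]
print("Spearman:", spearman(scores, labels))
```

The difference from a generation-oriented objective is that the sentence vectors are optimized directly for the pairwise label, which is the quantity the Spearman score on LCQMC measures.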

WenTingTseng · 2023-08-26T16:29:02Z

I have finished training the model with simbert.py and saved best_model.weights. How do I load the best_model.weights model and test it?

```python
from bert4keras.tokenizers import Tokenizer
from bert4keras.models import build_transformer_model
from keras.models import Model
import numpy as np

config_path = '/home/rca/research/simbert/root/kg/bert/chinese_simbert_L-12_H-768_A-12/bert_config.json'
checkpoint_path = './latest_model.ckpt'
dict_path = '/home/rca/research/simbert/root/kg/bert/chinese_simbert_L-12_H-768_A-12/vocab.txt'
maxlen = 32  # assumed value; not defined in the original snippet

tokenizer = Tokenizer(dict_path, do_lower_case=True)

bert = build_transformer_model(
    config_path,
    checkpoint_path,
    with_pool='linear',
    application='unilm',
    return_keras_model=False,
)
model = Model(inputs=bert.model.inputs, outputs=bert.model.outputs)
model.load_weights(checkpoint_path, by_name=True)  # by_name=True is needed when loading the weights

test_sentence = "微信和支付宝哪个好？"

def gen_similar_sentences(text, n=10, k=10):
    similar_sentences = gen_synonyms(text, n, k)  # gen_synonyms still needs to be defined
    return similar_sentences

token_ids, segment_ids = tokenizer.encode(test_sentence, max_length=maxlen)

output_ids = model.predict([np.array([token_ids]), np.array([segment_ids])])
output_ids = output_ids[0].argmax(axis=1)

generated_sentence = tokenizer.decode(output_ids)

print(f"Original sentence: {test_sentence}")
print(f"Generated sentence: {generated_sentence}")
print("Similar sentences:")
similar_sentences = gen_similar_sentences(test_sentence)
for idx, sentence in enumerate(similar_sentences):
    print(f"{idx + 1}. {sentence}")
```

Is this the right way to write it?

HelenGuohx · 2024-02-19T06:17:01Z

My approach is simply to run `from simbert import gen_synonyms`; that way the model loads the new weights.
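The import trick presumably works because importing a Python script executes its module-level code, so a script that loads saved weights whenever it is imported (rather than run directly) hands back a ready-to-use function. The sketch below demonstrates only that mechanism with a throwaway module; `toy_simbert`, `load_weights`, and `toy.weights` are hypothetical stand-ins, and whether simbert.py is structured this way should be checked against the script itself.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical stand-in for a training script; this is NOT the real simbert.py.
TOY_SCRIPT = textwrap.dedent('''
    loaded_from = None

    def load_weights(path):
        # Pretend to load weights; just remember which file was requested.
        global loaded_from
        loaded_from = path

    def gen_synonyms(text, n=3):
        # Dummy generator that reports which weights are "in use".
        return [f"{text} / variant {i} / {loaded_from}" for i in range(n)]

    if __name__ == "__main__":
        print("training ...")          # runs only with `python toy_simbert.py`
    else:
        load_weights("./toy.weights")  # runs when the module is imported
''')

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "toy_simbert.py"), "w", encoding="utf-8") as f:
        f.write(TOY_SCRIPT)
    sys.path.insert(0, tmp)
    try:
        # Importing executes the module body, including the else branch above.
        toy = importlib.import_module("toy_simbert")
    finally:
        sys.path.remove(tmp)

print(toy.loaded_from)
for sentence in toy.gen_synonyms("微信和支付宝哪个好？"):
    print(sentence)
```

With this pattern, the imported function uses whatever weights file the script names in its import branch, so that filename has to match the file saved during training.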
