some question #6
Why doesn't the number of items under 'data' correspond to the number in 'api_input'?
llm_eval.py examines all results except example_constraints, which are checked by rules instead of by an LLM. After running llm_eval.py, you should run eval.py to get the final result.
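Putting the thread together, the order of operations seems to be: inference first, then LLM labeling, then aggregation. Here is a minimal sketch of that pipeline, assuming the three scripts are run from the repo root; the flag-free invocations are deliberate, since the scripts' actual command-line arguments are not shown in this thread:

```python
# Minimal sketch of the evaluation pipeline order described above.
# The script names match the repo; any arguments they require are
# omitted here because this thread does not spell out their CLIs.
import subprocess

steps = [
    # 1. Generate responses with the model under evaluation.
    ["python", "model_inference.py"],
    # 2. Let the judge LLM label everything except example_constraints,
    #    which are checked by rules instead.
    ["python", "llm_eval.py"],
    # 3. Aggregate rule-based and LLM-based labels into final metrics.
    ["python", "eval.py"],
]

for cmd in steps:
    # check=True aborts the pipeline if any stage fails.
    subprocess.run(cmd, check=True)
```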
model_inference.py is the model being evaluated. The code is far too repetitive 😂, it's exhausting to read.
I'm refactoring this code and have a question: the rule-based evaluation and the LLM-based evaluation should use different questions, right? So it should be possible to split the data into two parts and run them separately? Right now the code is a maze of if statements, and I find the logic completely bewildering.
Evaluation methods: rule-based (rules) + LLM-based (a judge model, GPT). The 'example' data subset is scored with the HSR and CSL metrics; as a special case, it is run directly on the data produced by the model under evaluation. The rule-based path evaluates the outputs of the model under evaluation (llama3, etc.); the LLM-based path uses GPT to label llama3's outputs, which is usually called annotation, and the LLM-based step itself does not aggregate the results.
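To make the split described above concrete, the per-item branching might look roughly like the sketch below. This is a hedged illustration of the described design, not the project's actual code: every function name and dict field here (check_by_rule, label_with_gpt, constraint_type, satisfied) is hypothetical.

```python
# Illustrative sketch of the rule-based / LLM-based split described
# above. All names and fields are hypothetical, not the repo's API.

def check_by_rule(item: dict) -> bool:
    """Placeholder rule check, e.g. for example_constraints."""
    # A real rule would test the response against the constraint.
    return item.get("response", "") != ""

def label_with_gpt(item: dict) -> bool:
    """Placeholder for sending the response to a judge LLM (GPT)."""
    raise NotImplementedError("call the judge-model API here")

def evaluate(data: list[dict]) -> list[dict]:
    for item in data:
        if item["constraint_type"] == "example":
            # Rule-based: checked directly, no judge model involved.
            item["satisfied"] = check_by_rule(item)
        else:
            # LLM-based: GPT only labels the output; aggregation into
            # final metrics happens later, in eval.py, as noted above.
            item["satisfied"] = label_with_gpt(item)
    return data
```

If the two kinds of questions really are disjoint, pre-splitting the data into a rule-based file and an LLM-based file, as suggested in the refactoring question above, would remove the need for this per-item branch entirely.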
There is one point in this code that really confuses me, though I'm not sure whether the problem is on my end:
Hello, I have a question: after I run model_inference.py and get the results, do I need to use my own model to answer all the questions before running llm_eval.py? And what result does llm_eval.py produce once inference is complete? I ask because I saw parameters such as gpt4_discriminative_eval_input_path in llm_eval.py and don't understand how this part works. Looking forward to your reply.
@YJiangcm