LDL-Reward-Gemma-2-27B-v0.1 #215

ShikaiChen · 2025-02-06T03:08:34Z

Reward Model based on Label Distribution Learning:
ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1

To run it:
python ./scripts/run_rm.py --model ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1 --batch_size 1

The expected results are:

{
"alpacaeval-easy": 0.99,
"alpacaeval-hard": 0.9473684210526315,
"alpacaeval-length": 0.9368421052631579,
"donotanswer": 0.8161764705882353,
"hep-cpp": 0.9817073170731707,
"hep-go": 0.9817073170731707,
"hep-java": 0.9939024390243902,
"hep-js": 0.9939024390243902,
"hep-python": 0.975609756097561,
"hep-rust": 0.9573170731707317,
"llmbar-adver-GPTInst": 0.9565217391304348,
"llmbar-adver-GPTOut": 0.8297872340425532,
"llmbar-adver-manual": 0.8478260869565217,
"llmbar-adver-neighbor": 0.9104477611940298,
"llmbar-natural": 0.94,
"math-prm": 1.0,
"mt-bench-easy": 1.0,
"mt-bench-hard": 0.8648648648648649,
"mt-bench-med": 0.975,
"refusals-dangerous": 0.97,
"refusals-offensive": 0.99,
"xstest-should-refuse": 0.948051948051948,
"xstest-should-respond": 0.964,
"model": "ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1",
"model_type": "Seq. Classifier"
}
{'Chat': 0.9636871508379888, 'Chat Hard': 0.9078947368421053, 'Safety': 0.9378378378378378, 'Reasoning': 0.9903455284552846}
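
The four section scores can be cross-checked against the per-subset accuracies. The following is a minimal sketch, not the RewardBench implementation: the per-subset prompt counts in `counts` are assumptions inferred from the denominators of the posted fractions, and the aggregation rule (count-weighted mean per section, with Reasoning taken as the equal-weight mean of `math-prm` and the average of the `hep-*` subsets) is likewise an assumption. All variable and function names are illustrative.

```python
# Subset accuracies copied from the results above.
scores = {
    "alpacaeval-easy": 0.99, "alpacaeval-hard": 0.9473684210526315,
    "alpacaeval-length": 0.9368421052631579, "donotanswer": 0.8161764705882353,
    "hep-cpp": 0.9817073170731707, "hep-go": 0.9817073170731707,
    "hep-java": 0.9939024390243902, "hep-js": 0.9939024390243902,
    "hep-python": 0.975609756097561, "hep-rust": 0.9573170731707317,
    "llmbar-adver-GPTInst": 0.9565217391304348,
    "llmbar-adver-GPTOut": 0.8297872340425532,
    "llmbar-adver-manual": 0.8478260869565217,
    "llmbar-adver-neighbor": 0.9104477611940298,
    "llmbar-natural": 0.94, "math-prm": 1.0, "mt-bench-easy": 1.0,
    "mt-bench-hard": 0.8648648648648649, "mt-bench-med": 0.975,
    "refusals-dangerous": 0.97, "refusals-offensive": 0.99,
    "xstest-should-refuse": 0.948051948051948, "xstest-should-respond": 0.964,
}

# ASSUMPTION: prompts per subset, inferred from the fractions above
# (e.g. 0.9473684... = 90/95). Not taken from the thread itself.
counts = {
    "alpacaeval-easy": 100, "alpacaeval-hard": 95, "alpacaeval-length": 95,
    "mt-bench-easy": 28, "mt-bench-med": 40, "mt-bench-hard": 37,
    "llmbar-natural": 100, "llmbar-adver-neighbor": 134,
    "llmbar-adver-GPTInst": 92, "llmbar-adver-GPTOut": 47,
    "llmbar-adver-manual": 46, "refusals-dangerous": 100,
    "refusals-offensive": 100, "xstest-should-refuse": 154,
    "xstest-should-respond": 250, "donotanswer": 136,
}

# ASSUMPTION: which subsets belong to which section.
sections = {
    "Chat": ["alpacaeval-easy", "alpacaeval-hard", "alpacaeval-length",
             "mt-bench-easy", "mt-bench-med"],
    "Chat Hard": ["mt-bench-hard", "llmbar-natural", "llmbar-adver-neighbor",
                  "llmbar-adver-GPTInst", "llmbar-adver-GPTOut",
                  "llmbar-adver-manual"],
    "Safety": ["refusals-dangerous", "refusals-offensive",
               "xstest-should-refuse", "xstest-should-respond", "donotanswer"],
}
HEP = ["hep-cpp", "hep-go", "hep-java", "hep-js", "hep-python", "hep-rust"]


def weighted_mean(subsets):
    """Accuracy over all prompts in the given subsets, weighted by count."""
    total = sum(counts[s] for s in subsets)
    return sum(scores[s] * counts[s] for s in subsets) / total


section_scores = {name: weighted_mean(subs) for name, subs in sections.items()}

# ASSUMPTION: Reasoning weights math-prm equally against the hep-* average.
hep_mean = sum(scores[s] for s in HEP) / len(HEP)
section_scores["Reasoning"] = (scores["math-prm"] + hep_mean) / 2

for name, value in section_scores.items():
    print(f"{name}: {value:.6f}")
```

Under these assumptions, the recomputed values agree with the four posted section scores up to floating-point rounding.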

natolambert · 2025-02-06T23:11:28Z

Hey @ShikaiChen can you run:

make quality
make style

and fix any issues that show up? Thanks!

ShikaiChen · 2025-02-07T02:57:49Z

> Hey @ShikaiChen can you run:
> make quality
> make style
> and fix any issues that show up? Thanks!

All cleared!

ShikaiChen · 2025-02-07T09:21:02Z

> Hey @ShikaiChen can you run:
> make quality
> make style
> and fix any issues that show up? Thanks!

Due to company regulatory requirements, I need to re-upload the model using my personal account and submit the Pull Request again. Apologies for taking up your time.

natolambert · 2025-02-07T17:51:06Z

Thanks @ShikaiChen, lmk if it's useful to merge this, but I'll plan on waiting for now.

ShikaiChen · 2025-02-08T00:08:07Z

> Thanks @ShikaiChen lmk if it's useful to merge this, but will plan on waiting for now

I've reopened it, and it's ready to be merged now. Thanks!

ShikaiChen · 2025-02-08T02:10:57Z

> Thanks @ShikaiChen lmk if it's useful to merge this, but will plan on waiting for now

Sorry, I just modified the "model" key in the result JSON. The model path in the RewardBench Leaderboard needs to be updated.
The previous model location: lenovo/LDL-Reward-Gemma-2-27B-v0.1
The new model location: ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1
If users click on the old link on the leaderboard, they will encounter a 404 error.
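
The change described here touches a single field of the result record. A minimal sketch of that rewrite, assuming each result is stored as one JSON object; the helper `retarget_model` and the sample record are hypothetical and not part of RewardBench:

```python
import json

OLD_PATH = "lenovo/LDL-Reward-Gemma-2-27B-v0.1"
NEW_PATH = "ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1"


def retarget_model(result: dict, old: str = OLD_PATH, new: str = NEW_PATH) -> dict:
    """Return a copy of `result` with "model" changed from `old` to `new`.

    Scores and every other key are left untouched; records that point at a
    different model are returned unchanged.
    """
    updated = dict(result)
    if updated.get("model") == old:
        updated["model"] = new
    return updated


# Hypothetical, abbreviated record for illustration only.
example = {"math-prm": 1.0, "model": OLD_PATH, "model_type": "Seq. Classifier"}
fixed = retarget_model(example)
print(json.dumps(fixed, indent=2))
```

Returning a copy rather than mutating in place keeps the original record available for comparison, which makes it easy to confirm that only the "model" value differs.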

ShikaiChen · 2025-02-08T02:17:33Z

{
"alpacaeval-easy": 0.99,
"alpacaeval-hard": 0.9473684210526315,
"alpacaeval-length": 0.9368421052631579,
"donotanswer": 0.8161764705882353,
"hep-cpp": 0.9817073170731707,
"hep-go": 0.9817073170731707,
"hep-java": 0.9939024390243902,
"hep-js": 0.9939024390243902,
"hep-python": 0.975609756097561,
"hep-rust": 0.9573170731707317,
"llmbar-adver-GPTInst": 0.9565217391304348,
"llmbar-adver-GPTOut": 0.8297872340425532,
"llmbar-adver-manual": 0.8478260869565217,
"llmbar-adver-neighbor": 0.9104477611940298,
"llmbar-natural": 0.94,
"math-prm": 1.0,
"mt-bench-easy": 1.0,
"mt-bench-hard": 0.8648648648648649,
"mt-bench-med": 0.975,
"refusals-dangerous": 0.97,
"refusals-offensive": 0.99,
"xstest-should-refuse": 0.948051948051948,
"xstest-should-respond": 0.964,
"model": "ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1",
"model_type": "Seq. Classifier"
}
{'Chat': 0.9636871508379888, 'Chat Hard': 0.9078947368421053, 'Safety': 0.9378378378378378, 'Reasoning': 0.9903455284552846}

Nothing changed except the "model": "ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1".

ShikaiChen · 2025-02-08T02:18:08Z

@natolambert

natolambert · 2025-02-08T18:11:06Z

Fixing now. No problem @ShikaiChen

ShikaiChen added 3 commits February 5, 2025 03:23

9535a18 lenovo
69bf752 fix style and quality
9c38d95 fix repo name
1221ccd make quality & make style

ShikaiChen closed this Feb 7, 2025

15addc9 fix file name

ShikaiChen reopened this Feb 7, 2025

natolambert approved these changes Feb 8, 2025


natolambert merged commit 95e9ef9 into allenai:main Feb 8, 2025
5 checks passed
