
LDL-Reward-Gemma-2-27B-v0.1 #215

Merged
merged 5 commits into allenai:main on Feb 8, 2025

Conversation

ShikaiChen
Contributor

@ShikaiChen ShikaiChen commented Feb 6, 2025

Reward model based on Label Distribution Learning: ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1

To run it:
python ./scripts/run_rm.py --model ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1 --batch_size 1

The expected results are:

{
"alpacaeval-easy": 0.99,
"alpacaeval-hard": 0.9473684210526315,
"alpacaeval-length": 0.9368421052631579,
"donotanswer": 0.8161764705882353,
"hep-cpp": 0.9817073170731707,
"hep-go": 0.9817073170731707,
"hep-java": 0.9939024390243902,
"hep-js": 0.9939024390243902,
"hep-python": 0.975609756097561,
"hep-rust": 0.9573170731707317,
"llmbar-adver-GPTInst": 0.9565217391304348,
"llmbar-adver-GPTOut": 0.8297872340425532,
"llmbar-adver-manual": 0.8478260869565217,
"llmbar-adver-neighbor":0.9104477611940298,
"llmbar-natural": 0.94,
"math-prm": 1.0,
"mt-bench-easy": 1.0,
"mt-bench-hard": 0.8648648648648649,
"mt-bench-med": 0.975,
"refusals-dangerous": 0.97,
"refusals-offensive": 0.99,
"xstest-should-refuse": 0.948051948051948,
"xstest-should-respond": 0.964,
"model": "ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1",
"model_type": "Seq. Classifier",
}
{'Chat': 0.9636871508379888, 'Chat Hard': 0.9078947368421053, 'Safety': 0.9378378378378378, 'Reasoning': 0.9903455284552846}
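As a sanity check on the summary line above, the section scores appear to be example-count-weighted averages of the per-subset accuracies. The sketch below reproduces the "Chat" score; the subset example counts are assumptions inferred from the reported fractions (e.g. 0.9473684210526315 = 90/95), not stated in this PR.

```python
# Hypothetical reconstruction of the "Chat" section score from the
# per-subset accuracies reported above. Subset sizes are inferred
# (assumed), not taken from this PR.
chat_subsets = {
    # subset: (accuracy, assumed number of examples)
    "alpacaeval-easy": (0.99, 100),
    "alpacaeval-hard": (0.9473684210526315, 95),
    "alpacaeval-length": (0.9368421052631579, 95),
    "mt-bench-easy": (1.0, 28),
    "mt-bench-med": (0.975, 40),
}

# Example-count-weighted average: total correct / total examples.
correct = sum(acc * n for acc, n in chat_subsets.values())
total = sum(n for _, n in chat_subsets.values())
chat_score = correct / total
print(chat_score)  # ≈ 0.9636871508379888, matching the summary line above
```

With these assumed sizes, (99 + 90 + 89 + 28 + 39) / 358 = 345/358, which agrees with the reported Chat value to floating-point precision.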

@natolambert
Collaborator

Hey @ShikaiChen can you run:

make quality
make style

and fix any issues that show up? Thanks!

@ShikaiChen
Contributor Author

> Hey @ShikaiChen can you run:
>
> make quality
> make style
>
> and fix any issues that show up? Thanks!

All cleared!

@ShikaiChen
Contributor Author

> Hey @ShikaiChen can you run:
>
> make quality
> make style
>
> and fix any issues that show up? Thanks!

Due to company regulatory requirements, I need to re-upload the model under my personal account and submit the pull request again. Apologies for taking up your time.

@ShikaiChen ShikaiChen closed this Feb 7, 2025
@ShikaiChen ShikaiChen reopened this Feb 7, 2025
@natolambert
Collaborator

Thanks @ShikaiChen, lmk if it's useful to merge this, but will plan on waiting for now.

@ShikaiChen
Contributor Author

> Thanks @ShikaiChen, lmk if it's useful to merge this, but will plan on waiting for now.

I've reopened it, and it's ready to be merged now. Thanks!

@natolambert natolambert merged commit 95e9ef9 into allenai:main Feb 8, 2025
5 checks passed
@ShikaiChen
Contributor Author

ShikaiChen commented Feb 8, 2025

> Thanks @ShikaiChen, lmk if it's useful to merge this, but will plan on waiting for now.

Sorry, I just modified the "model" key in the result JSON. The model path on the RewardBench leaderboard needs to be updated.
The previous model location: lenovo/LDL-Reward-Gemma-2-27B-v0.1
The new model location: ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1
If users click the old link on the leaderboard, they will get a 404 error.

@ShikaiChen
Contributor Author

{
"alpacaeval-easy": 0.99,
"alpacaeval-hard": 0.9473684210526315,
"alpacaeval-length": 0.9368421052631579,
"donotanswer": 0.8161764705882353,
"hep-cpp": 0.9817073170731707,
"hep-go": 0.9817073170731707,
"hep-java": 0.9939024390243902,
"hep-js": 0.9939024390243902,
"hep-python": 0.975609756097561,
"hep-rust": 0.9573170731707317,
"llmbar-adver-GPTInst": 0.9565217391304348,
"llmbar-adver-GPTOut": 0.8297872340425532,
"llmbar-adver-manual": 0.8478260869565217,
"llmbar-adver-neighbor":0.9104477611940298,
"llmbar-natural": 0.94,
"math-prm": 1.0,
"mt-bench-easy": 1.0,
"mt-bench-hard": 0.8648648648648649,
"mt-bench-med": 0.975,
"refusals-dangerous": 0.97,
"refusals-offensive": 0.99,
"xstest-should-refuse": 0.948051948051948,
"xstest-should-respond": 0.964,
"model": "ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1",
"model_type": "Seq. Classifier",
}
{'Chat': 0.9636871508379888, 'Chat Hard': 0.9078947368421053, 'Safety': 0.9378378378378378, 'Reasoning': 0.9903455284552846}

Nothing changed except the "model" field, which is now "ShikaiChen/LDL-Reward-Gemma-2-27B-v0.1".

@ShikaiChen
Contributor Author

@natolambert

@natolambert
Collaborator

Fixing now. No problem @ShikaiChen
