
Questions about inconsistent evaluation results #392

Open
coorful opened this issue Jul 24, 2023 · 0 comments
coorful commented Jul 24, 2023

Hi, I have used the DeepSpeed framework to train a GPT-117M model.
When I evaluate the model's performance on WikiText-103, there is a large gap in perplexity (PPL) between the two evaluation paths: running tasks/eval_harness/evaluate.py directly vs. first converting the checkpoint to Megatron format and then using tasks/main.py.
May I ask what causes this discrepancy? @mayank31398
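One possible contributor to such a gap (an assumption on my part, not confirmed from either script) is that the two paths may normalize perplexity differently on WikiText-103, e.g. dividing the total negative log-likelihood by the number of sub-word tokens in one case and by the number of words in the other. A minimal sketch of how much that alone can move the reported number, using made-up per-token losses:

```python
import math

# Hypothetical per-token negative log-likelihoods (in nats) for a short
# evaluation text; real values would come from the model's forward pass.
token_nlls = [2.1, 3.4, 1.8, 2.9, 3.0, 2.2]
num_tokens = len(token_nlls)  # sub-word tokens seen by the model
num_words = 4                 # whitespace-separated words in the same text

total_nll = sum(token_nlls)

# Token-level PPL: normalize the total NLL by the number of model tokens.
ppl_token = math.exp(total_nll / num_tokens)

# Word-level PPL: normalize the same total NLL by the number of words,
# the convention some WikiText-103 evaluations use.
ppl_word = math.exp(total_nll / num_words)

print(f"token-level PPL: {ppl_token:.2f}")  # ≈ 13.02
print(f"word-level PPL:  {ppl_word:.2f}")   # ≈ 46.99
```

The same model loss yields very different PPL figures depending on the denominator, so it is worth checking which normalization each script applies before comparing the results.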
