Question related to how to use the validation and training splits. #2423

sorobedio · 2024-10-24T01:20:48Z

Hello, I would like to know how I can use the validation split to evaluate the models, and similarly, how to use the training split for evaluation if needed. I haven't found an option where the user can specify the dataset split they want to use for model evaluation. Could you provide guidance on how to set this up?
Thank you.

baberabb · 2024-10-24T21:19:34Z

Hi! You can change which splits are used in the task YAML files. We evaluate on the test split if one is provided; otherwise the validation split is used. Example:

lm-evaluation-harness/lm_eval/tasks/arc/arc_easy.yaml

Lines 7 to 9 in 1185e89

    
    training_split: train
    validation_split: validation
    test_split: test

If a fewshot_split is not provided, the few-shot examples are drawn from the first available split in the priority order training > validation > test.
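To make the two priority rules above concrete, here is a minimal Python sketch. It is not the harness's own code: `resolve_splits` is a hypothetical helper, and the config is modelled as a plain dict holding the same keys as the task YAML. It only illustrates which split would end up being used for evaluation and for few-shot examples.

```python
def resolve_splits(config: dict) -> dict:
    """Illustrative only: mirror the split priorities described in this thread.

    `config` is a plain dict with the task-YAML keys (an assumption made
    for this sketch; the harness has its own config object).
    """

    def pick(*keys):
        # Return the value of the first key that is set to a non-empty value.
        for key in keys:
            if config.get(key):
                return config[key]
        return None

    return {
        # Evaluation: test split if provided, otherwise validation.
        "eval": pick("test_split", "validation_split"),
        # Few-shot: explicit fewshot_split, else training > validation > test.
        "fewshot": pick(
            "fewshot_split", "training_split", "validation_split", "test_split"
        ),
    }


# The three keys from arc_easy.yaml shown above.
arc_easy = {
    "training_split": "train",
    "validation_split": "validation",
    "test_split": "test",
}
print(resolve_splits(arc_easy))  # {'eval': 'test', 'fewshot': 'train'}

# Same task with test_split removed: evaluation falls back to validation.
no_test = {k: v for k, v in arc_easy.items() if k != "test_split"}
print(resolve_splits(no_test))  # {'eval': 'validation', 'fewshot': 'train'}
```

Under these rules, evaluating on the validation split comes down to editing the task YAML so that no test split is set. Evaluating on the training split would need the split key used for evaluation to name that split.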

sorobedio · 2024-10-25T05:14:53Z

I see. So the validation set is never used when both the training and test sets are present. Is the Open LM leaderboard following the same approach?
Thank you.

baberabb · 2024-10-25T08:30:06Z

I see. So the validation set is never used when both the training and test sets are present. Is the Open LM leaderboard following the same approach? Thank you.

yes!

baberabb added the "asking questions" label (for asking for clarification / support on library usage) · Oct 25, 2024
