Issues: bigscience-workshop/evaluation
Issues list
Refactor task template to merge multilingual.json and english.json
#64
opened Sep 2, 2021 by
marinecarpuat
Start overleaf for benchmark tech report
documentation
Improvements or additions to documentation
#54
opened Aug 16, 2021 by
epavlick
translate validation prompts into all training languages
multilingual
simple_benchmark
all issues related to the simple_benchmark script
#50
opened Aug 12, 2021 by
epavlick
benchmark mt5 on tydiqa prompting setup
simple_benchmark
all issues related to the simple_benchmark script
#49
opened Aug 12, 2021 by
epavlick
Convert validation code to work with Megatron as well as huggingface
engineering
#48
opened Aug 12, 2021 by
epavlick
Create Targeted Minimal Pair "Stress-Tests" for Sensitivity to Social Groups
social_impact
Benchmark Tasks for Bias and Social Impact
#38
opened Aug 10, 2021 by
epavlick
Add CrowS-Pairs to Full Benchmark
In-Progress
social_impact
Benchmark Tasks for Bias and Social Impact
#37
opened Aug 10, 2021 by
epavlick
Add Jigsaw Toxicity Classification to Full Benchmark
social_impact
Benchmark Tasks for Bias and Social Impact
#36
opened Aug 10, 2021 by
epavlick
Add WinoMT to Full Benchmark
social_impact
Benchmark Tasks for Bias and Social Impact
#35
opened Aug 10, 2021 by
epavlick
Add HANS to Full Benchmark
few_shot
Benchmark Tasks for Few-Shot Generalization
#34
opened Aug 10, 2021 by
epavlick
Add MNLI to Full Benchmark
few_shot
Benchmark Tasks for Few-Shot Generalization
#33
opened Aug 10, 2021 by
epavlick
Add ANLI to Full Benchmark
few_shot
Benchmark Tasks for Few-Shot Generalization
#32
opened Aug 10, 2021 by
epavlick
Add HuffPo Text Classification to Full Benchmark
few_shot
Benchmark Tasks for Few-Shot Generalization
#31
opened Aug 10, 2021 by
epavlick
Add TyDiQA for non-training languages to Full Benchmark
few_shot
Benchmark Tasks for Few-Shot Generalization
multilingual
#30
opened Aug 10, 2021 by
epavlick
Add BioASQ to Full Benchmark
few_shot
Benchmark Tasks for Few-Shot Generalization
#29
opened Aug 10, 2021 by
epavlick
Add QASPER to Full Benchmark
few_shot
Benchmark Tasks for Few-Shot Generalization
#28
opened Aug 10, 2021 by
epavlick
Add Edge Probing Suite to Full Benchmark
linguistic_structure
Benchmark Tasks for CoreNLP/linguistic structure prediction
#27
opened Aug 10, 2021 by
epavlick
Add LAMA to Full Benchmark
In-Progress
linguistic_structure
Benchmark Tasks for CoreNLP/linguistic structure prediction
#26
opened Aug 10, 2021 by
epavlick
Add LinCE Testbed to Full Benchmark
In-Progress
linguistic_structure
Benchmark Tasks for CoreNLP/linguistic structure prediction
multilingual
#25
opened Aug 10, 2021 by
epavlick
Add POS Tagging with UD to Full Benchmark
linguistic_structure
Benchmark Tasks for CoreNLP/linguistic structure prediction
multilingual
#24
opened Aug 10, 2021 by
epavlick
Add QA-SRL to Full Benchmark
linguistic_structure
Benchmark Tasks for CoreNLP/linguistic structure prediction
#23
opened Aug 10, 2021 by
epavlick