Commit

updated New-EVAL folder and evaluation notebook
Nkluge-correa committed Apr 23, 2024
1 parent 0d4e15f commit 6b0b3d7
Showing 13 changed files with 2 additions and 762 deletions.
2 changes: 1 addition & 1 deletion Evaluation/New-EVAL/README.md
@@ -16,7 +16,7 @@ We performed the following evaluations using a [Portuguese implementation of the

- [HateBR](https://arxiv.org/abs/2103.14972) (25-shot) - HateBR is the first large-scale expert annotated dataset of Brazilian Instagram comments for abusive language detection on the web and social media. The HateBR was collected from politicians' Brazilian Instagram comments and manually annotated by specialists. It comprises 7,000 documents annotated with a binary classification (offensive versus non-offensive comments). - Data sources: [[1]](https://huggingface.co/datasets/eduagarcia/portuguese_benchmark), [[2]](https://github.com/franciellevargas/HateBR), [[3]](https://huggingface.co/datasets/ruanchaves/hatebr). Metric - F1-macro.

-The notebook used to run these evaluations is the [`lm-evaluation-harness-pt-br.ipynb`](./lm-evaluation-harness-pt-br.ipynb). Available on Colab. Full results are stored in the [results folder](./results/).
+The notebook used to run these evaluations is the [`lm-evaluation-harness-pt-br.ipynb`](./lm-evaluation-harness-pt-br.ipynb). Available on Colab.

<a href="https://colab.research.google.com/drive/1m6Oqey4P9ShYTO62yRq7wrM_eEsvFJ9D" target="_blank">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab">
3 changes: 1 addition & 2 deletions Evaluation/New-EVAL/lm-evaluation-harness-pt-br.ipynb
@@ -50,8 +50,7 @@
"!cd lm-evaluation-harness-pt && python lm_eval \\\n",
" --model huggingface \\\n",
" --model_args pretrained=\"nicholasKluge/TeenyTinyLlama-160m\",revision=\"main\" \\\n",
-" --tasks \"assin2_rte,assin2_sts,bluex,enem_challenge,faquad_nli,hatebr_offensive,oab_exams\" \\\n",
-" --num_fewshot \"15,15,3,3,15,25,3\" \\\n",
+" --tasks \"assin2_rte,assin2_sts,bluex,enem_challenge,faquad_nli,hatebr_offensive,oab_exams,portuguese_hate_speech,tweetsentbr\" \\\n",
" --batch_size \"auto\"\n",
" --device cuda:0 \\\n",
" --output_path \"./\""
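The notebook cell in the hunk above spreads its flags over several lines joined by trailing backslashes. The `--batch_size "auto"` line has no backslash, which would likely end the shell command before `--device` and `--output_path` are read. The following is a minimal Python sketch of one way to assemble such a command and check its continuations. The helpers `join_shell_lines` and `missing_continuations` are hypothetical and not part of the repository; the flag values are copied from the diff, and nothing here invokes the harness itself.

```python
# Hypothetical helpers (not from the repository): assemble a multi-line shell
# command and report lines that are missing a trailing backslash.

# Flag values copied from the notebook cell shown in the diff.
ARGS = [
    "python lm_eval",
    "--model huggingface",
    '--model_args pretrained="nicholasKluge/TeenyTinyLlama-160m",revision="main"',
    '--tasks "assin2_rte,assin2_sts,bluex,enem_challenge,faquad_nli,'
    'hatebr_offensive,oab_exams,portuguese_hate_speech,tweetsentbr"',
    '--batch_size "auto"',
    "--device cuda:0",
    '--output_path "./"',
]


def join_shell_lines(parts):
    """Join fragments so every line except the last ends with a backslash."""
    return " \\\n    ".join(parts)


def missing_continuations(command_text):
    """Return 1-based numbers of non-final lines lacking a trailing backslash."""
    lines = command_text.splitlines()
    return [
        number
        for number, line in enumerate(lines[:-1], start=1)
        if not line.rstrip().endswith("\\")
    ]


if __name__ == "__main__":
    command = join_shell_lines(ARGS)
    print(command)
    print("lines missing a continuation:", missing_continuations(command))
```

Building the command from a list keeps each flag on its own line and makes a dropped continuation easy to detect before the cell is run.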
69 changes: 0 additions & 69 deletions Evaluation/New-EVAL/results/Bloom-560m.md

This file was deleted.

69 changes: 0 additions & 69 deletions Evaluation/New-EVAL/results/GPT-2.md

This file was deleted.
