Lacuna Fund-supported project to create benchmark datasets for AfricaNLI, AfricaMGSM, and AfricaMMLU, plus culturally relevant intent classification and slot-filling datasets.

masakhane-io/masakhane-nlu

Masakhane-NLU: Conversational AI and Benchmark Datasets for African Languages

This repository contains the code and data for the IrokoBench and InjongoIntent projects:

  • IrokoBench: a human-translated benchmark dataset for 16 typologically diverse, low-resource African languages covering three tasks: natural language inference (AfriXNLI), mathematical reasoning (AfriMGSM), and multiple-choice knowledge-based QA (AfriMMLU). The datasets are also available on Hugging Face.

  • InjongoIntent: a slot-filling and intent detection dataset for 16 African languages covering domains such as banking (e.g. pay bill), home (e.g. play music), kitchen and dining (e.g. confirm reservation), travel (e.g. plug type), and utility (e.g. make call). We collected 3,200 sentences per language. The dataset paper with full details will be released later.
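To make the slot-filling and intent detection task concrete, here is a minimal Python sketch of what a single annotated utterance could look like and how slot values can be read from it. The record layout, the BIO tagging scheme, the English utterance, and the names `IntentExample`, `extract_slots`, `genre`, and `room` are all illustrative assumptions; the actual InjongoIntent schema and label set may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class IntentExample:
    """Hypothetical record layout; not the official InjongoIntent schema."""
    tokens: List[str]      # utterance split into tokens
    slot_tags: List[str]   # one BIO tag per token (assumed tagging scheme)
    intent: str            # utterance-level intent label
    domain: str            # e.g. "home", "banking", "travel"


def extract_slots(example: IntentExample) -> List[Tuple[str, str]]:
    """Collect (slot_label, slot_text) pairs from BIO tags.

    An "I-" tag that does not continue an open slot of the same label
    is treated as outside any slot and ignored.
    """
    slots: List[Tuple[str, str]] = []
    label = None
    words: List[str] = []
    for token, tag in zip(example.tokens, example.slot_tags):
        if tag.startswith("B-"):
            # A new slot begins; close any slot that was still open.
            if label is not None:
                slots.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open slot.
            words.append(token)
        else:
            # "O" tag or a stray "I-" tag: close any open slot.
            if label is not None:
                slots.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        slots.append((label, " ".join(words)))
    return slots


# Illustrative English utterance in the "home" domain (made up for this sketch).
example = IntentExample(
    tokens=["play", "jazz", "music", "in", "the", "living", "room"],
    slot_tags=["O", "B-genre", "O", "O", "O", "B-room", "I-room"],
    intent="play_music",
    domain="home",
)

print(example.intent, extract_slots(example))
```

Under these assumptions, intent detection is a single label per utterance, while slot filling is a per-token labeling problem, which is why the two annotations are stored side by side.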

This project has been generously funded by Lacuna Fund.
