Injongo Dataset

A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages
language = [
    "amh", "ewe", "hau", "ibo", "kin", 
    "lin", "lug", "orm", "sna", "sot", 
    "swa", "twi", "wol", "xho", "yor", "zul"
]

Data Location

The Injongo dataset is available at Masakhane-NLU: Conversation AI and Benchmark datasets for African languages

Raw Data: data/output contains the raw dataset in CSV format, including the spans and logical_form columns.

Item Example:

split,domain,intent,text,spans,logical_form
test,banking,balance,በ አባይ ባንክ አካውንት ለሶፋ የሚሆን ገንዘብ አለኝ,"2:9:SL:BANK_NAME,17:19:SL:SHOPPING_ITEM",[IN:balance [SL:BANK_NAME አባይ ባንክ] [SL:SHOPPING_ITEM ሶፋ] ]
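The item example above can be read with Python's standard csv module. The sketch below is illustrative only: it assumes each entry in spans has the form start:end:SL:LABEL, with start and end being character offsets into text and end exclusive. That reading matches this one example but is not confirmed by the README. The helper name parse_spans is hypothetical and not part of the package.

```python
import csv
import io

# Header and row copied from the item example above.
SAMPLE = (
    "split,domain,intent,text,spans,logical_form\n"
    "test,banking,balance,በ አባይ ባንክ አካውንት ለሶፋ የሚሆን ገንዘብ አለኝ,"
    '"2:9:SL:BANK_NAME,17:19:SL:SHOPPING_ITEM",'
    "[IN:balance [SL:BANK_NAME አባይ ባንክ] [SL:SHOPPING_ITEM ሶፋ] ]\n"
)


def parse_spans(text, spans):
    """Return (label, surface_text) pairs for one row.

    Assumption: offsets are character indices with an exclusive end,
    so Python slicing text[start:end] recovers the slot value.
    """
    slots = []
    for span in spans.split(","):
        if not span:
            continue  # tolerate rows without any slots
        start, end, _kind, label = span.split(":", 3)
        slots.append((label, text[int(start):int(end)]))
    return slots


# The spans field is quoted in the CSV, so its inner comma is preserved.
row = next(csv.DictReader(io.StringIO(SAMPLE)))
slots = parse_spans(row["text"], row["spans"])
print(row["intent"], slots)
```

The extracted slot values agree with the ones written out in logical_form for this row, which is a quick sanity check worth running on other rows before relying on the offset convention.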

Package Install

pip install -e .

Additional Dependencies:

See the numbered Jupyter notebooks (*.ipynb) for more details on the code.

Environment Variables (.env file)

OPENAI_API_KEY=sk-proj-
GEMINI_API_KEY=ABCD
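
The README does not say how the code loads the .env file. As a minimal sketch of the KEY=VALUE format shown above, the following standard-library snippet parses such text and reports which of the two keys are empty. The function name parse_env is hypothetical, and the project itself may rely on a different loader (for example a dotenv library), which is an assumption rather than something stated here.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Placeholder values copied from the example above; replace with real keys.
EXAMPLE = "OPENAI_API_KEY=sk-proj-\nGEMINI_API_KEY=ABCD\n"

keys = parse_env(EXAMPLE)
required = ("OPENAI_API_KEY", "GEMINI_API_KEY")
missing = [name for name in required if not keys.get(name)]
print("missing keys:", missing)
```

To check a real file, pass its contents instead of EXAMPLE, for instance parse_env(open(".env", encoding="utf-8").read()).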