Shared Tasks for Rapid, Efficient Analysis of Many Languages in Emerging Documentation.
Research @ University of Washington.
Contact Emily Ahn with questions: eahn [at] uw [dot] edu
- python 3
- dscore
- java JDK 1.7 or 1.8
Our data originates from the Endangered Languages Archive (ELAR).
Selected languages for this task span a wide range of language families and typological groups.
- Sakun
- Cicipu
- Effutu
- Create an online account profile (free here)
- Log in and download your cookies as a txt file (browser extensions can handle this well). Note: If you are downloading across different days (or different sessions), you may need to re-download your cookies.
- Run our script to curl (download) the data:
scripts/download_elar.py
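For readers curious how an exported cookies file can authenticate a download, here is a minimal sketch using only the Python standard library. It is not the code in scripts/download_elar.py; the function name `build_opener_from_cookies` is hypothetical, and it assumes the cookies were exported in the Netscape `cookies.txt` format (including the `# Netscape HTTP Cookie File` header line), which is what most browser extensions produce.

```python
import http.cookiejar
import urllib.request


def build_opener_from_cookies(cookie_path):
    """Return (opener, jar) for a Netscape-format cookies.txt file.

    Hypothetical helper for illustration; not part of this repository.
    """
    jar = http.cookiejar.MozillaCookieJar()
    # Keep session cookies and cookies marked as expired, since browser
    # exports often contain both.
    jar.load(cookie_path, ignore_discard=True, ignore_expires=True)
    # The opener attaches matching cookies to every request it makes.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


# Usage (the file name and URL are placeholders):
#   opener, jar = build_opener_from_cookies("cookies.txt")
#   with opener.open(file_url) as response:
#       data = response.read()
```

If loading fails with a `LoadError`, the file is probably missing the expected header line or was exported in a different format.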
- elar_{lang}_links.tsv (to be used by script when downloading files from ELAR)
- .uem file (Un-partitioned Evaluation Map--determines the regions to be analyzed in each recording; see description)
- ref/ (rttm files)
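To make the two annotation formats concrete, the sketch below parses one line of each. It assumes the standard layouts: an RTTM `SPEAKER` line has ten space-separated fields (type, file ID, channel, onset, duration, two `<NA>` placeholders, speaker name, two more `<NA>` placeholders), and a UEM line has four (file ID, channel, onset, offset), with times in seconds. The names `Turn`, `Region`, `parse_rttm_line`, and `parse_uem_line` are hypothetical and not part of this repository.

```python
from collections import namedtuple

# One speaker turn from an RTTM file (times in seconds).
Turn = namedtuple("Turn", "file_id onset duration speaker")
# One scored region from a UEM file (times in seconds).
Region = namedtuple("Region", "file_id onset offset")


def parse_rttm_line(line):
    """Parse a single RTTM SPEAKER line into a Turn."""
    fields = line.split()
    if len(fields) < 8 or fields[0] != "SPEAKER":
        raise ValueError(f"not an RTTM SPEAKER line: {line!r}")
    # fields: type, file, channel, onset, duration, <NA>, <NA>, speaker, ...
    return Turn(fields[1], float(fields[3]), float(fields[4]), fields[7])


def parse_uem_line(line):
    """Parse a single UEM line into a Region."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"not a UEM line: {line!r}")
    # fields: file, channel, onset, offset
    return Region(fields[0], float(fields[2]), float(fields[3]))


# Example lines with made-up values:
turn = parse_rttm_line("SPEAKER rec1 1 0.50 2.25 <NA> <NA> spk1 <NA> <NA>")
region = parse_uem_line("rec1 1 0.00 60.00")
```

Only speech that falls inside a UEM region for the same file is scored, so a turn whose span lies entirely outside every region does not affect the result.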
Who spoke when, and where else did they speak again? This task takes raw audio as input and attempts to detect speech and cluster groups of speech from the same speaker together under one label.
We use the lightweight system from LIUM that uses ILP clustering techniques.
Download the code from that repository and follow their installation instructions.
If you are using JDK 1.8, replace their jar file in the LIUM/ folder with the jar found in this repository: baseline/diar/lium-diarization-200129.jar (compiled on Jan 29, 2020). Then either rename it to match their jar's file name or change the jar name that their ilp_diarization2.sh script calls.
Instructions to compile this JDK 1.8 compatible version on your own machine are here.
We provide a script to convert LIUM output into rttm format: scripts/lium_to_rttm.py
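As a rough illustration of what such a conversion involves (this is not the code in scripts/lium_to_rttm.py), the sketch below converts one LIUM segmentation line to one RTTM line. It assumes the usual LIUM `.seg` layout: lines starting with `;;` are comments, and each segment line holds the show name, channel, start, length, gender, band, environment, and speaker label, with start and length counted in 10 ms frames. The function name `seg_line_to_rttm` and the `frames_per_second` parameter are hypothetical.

```python
def seg_line_to_rttm(line, frames_per_second=100):
    """Convert one LIUM .seg line to an RTTM SPEAKER line.

    Returns None for blank lines and ';;' comment lines.
    Assumes start and length are given in frames (10 ms by default).
    """
    line = line.strip()
    if not line or line.startswith(";;"):
        return None
    fields = line.split()
    if len(fields) < 8:
        raise ValueError(f"unexpected .seg line: {line!r}")
    show, channel, start, length = fields[0], fields[1], fields[2], fields[3]
    speaker = fields[7]
    # Convert frame counts to seconds.
    onset = int(start) / frames_per_second
    duration = int(length) / frames_per_second
    return (
        f"SPEAKER {show} {channel} {onset:.2f} {duration:.2f} "
        f"<NA> <NA> {speaker} <NA> <NA>"
    )


# Example with made-up values:
rttm_line = seg_line_to_rttm("rec1 1 150 325 M S U S0")
# rttm_line == "SPEAKER rec1 1 1.50 3.25 <NA> <NA> S0 <NA> <NA>"
```

If your LIUM output uses a different frame rate, adjust `frames_per_second` accordingly.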
Assuming your system output is stored as .rttm files in the folder data/{lang}/sys/, run dscore on this folder against the references in the data/{lang}/ref/ folder, and write the output to data/{lang}/{lang}_dev.stdout.
dscore/score.py -u data/{lang}/{lang}.uem -r data/{lang}/ref/*.rttm -s data/{lang}/sys/*.rttm > data/{lang}/{lang}_dev.stdout 2> ccp/{lang}_dev.stderr
Language | DER |
---|---|
Cicipu | 44.54 |
Effutu | 34.65 |
Sakun | 62.55 |
TODO: update numbers for only DEV set
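For reference, DER (diarization error rate) is conventionally the sum of missed speech, false alarm speech, and speaker confusion time, divided by the total reference speech time. The sketch below shows only that final arithmetic, expressed as a percentage; it does not reproduce how dscore aligns reference and system speakers, and the function name and input values are hypothetical.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_reference):
    """Return DER as a percentage, given error durations in seconds.

    missed:          reference speech the system labeled as non-speech
    false_alarm:     non-speech the system labeled as speech
    confusion:       speech attributed to the wrong speaker
    total_reference: total duration of reference speech
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return 100.0 * (missed + false_alarm + confusion) / total_reference


# Made-up durations, not results from this task:
example_der = diarization_error_rate(
    missed=10.0, false_alarm=5.0, confusion=15.0, total_reference=100.0
)
# example_der == 30.0
```

Because false alarms count toward the numerator but not the denominator, DER can exceed 100.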
The DIHARD Challenge I (2018, site) and Challenge II (2019, site, paper) have focused on robust speaker diarization. The baseline for the second challenge involves the Kaldi toolkit.
Possibly: Speaker Identification, Genre Identification