Spoken Language Identification from Short Utterances

This is a model for identifying the language spoken in a short audio segment.

Installation

To install the required libraries (tested on Ubuntu 17.10), run:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Predicting the language in an audio file

Convert an audio file to a spectrogram:

 python data/dataset_gen.py -z speech.wav -o .

Obtain the prediction using a pre-trained model:

 python main.py --model-dir your-trained-model/ --params your-trained-model/params.json --model combo --predict speech.png
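The two steps above can be chained from a small script. The following is a minimal sketch, not part of the project: the helper names `build_commands` and `predict_language` are hypothetical, and it assumes that `data/dataset_gen.py` names the spectrogram after the input file (as the `speech.wav` to `speech.png` example suggests) and that the script is run from the repository root with the virtual environment active.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(wav_path, model_dir, out_dir="."):
    """Return the two command lines from the steps above as argument lists."""
    wav = Path(wav_path)
    model = Path(model_dir)
    # Assumption: the spectrogram is written to out_dir and keeps the
    # base name of the audio file, e.g. speech.wav -> speech.png.
    png = Path(out_dir) / (wav.stem + ".png")
    convert = [
        sys.executable, "data/dataset_gen.py",
        "-z", str(wav),
        "-o", str(out_dir),
    ]
    predict = [
        sys.executable, "main.py",
        "--model-dir", str(model),
        "--params", str(model / "params.json"),
        "--model", "combo",
        "--predict", str(png),
    ]
    return convert, predict


def predict_language(wav_path, model_dir, out_dir="."):
    """Run both steps in order; raises CalledProcessError if either fails."""
    for cmd in build_commands(wav_path, model_dir, out_dir):
        subprocess.run(cmd, check=True)
```

Calling `predict_language("speech.wav", "your-trained-model/")` would run the same two commands as above; only the flags shown in this README are used.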

Training the model from scratch

Prepare a dataset:
- Place your spectrograms in a folder
- Create a training set CSV file containing "Filename,Language" pairs
- Create an evaluation set CSV file (same format as the training set)
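The CSV files described above could be generated with a short script like the one below. This is only a sketch under stated assumptions: the naming scheme `<language>_<anything>.png` and the helpers `language_from_name` and `write_split` are hypothetical, the file names `train-set.csv` and `eval-set.csv` are borrowed from the training command in this README, and whether `main.py` expects a header row or bare file names is not stated here, so both are left easy to change.

```python
import csv
import random
from pathlib import Path


def language_from_name(png_path):
    # Hypothetical naming scheme: "<language>_<anything>.png", e.g. "en_0001.png".
    return png_path.stem.split("_", 1)[0]


def write_split(image_dir, eval_fraction=0.1, seed=0, header=False):
    """Write train-set.csv and eval-set.csv with Filename,Language rows."""
    image_dir = Path(image_dir)
    files = sorted(image_dir.glob("*.png"))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(files)
    n_eval = int(len(files) * eval_fraction)
    splits = {"eval-set.csv": files[:n_eval], "train-set.csv": files[n_eval:]}
    for name, subset in splits.items():
        with open(image_dir / name, "w", newline="") as f:
            writer = csv.writer(f)
            if header:
                # Assumption: enable only if the loader expects a header row.
                writer.writerow(["Filename", "Language"])
            for png in subset:
                writer.writerow([png.name, language_from_name(png)])
    return {name: len(subset) for name, subset in splits.items()}
```

For example, `write_split("your-data/", eval_fraction=0.1)` would place both CSV files next to the spectrograms in `your-data/`.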

Train the model:

 python main.py --model-dir your-trained-model/ --params your-trained-model/params.json --model combo --image-dir your-data/ --train-set your-data/train-set.csv --eval-set your-data/eval-set.csv

Author

This project was developed by Rimvydas Naktinis during Pi School's AI programme in Fall 2017.
