Releases: ina-foss/inaSpeechSegmenter
interspeech23
final.onnx and raw81.pth are pretrained x-vector ResNet101 models, obtained from the VBx project (Brno University of Technology):
https://github.com/BUTSpeechFIT/VBx/tree/master/VBx/models/ResNet101_16kHz/nnet
For more details, see F. Landini, J. Profant, M. Diez, L. Burget: "Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks" (arXiv version).
interspeech2023_all.hdf5 and interspeech2023_cvfr.hdf5 are x-vector MLP gender classification models trained by @simonD3V. This work is described in a study submitted to Interspeech 2023; details will be added upon acceptance.
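As a rough aid for anyone downloading these assets by hand, the sketch below lists the four file names mentioned above and checks which of them are absent from a local directory. The helper names (`describe_assets`, `missing_assets`) are hypothetical and not part of inaSpeechSegmenter, and the format labels are inferred from the file extensions only, not verified against the files themselves.

```python
from pathlib import Path

# Asset names as listed in the release notes above.
RELEASE_ASSETS = [
    "final.onnx",
    "raw81.pth",
    "interspeech2023_all.hdf5",
    "interspeech2023_cvfr.hdf5",
]

# Assumption: the container format is guessed from the extension alone.
FORMAT_BY_SUFFIX = {
    ".onnx": "ONNX graph",
    ".pth": "PyTorch checkpoint",
    ".hdf5": "HDF5 file",
}


def describe_assets(names):
    """Map each asset name to a format label guessed from its extension."""
    return {
        name: FORMAT_BY_SUFFIX.get(Path(name).suffix.lower(), "unknown")
        for name in names
    }


def missing_assets(directory, names=RELEASE_ASSETS):
    """Return the asset names that are not present as files in `directory`."""
    root = Path(directory)
    return [name for name in names if not (root / name).is_file()]
```

For example, `missing_assets("models")` returns the names still to be downloaded into a hypothetical local `models` directory.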
models
Classification models used in inaSpeechSegmenter