Releases · ina-foss/inaSpeechSegmenter · GitHub

30 Mar 13:58

DavidDoukhan

interspeech23 Latest

final.onnx and raw81.pth are pretrained x-vector ResNet101 models obtained from the VBx project (Brno University of Technology):
https://github.com/BUTSpeechFIT/VBx/tree/master/VBx/models/ResNet101_16kHz/nnet
For more details, see F. Landini, J. Profant, M. Diez, L. Burget, "Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks" (arXiv version).
interspeech2023_all.hdf5 and interspeech2023_cvfr.hdf5 are x-vector MLP gender classification models trained by @simonD3V. This work is described in a study submitted to Interspeech 2023; details will be added upon acceptance.

Contributors

simonD3V

13 Feb 13:46

DavidDoukhan

models

Classification models used by inaSpeechSegmenter.