This repository has been archived by the owner on Feb 9, 2023. It is now read-only.
My primary goal was slightly different: I just wanted to provide a good, open-source Polish ASR. I experimented with Mozilla DeepSpeech, Kaldi, etc. There are several attempts out there, but well... they are overcomplicated and too specific for further research. I decided to build this little package from scratch.
OK, and back to the question. To make this package more general, I had to adjust my aim and provide an English model. I plan to train a model from scratch, but for now I adapted the English model from the Seq2Seq repository (the NVIDIA documentation is here, the configuration file with detailed information is here, and my model adaptation file is here; it should be compatible with what we have in this repo).
I do not want to get stuck with CTC-based models. In the coming months, I will build the second version of this package, where I will introduce a Transformer-based English ASR (I am quite fascinated by NLP in general; check out my new repo: Aspect Based Sentiment Analysis).
PS. The presented result is for the greedy decoder. In my opinion, the sophisticated decoding algorithms are old-fashioned and crude... aren't they? ;)
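For context, greedy CTC decoding just takes the most likely token at each frame, collapses consecutive repeats, and drops blanks. Here is a minimal sketch (the function name and the assumption that index 0 is the CTC blank are mine, not part of this repo):

```python
import numpy as np

def ctc_greedy_decode(log_probs: np.ndarray, blank: int = 0) -> list:
    """Greedy (best-path) CTC decoding.

    Assumes `log_probs` has shape (time, vocab) and that `blank`
    is the index of the CTC blank token (0 here, by assumption).
    """
    # Pick the most likely token at every time step.
    best_path = np.argmax(log_probs, axis=1)
    decoded = []
    previous = blank
    for token in best_path:
        # Collapse repeated tokens and skip blanks.
        if token != previous and token != blank:
            decoded.append(int(token))
        previous = token
    return decoded
```

For example, a frame-wise best path of `[1, 1, blank, 2, 2]` collapses to `[1, 2]`. A beam-search decoder with a language model would rescore alternative paths instead, which is what the "sophisticated" algorithms above refer to.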
Hi @rolczynski
I am experimenting with your code and would like to know how to reproduce the benchmark results from the table.
Is it the pipeline from the README, with 25 epochs and a batch size of 32? How many GPUs did you use (4x8, I guess)?
The dataset should be the full LibriSpeech, right?
Was data augmentation used?
Does the code support decoding on the whole dev-clean subset?