WeNet 3.0.0
❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤
New Features
- Support the full BEST-RQ implementation #1869, #2060
- Support GPU TLG streaming #1878
- Support streaming ASR web demo #1888
- Support k2 RNN-T loss and delay penalty #1909
- Support context biasing #1931, #1936
- Support ZeroPrompt (not merged) #1943
- Support onnxruntime on M1 Macs #1953
- Support ITN runtime #2001, #2042, #2246
- Support wav2vec2 #2034, #2035
- Support part of w2v-BERT training #2039
- WeNet CLI #2047, #2054, #2075, #2082, #2088, #2087, #2098, #2101, #2122 (!! simple and fast !!) 🛫 (usage sketch after this list)
- Support E-Branchformer module #2013
- Support DeepSpeed #1849, #2168, #2123 (!! big big big !!) 💯
- LoRA support (not merged) #2049
- Support batch decoding for ctc_prefix_beam_search & attention_rescoring #2059 (!! simple and fast !!) 🛫
- Support Ali Paraformer #2067, #2078, #2093, #2096, #2099, #2124, #2139, #2140, #2155, #2219, #2222, #2277, #2282, #2289, #2314, #2324
- Support contrastive learning for unified models #2100
- Support context biasing with an Aho-Corasick automaton #2128, #2136
- Support the Whisper architecture #2141, #2157, #2196, #2313, #2322, #2323
- Support gradient checkpointing for Conformer & Transformer (Whisper) #2173, #2275 (see the sketch after this list)
- SSH launcher for multi-node, multi-GPU training #2180, #2265
- U2++-lite training support #2202
- Support blank penalty #2278 (see the sketch after this list)
- Support speaker info in the dataset #2292
- Whisper inference support in the C++ runtime #2320
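For the new CLI, here is a minimal usage sketch of the Python API, assuming the `load_model`/`transcribe` interface shown in the project README; the model alias and audio path are placeholders:

```python
# Minimal sketch of the Python API behind the new CLI (assumes `pip install wenet`).
import wenet

# "chinese" is a pretrained-model alias from the README; swap in your own.
model = wenet.load_model('chinese')
result = model.transcribe('audio.wav')  # path to a 16 kHz mono wav
print(result['text'])
```

From the shell, the equivalent is roughly `wenet --language chinese audio.wav`.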
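Gradient checkpointing (#2173, #2275) trades compute for memory by recomputing encoder activations during the backward pass instead of storing them. A minimal sketch of the pattern using plain `torch.utils.checkpoint`; the `grad_ckpt` flag and the encoder wrapper here are illustrative, not WeNet's actual classes:

```python
import torch
from torch.utils.checkpoint import checkpoint

class CheckpointedEncoder(torch.nn.Module):
    """Illustrative encoder wrapper, not WeNet's actual class."""

    def __init__(self, num_layers: int = 4, dim: int = 256, grad_ckpt: bool = True):
        super().__init__()
        self.layers = torch.nn.ModuleList(
            torch.nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(num_layers))
        self.grad_ckpt = grad_ckpt

    def forward(self, xs: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            if self.grad_ckpt and self.training:
                # Recompute this layer's activations in backward to save memory.
                xs = checkpoint(layer, xs, use_reentrant=False)
            else:
                xs = layer(xs)
        return xs
```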
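The blank penalty (#2278) biases CTC search away from the blank token, which tends to reduce deletions. A sketch of the core idea; the function name and `blank_id=0` are assumptions for illustration, not WeNet's exact API:

```python
import torch

def apply_blank_penalty(ctc_logits: torch.Tensor,
                        blank_penalty: float,
                        blank_id: int = 0) -> torch.Tensor:
    """Subtract a constant from the blank logit before beam search.

    ctc_logits: (batch, time, vocab) raw outputs of the CTC head.
    A larger penalty makes the search emit more non-blank tokens.
    """
    logits = ctc_logits.clone()
    logits[:, :, blank_id] -= blank_penalty
    return logits

# Usage: penalized = apply_blank_penalty(logits, blank_penalty=2.0)
```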
What's Changed
- Upgrade the libtorch CPU runtime to an IPEX-enabled version #1893
- Refine CTC alignment #1966
- Use torchrun for distributed training #2020, #2021
- Refine training code #2055, #2103, #2123, #2248, #2252, #2253, #2270, #2286, #2288, #2312 (!! big changes !!) 🚀
- Move all CTC functions to ctc_utils.py #2057 (!! big changes !!) 🚀
- Move search methods to search.py #2056 (!! big changes !!) 🚀
- Move all k2-related functions to the k2 module #2058
- Refactor and simplify decoding methods #2061, #2062
- Unify the decoding results of all decoding methods #2063
- Refactor dataset to return a dict instead of a tuple #2106, #2111 (see the sketch after this list)
- init_model API changed #2116, #2216 (!! big changes !!) 🚀
- Move YAML saving to save_model() #2156
- Refine the tokenizer #2165, #2186 (!! big changes !!) 🚀
- Deprecate wenetruntime #2194 (!! big changes !!) 🚀
- Use pre-commit to auto-check and lint #2195
- Refactor YAML config: configure ctc/cmvn/tokenizer in train.yaml #2205, #2229, #2230, #2227, #2232 (!! big changes !!) 🚀
- Train with dict input #2242, #2243 (!! big changes !!) 🚀
- [dataset] Keep PCM for other tasks #2268
- Upgrade torch to 2.x #2301 (!! big changes !!) 🚀
- Log everything to TensorBoard #2307
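The dataset and training refactors (#2106, #2111, #2242, #2243) move batches from positional tuples to dicts, so new fields (speaker info, raw PCM, etc.) can be added without breaking every call site. A sketch of what a batch might look like; the exact field names are assumptions for illustration:

```python
import torch

# Illustrative batch after the dict refactor; field names are assumed, not exact.
batch = {
    'keys': ['utt_001', 'utt_002'],                    # utterance ids
    'feats': torch.randn(2, 100, 80),                  # (batch, frames, mel bins)
    'feats_lengths': torch.tensor([100, 73]),          # valid frames per utterance
    'target': torch.tensor([[5, 9, 2], [7, 3, -1]]),   # padded label ids (-1 = pad)
    'target_lengths': torch.tensor([3, 2]),
}

# Models and processors read named fields instead of unpacking a tuple,
# so adding e.g. batch['speaker'] (#2292) does not break existing code.
feats = batch['feats']
```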
Bug Fixes
- Fix NST recipe #1863
- Fix the LibriSpeech FST dict #1929
- Fix bug when making shard.list for *.flac files #1933
- Fix transducer bug #1940
- Avoid problems during model averaging when there is parameter tying #2113
- [loss] Set zero_infinity=True to ignore NaN or inf ctc_loss #2299
- Fix Android #2303
❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤
Many thanks to all the contributors!!!!! I love you all.