Release v1.3.0 · echogarden-project/echogarden

Enhancements

Accept language codes in multiple formats. Currently supported: ISO 639-1 codes, optionally with a region suffix (examples: es, es-MX), ISO 639-2 codes (example: spa), and full English language names (example: spanish)
Whisper: when no matching language is found, include the exact language identifier that was provided, to reduce confusion about language support
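
As a rough illustration of the language-code formats listed above, the following sketch maps ISO 639-1, ISO 639-2, and English-name inputs onto a single normalized form. It is not Echogarden's actual implementation: the `knownLanguages` table, the `LanguageEntry` type, and the `normalizeLanguageCode` function are hypothetical names invented for this example, and the table covers only three languages.

```typescript
// Hypothetical lookup table (assumption): a real one would cover many more languages.
interface LanguageEntry {
  iso639_1: string; // two-letter code, e.g. "es"
  iso639_2: string; // three-letter code, e.g. "spa"
  englishName: string; // lowercase English name, e.g. "spanish"
}

const knownLanguages: LanguageEntry[] = [
  { iso639_1: "en", iso639_2: "eng", englishName: "english" },
  { iso639_1: "es", iso639_2: "spa", englishName: "spanish" },
  { iso639_1: "de", iso639_2: "deu", englishName: "german" },
];

// Hypothetical helper: returns a two-letter code, keeping a region suffix if present.
function normalizeLanguageCode(input: string): string {
  const parts = input.trim().split(/[-_]/);
  const primary = (parts[0] ?? "").toLowerCase();
  const region = parts[1];

  const entry = knownLanguages.find(
    (e) =>
      e.iso639_1 === primary ||
      e.iso639_2 === primary ||
      e.englishName === primary
  );

  if (!entry) {
    // Echo the identifier exactly as the caller provided it.
    throw new Error(`Language '${input}' is not recognized or not supported`);
  }

  return region ? `${entry.iso639_1}-${region.toUpperCase()}` : entry.iso639_1;
}
```

With this table, `normalizeLanguageCode("spa")` and `normalizeLanguageCode("Spanish")` both return `"es"`, while `"es-MX"` is passed through with its region kept. The error message echoes the identifier exactly as it was given, in the spirit of the Whisper change above.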

Recognition / alignment / translation: timing for words that overlap non-speech regions is now truncated to the voice-activity region with which the word overlaps most. When processing runs on audio that has been cropped using VAD, an upcoming word can appear too early, or extend too far, before or after a non-speech region, making the timing inaccurate near region boundaries. To address this, uncropping the timeline now ensures that each word spans only a single active voice region (selected by maximum overlap), which prevents time ranges from being over-extended
Fix whisper.cpp speech-to-text translation not including word offsets
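
The word-timing truncation described above (restricting each word to the voice-activity region it overlaps most) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code used in the release: `TimeRange`, `overlapDuration`, and `clampWordToBestVoiceRegion` are hypothetical names, times are assumed to already be expressed on the uncropped timeline, and a word that overlaps no voice region is assumed to be left unchanged.

```typescript
// Hypothetical shape for both word entries and voice-activity regions (seconds).
interface TimeRange {
  startTime: number;
  endTime: number;
}

// Length of the intersection of two ranges, or 0 if they do not intersect.
function overlapDuration(a: TimeRange, b: TimeRange): number {
  const start = Math.max(a.startTime, b.startTime);
  const end = Math.min(a.endTime, b.endTime);
  return Math.max(0, end - start);
}

// Truncate a word so it lies within the single voice region it overlaps most.
function clampWordToBestVoiceRegion(
  word: TimeRange,
  voiceRegions: TimeRange[]
): TimeRange {
  let bestRegion: TimeRange | undefined;
  let bestOverlap = 0;

  for (const region of voiceRegions) {
    const overlap = overlapDuration(word, region);
    if (overlap > bestOverlap) {
      bestRegion = region;
      bestOverlap = overlap;
    }
  }

  // Assumption: a word with no overlap at all is returned unchanged.
  if (!bestRegion) {
    return { ...word };
  }

  return {
    startTime: Math.max(word.startTime, bestRegion.startTime),
    endTime: Math.min(word.endTime, bestRegion.endTime),
  };
}
```

For example, with voice regions `[0, 2.0]` and `[3.0, 5.0]`, a word timed at `[1.8, 3.4]` overlaps the second region more than the first, so it is truncated to `[3.0, 3.4]` instead of stretching across the silent gap.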

Remove pico and flite as default synthesis engines for some languages (pico is never actually selected, and flite uses WASI, which appears to have segmentation fault issues in Node versions 20, 21, and 22)

Full Changelog: v1.2.1...v1.3.0