LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing, released as part of "Long-Form Speech Generation with Spoken Language Models" (arXiv 2024).

55 1 Updated Dec 28, 2024

facebookresearch / SONAR

SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.

Python 681 70 Updated Dec 11, 2024

opendilab / awesome-RLHF

A curated list of reinforcement learning from human feedback (RLHF) resources (continually updated)

3,706 230 Updated Jan 27, 2025

FunAudioLLM / InspireMusic

InspireMusic: A Unified Framework for Music, Song, Audio Generation.

Python 628 48 Updated Feb 14, 2025

xinjli / transphone

phoneme tokenizer and grapheme-to-phoneme model for 8k languages

Python 153 15 Updated Jun 9, 2023

jitsi / jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 685 101 Updated Feb 15, 2025

oliverguhr / deepmultilingualpunctuation

A Python package for deep multilingual punctuation prediction.

Python 115 27 Updated Aug 21, 2024

InternLM / InternLM-XComposer

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Python 2,749 166 Updated Jan 22, 2025

jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Python 1,744 185 Updated Feb 4, 2025

shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 867 111 Updated Feb 4, 2025

segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Python 847 48 Updated Jan 25, 2025

mt-upc / ZeroSwot

Pushing the Limits of Zero-shot End-to-End Speech Translation

Python 25 3 Updated Dec 12, 2024

choijeongsoo / utut

[TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation

Python 27 5 Updated Sep 6, 2024

eseckel / ai-for-grant-writing

A curated list of resources for using LLMs to develop more competitive grant applications.

Python 3,487 449 Updated Mar 1, 2024

zhanshijinwat / Steel-LLM

Train a 1B-parameter LLM on 1T tokens from scratch as an individual

Python 525 59 Updated Feb 13, 2025

wdndev / tiny-llm-zh

Implement a small-parameter Chinese large language model from scratch.

Python 458 56 Updated Aug 22, 2024

fixie-ai / ultravox

A fast multimodal LLM for real-time voice

Python 3,492 242 Updated Feb 14, 2025

mozillazg / phrase-pinyin-data

Pinyin data for Chinese words and phrases

Python 467 100 Updated Jan 12, 2025

Zhikang Niu ZhikangNiu
