
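A project for training a large language model from scratch, covering pretraining, fine-tuning, and direct preference optimization (DPO). The model has 1B parameters and supports Chinese and English.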

Python 180 32 Updated Feb 1, 2025

F5-TTS inference acceleration, roughly 4x faster!

Python 44 4 Updated Jan 6, 2025

The official implementation of OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows

Jupyter Notebook 54 3 Updated Jan 10, 2025

TerDiT: Ternary Diffusion Models with Transformers

Python 68 3 Updated Jun 17, 2024

Scaling Diffusion Transformers with Mixture of Experts

Python 252 11 Updated Sep 9, 2024

🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

Python 1,894 115 Updated Feb 15, 2025

WeChat Channels video downloader

Go 942 137 Updated Feb 15, 2025

Alignment files of LibriTTS.

61 7 Updated Mar 16, 2020

Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization".

Python 132 6 Updated Jan 9, 2025

Minimal implementation of scalable rectified flow transformers, based on SD3's approach

Jupyter Notebook 478 44 Updated Jul 1, 2024

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching

Jupyter Notebook 639 58 Updated Jan 27, 2025

Building large language models that understand social etiquette | Covers tutorials on prompt engineering, RAG, agents, and LLM fine-tuning

Python 1,131 86 Updated Jan 18, 2025

LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation with Spoken Language Models" (arXiv 2024).

55 1 Updated Dec 28, 2024

SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.

Python 681 70 Updated Dec 11, 2024

A curated list of reinforcement learning with human feedback resources (continually updated)

3,706 230 Updated Jan 27, 2025

InspireMusic: A Unified Framework for Music, Song, Audio Generation.

Python 628 48 Updated Feb 14, 2025

phoneme tokenizer and grapheme-to-phoneme model for 8k languages

Python 153 15 Updated Jun 9, 2023

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 685 101 Updated Feb 15, 2025

A Python package for deep multilingual punctuation prediction.

Python 115 27 Updated Aug 21, 2024

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Python 2,749 166 Updated Jan 22, 2025

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Python 1,744 185 Updated Feb 4, 2025

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 867 111 Updated Feb 4, 2025

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Python 847 48 Updated Jan 25, 2025

Pushing the Limits of Zero-shot End-to-End Speech Translation

Python 25 3 Updated Dec 12, 2024

[TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation

Python 27 5 Updated Sep 6, 2024

A curated list of resources for using LLMs to develop more competitive grant applications.

Python 3,487 449 Updated Mar 1, 2024

Train a 1B LLM on 1T tokens from scratch as an individual

Python 525 59 Updated Feb 13, 2025

Implementing a small-parameter Chinese large language model from scratch.

Python 458 56 Updated Aug 22, 2024

A fast multimodal LLM for real-time voice

Python 3,492 242 Updated Feb 14, 2025

Pinyin data for Chinese words and phrases

Python 467 100 Updated Jan 12, 2025