tts: add speaker file support #12048
base: master
ea8711d to bf3f5ee
@edwko Could you please have a look at this PR?
@ngxson @dm4 Looks good! Just a couple of thoughts: this would handle only v0.2, so it might make sense to do this more dynamically, perhaps by adding versioning logic similar to PR #11287. Maybe get version from
// Something like this:
// Read the speaker format version from the speaker file, defaulting to 0.2.
static double get_speaker_version(const json & speaker) {
    if (speaker.contains("version")) {
        return speaker["version"].get<double>();
    }
    // Also could get version from model itself
    // if (common_get_builtin_chat_template(model) == "outetts-0.3") {
    //     return 0.3;
    // }
    return 0.2;
}
// Join the speaker's words using the separator token for the detected version.
static std::string audio_text_from_speaker(const json & speaker) {
    std::string audio_text = "<|text_start|>";
    double version = get_speaker_version(speaker);
    if (version <= 0.3) {
        std::string separator = (version == 0.3) ? "<|space|>" : "<|text_sep|>";
        for (const auto & word : speaker["words"]) {
            audio_text += word["word"].get<std::string>() + separator;
        }
    }
    else if (version > 0.3) {
        // Future version support could be added here
    }
    return audio_text;
}
// static std::string audio_data_from_speaker(json speaker) would also need some adjustments to support different versions.
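To try the separator logic from the snippet above without the JSON library, here is a minimal dependency-free sketch. The `SpeakerSketch` struct, `version_is`, and `audio_text_from_speaker_sketch` are hypothetical names introduced only for illustration and are not part of the PR; the tolerance-based version comparison is likewise an assumption, swapped in for the exact `==` on doubles.

```cpp
#include <cmath>
#include <string>
#include <vector>

// Hypothetical stand-in for the parsed speaker JSON (not the PR's actual type).
struct SpeakerSketch {
    double version;                  // e.g. 0.2 or 0.3
    std::vector<std::string> words;  // the "word" fields, in order
};

// Compare versions with a small tolerance instead of exact floating-point equality.
static bool version_is(double version, double expected) {
    return std::fabs(version - expected) < 1e-6;
}

// Mirrors the suggested logic: versions up to 0.3 are handled, newer ones
// currently yield only the start token.
static std::string audio_text_from_speaker_sketch(const SpeakerSketch & speaker) {
    std::string audio_text = "<|text_start|>";
    if (speaker.version < 0.3 + 1e-6) {
        const std::string separator =
            version_is(speaker.version, 0.3) ? "<|space|>" : "<|text_sep|>";
        for (const auto & word : speaker.words) {
            audio_text += word + separator;
        }
    }
    return audio_text;
}
```

Under these assumptions, a v0.2 speaker with the words "hello" and "world" produces `<|text_start|>hello<|text_sep|>world<|text_sep|>`, while the same words with version 0.3 are joined by `<|space|>` instead.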
Signed-off-by: dm4 <[email protected]>
57c3835 to 888f57e
Hello @ngxson and @edwko, I have already added support for version 0.3. Since
888f57e to 986ade7
Adds the --tts-speaker-file option to specify the speaker file path.
Updates tts.cpp to load and parse speaker data, enhancing audio generation capabilities.