GitHub - levante-framework/levante_translations: Generation of audio assets

Levante Audio Tools

Our audio tools include two main utilities:

generate_speech.py: Generates audio files for one or more languages in the voices specified in config.py. The files are written in a directory layout that matches what the core assets and our GCP buckets expect.

dashboard.py: This standalone utility does four things:

Shows current audio generation stats in the top frame.
Shows all our current translations and audio, by language, in the bottom frame. Selecting an entry plays its audio.
For evaluation purposes, allows selecting a voice from PlayHt or ElevenLabs to play the same text.
Allows the addition of SSML tags to an edit box to evaluate their effect.

Installing Levante-Audio-Tools

Open a terminal and create a directory to use for your Levante projects.

Clone the repository (if you've already cloned it, run "git pull" instead):

git clone https://github.com/levante-framework/levante_translations.git

Change into the project folder:

cd levante_translations

For stable behavior, use the main branch:

git checkout main

Install all the needed packages (use pip3 if pip is not available):

pip install . --user

Add PlayHt credentials to your environment. (For the Levante team, the credentials are in Slack.) On Mac, edit ~/.zshrc and add:

export PLAY_DOT_HT_API_KEY=<API_KEY>
export PLAY_DOT_HT_USER_ID=<USER_ID>

Then exit the editor and run "source ~/.zshrc".

You may also need to install ffmpeg to hear some of the audio.

The dashboard should now run (use py if python is not available):

python dashboard.py
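If the dashboard does not start, one thing worth ruling out is that the PlayHt credentials are not visible to Python. The following is a minimal sketch of such a check. The two variable names come from the install steps above; the helper name missing_credentials is hypothetical and not part of this repo.

```python
import os

# Variable names taken from the install steps above.
REQUIRED_VARS = ("PLAY_DOT_HT_API_KEY", "PLAY_DOT_HT_USER_ID")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; it is not part of the repo.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("PlayHt credentials found.")
```

If a variable is reported missing right after editing ~/.zshrc, the current shell has probably not re-read the file yet.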

Generating Audio Files

Create or update item_bank_translations.csv with the translations you'd like to use.
Depending on the language you want, run one of: generate_english.[sh|bat], generate_spanish.[sh|bat], generate_german.[sh|bat].
By default, the generated audio files are written to the audio_files sub-directory, in the layout used by the asset repo and for serving.
Optionally push/merge the audio files to the asset repo, and/or sync them to the appropriate Google bucket using 'gsutil rsync -r '.

Code Flow:

Batch/Shell files call generate_speech.py with the appropriate language code and voice.

(Currently only PlayHt is supported. We'll add ElevenLabs if we decide to use any of their voices.)

generate_speech.py compares the desired text with its persistent cache of the text it has already generated audio for. If a string is new or changed, it is written to 'needed_item_bank_translation.csv'.

Items with no assigned task are skipped, as there is nowhere to file them.
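The comparison described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the logic in generate_speech.py: the column names ("item_id", "task", and one column per language code), the in-memory cache shape, and the function name find_needed are all hypothetical. Skipping rows with no text for the language is also an assumption.

```python
def find_needed(items, cache, lang):
    """Return the items whose text for `lang` is new or changed.

    items: list of dicts with hypothetical keys "item_id", "task",
           and one key per language code.
    cache: dict mapping (item_id, lang) -> text that already has audio.
    """
    needed = []
    for row in items:
        if not row.get("task"):
            continue  # no assigned task, so there is nowhere to file the audio
        text = row.get(lang)
        if not text:
            continue  # assumption: rows without text for this language are skipped
        if cache.get((row["item_id"], lang)) != text:
            needed.append(row)  # new string, or text changed since last run
    return needed


items = [
    {"item_id": "a1", "task": "vocab", "en": "Hello", "de": "Hallo"},
    {"item_id": "a2", "task": "vocab", "en": "Goodbye", "de": "Tschüss"},
    {"item_id": "a3", "task": "", "en": "Orphan", "de": "Waise"},
]
cache = {("a1", "de"): "Hallo", ("a2", "de"): "Auf Wiedersehen"}

needed = find_needed(items, cache, "de")
print([row["item_id"] for row in needed])  # prints ['a2']
```

Here a1 is unchanged, a2 has changed text, and a3 has no task, so only a2 would need new audio.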

The needed translations are passed to PlayHt/playHt_tts.py

The module iterates through the rows in the csv, requesting audio generation for each.

As needed, the module will wait for a status of completed.

It also restarts the request if it receives an error. Currently it will do that 5 times before giving up.
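The wait-and-retry behavior above could look something like the sketch below. It does not use the PlayHt API: request_audio and get_status are hypothetical callables standing in for whatever playHt_tts.py actually calls, the "error" status string is an assumption, and "5 times" is read here as five attempts in total.

```python
import time

MAX_ATTEMPTS = 5  # the text above says the request is restarted 5 times


def generate_with_retry(request_audio, get_status, text,
                        max_attempts=MAX_ATTEMPTS, poll_interval=1.0,
                        max_polls=60):
    """Request audio for `text`, poll until completed, retry on error.

    request_audio(text) -> job id         (hypothetical callable)
    get_status(job_id)  -> status string  (hypothetical callable)
    Returns the job id on success, or None once all attempts have failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            job_id = request_audio(text)
            for _ in range(max_polls):
                status = get_status(job_id)
                if status == "completed":
                    return job_id
                if status == "error":  # assumed status string
                    raise RuntimeError(f"job {job_id} reported an error")
                time.sleep(poll_interval)
            raise TimeoutError(f"job {job_id} did not complete in time")
        except Exception as exc:
            print(f"attempt {attempt}/{max_attempts} failed: {exc}")
    return None
```

Returning None instead of raising lets a caller move on to the next row and pick up the failed item on a later run.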

Error Handling

Errors have not been a problem for English and Spanish, but they do occur for German and French. There doesn't seem to be a pattern, so the batch/shell file sometimes has to be re-run. After two or three runs, audio has been generated for every item.

There is a helper script count_audio.[bat|sh] that counts the number of audio files generated for each language, as a sanity check.
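For reference, the same sanity check can be sketched in Python. This is not the contents of count_audio.[bat|sh]. It assumes each audio file sits somewhere beneath a folder named after its language code, and that files end in .mp3. The language codes and the function name are also assumptions.

```python
from pathlib import Path


def count_audio(root, lang_codes, extension=".mp3"):
    """Count audio files under `root`, grouped by language folder.

    A file is attributed to the first path component (below `root`)
    that matches one of `lang_codes`. The layout is an assumption.
    """
    root = Path(root)
    counts = {code: 0 for code in lang_codes}
    for path in root.rglob(f"*{extension}"):
        # Look only at directory names, not the file name itself.
        for part in path.relative_to(root).parts[:-1]:
            if part in counts:
                counts[part] += 1
                break
    return counts
```

Calling count_audio("audio_files", ["en", "es", "de"]) returns a dict of counts per code, which can be compared against the number of rows in the translations file.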

Resetting audio transcriptions

To change to a new voice, or if for some other reason you want to redo transcriptions for a specific language, simply set the appropriate language column to None. You can do this by importing the file as a DataFrame and then using a column operation. [At some point we should make this a function]
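A minimal sketch of that column operation with pandas follows. The function name reset_language is hypothetical, as is the file name in the usage comment, since the text above does not say which CSV holds the language columns.

```python
import pandas as pd


def reset_language(df, lang_column):
    """Return a copy of `df` with every value in `lang_column` set to None.

    Hypothetical helper; raises KeyError rather than silently creating
    a new column when the name is misspelled.
    """
    if lang_column not in df.columns:
        raise KeyError(f"no such column: {lang_column}")
    out = df.copy()
    out[lang_column] = None
    return out


# Usage (file name is a placeholder, not a path from this repo):
# df = pd.read_csv("cache.csv")
# reset_language(df, "de").to_csv("cache.csv", index=False)
```

Working on a copy keeps the original DataFrame available for comparison before anything is written back.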

