A LeapfrogAI API-compatible faster-whisper wrapper for audio transcription inferencing across CPU & GPU infrastructures.
See the LeapfrogAI documentation website for system requirements and dependencies.
- LeapfrogAI API for a fully RESTful application
See the Deployment section for the CTranslate2 command for pulling and converting a model for inferencing.
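As a sketch of that step, pulling and converting a Whisper checkpoint with CTranslate2's converter could look like the following; the model name and output directory here are illustrative, not prescribed by this repository:

```shell
# Convert a Hugging Face Whisper checkpoint into CTranslate2 format
# for faster-whisper. Requires: pip install ctranslate2 'transformers[torch]'
export MODEL_NAME=openai/whisper-base
ct2-transformers-converter \
  --model "$MODEL_NAME" \
  --output_dir .model \
  --copy_files tokenizer.json \
  --quantization float32
```

`--quantization` can be lowered (e.g. `int8`) to shrink the model for CPU-only inferencing, at some cost in accuracy.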
To build and deploy the whisper backend Zarf package into an existing UDS Kubernetes cluster:
**Important:** Execute the following commands from the root of the LeapfrogAI repository.
pip install 'ctranslate2' # Used to download and convert the model weights
pip install 'transformers[torch]' # Used to download and convert the model weights
make build-whisper LOCAL_VERSION=dev
uds zarf package deploy packages/whisper/zarf-package-whisper-*-dev.tar.zst --confirm
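After the deploy completes, a quick sanity check from the cluster side might look like this; the `leapfrogai` namespace is an assumption and may differ in your UDS bundle:

```shell
# Confirm the whisper package is registered with Zarf
uds zarf package list

# Confirm the backend pods came up (namespace is an assumption)
kubectl get pods -n leapfrogai
```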
To run the whisper backend locally without Kubernetes (starting from the root directory of the repository):
# Install dev and runtime dependencies
make install
# Download and convert model
# Change the MODEL_NAME to change the whisper base
export MODEL_NAME=openai/whisper-base
make download-model
# Start the model backend
make dev
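With the backend running, a transcription request can be exercised through the LeapfrogAI API's OpenAI-style audio endpoint. The host, port, endpoint path, and model name below are illustrative assumptions; check your local API configuration for the actual values:

```shell
# Send an audio file for transcription (port, path, and model name are assumptions)
curl -s http://localhost:8080/openai/v1/audio/transcriptions \
  -H "Authorization: Bearer $LFAI_API_KEY" \
  -F model=whisper \
  -F file=@sample.wav
```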