I've been using WhisperX and noticed something odd about the logs. Sometimes the output includes percentage lines (for example 55%) while a run is in progress, and other times it doesn't. Here are two examples of the log output:
Example 1
No language specified, the model will first detect the language for each audio file (which slows down the inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.1.1. To make this upgrade permanent, run `python -m pytorch_lightning.utilities.upgrade_checkpoint ../root/.cache/torch/whisperx-vad-segmentation.bin`.
The model was trained with pyannote.audio 0.0.1, but yours is 3.0.1. If you don't revert pyannote.audio to 0.x, something might go wrong.
The model was trained with torch 1.10.0+cu102, but yours is 2.1.0+cu121. If you don't revert torch to 1.x, something might go wrong.
Detected language: en (1.00) in the first 30 seconds of the audio...
Example 2
100.0%
100.0%
100.0%
100.0%
100.0%
No language specified, the model will first detect the language for each audio file (which slows down the inference time).
The model was trained with pyannote.audio 0.0.1, but yours is 3.1.1. If you don't revert pyannote.audio to 0.x, something might go wrong.
The model was trained with torch 1.10.0+cu102, but yours is 2.1.0+cu121. If you don't revert torch to 1.x, something might go wrong.
It took 1876.43 milliseconds to load the model, 2004.21 milliseconds to load the audio, and 10330.61 milliseconds to transcribe it. It also took 12245.40 milliseconds to align the output.
The maximum amount of GPU memory allocated over runtime was 3.61 GB.
Key Observations
Example 1: No percentage updates during the transcription process
Example 2: Several lines of 100.0% logs appear before the final results are shown
Both examples indicate no language was specified and required detection, but their behaviors differ
Questions and Hypotheses
I'm curious to know:
Why do the logs sometimes show percentages (e.g., 100.0%) and sometimes not?
Steps to Reproduce
To investigate further:
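One way to compare runs, as a rough sketch: capture the full output of each run and separate the percentage-only lines from everything else, so it is easy to see how many there are and which message follows them. The function name `split_progress_lines` and the sample strings are hypothetical (the samples are shortened excerpts of the two logs above); nothing here calls WhisperX itself or says where the percentages come from.

```python
import re

# Matches a line that contains only a percentage, e.g. "100.0%" or "55%".
PERCENT_LINE = re.compile(r"^\s*\d+(?:\.\d+)?%\s*$")


def split_progress_lines(log_text):
    """Split a captured log into (percent_lines, other_lines).

    Blank lines are skipped; the original order is preserved in each list.
    """
    percent_lines, other_lines = [], []
    for line in log_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if PERCENT_LINE.match(stripped):
            percent_lines.append(stripped)
        else:
            other_lines.append(stripped)
    return percent_lines, other_lines


# Shortened excerpts of the two logs in this issue (assumed to have been
# captured as plain text, with stdout and stderr merged).
example_1 = (
    "No language specified, the model will first detect the language\n"
    "Detected language: en (1.00) in the first 30 seconds of the audio...\n"
)
example_2 = (
    "100.0%\n"
    "100.0%\n"
    "No language specified, the model will first detect the language\n"
)

for name, text in [("Example 1", example_1), ("Example 2", example_2)]:
    percents, others = split_progress_lines(text)
    print(f"{name}: {len(percents)} percentage line(s), "
          f"{len(others)} other line(s)")
```

Running this over the complete logs of several runs (for instance a fresh environment versus a warm one) should show whether the percentage lines only appear under particular conditions.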
Any help appreciated @victor-upmeet