I get a padding error and I've tried reducing the audio length to a few seconds to no avail #600

MotorCityCobra · 2024-05-23T02:31:20Z

🐛 Bug Report

I've tried to make a script that is as simple as possible to isolate vocals.

To Reproduce

import torch
import torchaudio
from demucs.audio import AudioFile, save_audio
from demucs.apply import apply_model
from demucs.pretrained import get_model

def separate_vocals(track_path, output_path):
    # Load the pretrained model
    model = get_model('955717e8')
    model.eval()
    if torch.cuda.is_available():
        model.to('cuda')
    
    # Load audio
    audio = AudioFile(track_path).read(streams=0, samplerate=model.samplerate, channels=model.audio_channels)
    
    # Normalize audio
    mean = audio.mean(0, keepdim=True)
    std = audio.std(0, keepdim=True)
    audio = (audio - mean) / std
    
    # Apply the model
    with torch.no_grad():
        sources = apply_model(model, audio[None].cuda(), shifts=0)
    
    # Rescale back the output
    sources = sources * std.cuda() + mean.cuda()
    
    # Save only the vocals
    save_audio(sources[0][0], output_path, samplerate=model.samplerate)  # Assuming the first source is vocals

# Example usage
track_path = 'C:/Users/ooo/tor/rvc-data-prep/k_isolate/input/k_and_j.mp3'
output_path = 'C:/Users/ooo/tor/rvc-data-prep/k_isolate/output/vocals.wav'
separate_vocals(track_path, output_path)
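
A side note on the last step of the script: its comment assumes that index 0 is the vocals stem. Below is a minimal sketch of looking the stem up by name instead. The list is a stand-in for `model.sources`, and the order shown is an assumption about the four-stem pretrained models, so it should be checked against the loaded model. `stem_index` is a hypothetical helper, not part of demucs.

```python
def stem_index(source_names, wanted="vocals"):
    """Return the position of `wanted` in a model's list of stem names."""
    names = list(source_names)
    if wanted not in names:
        raise ValueError(f"{wanted!r} not found in {names}")
    return names.index(wanted)


# Stand-in for model.sources (assumed order, verify on the real model).
assumed_sources = ["drums", "bass", "other", "vocals"]
vocals_idx = stem_index(assumed_sources)
print(vocals_idx)
```

With a real model this would presumably be used as `sources[0][stem_index(model.sources)]`, where the first index selects the single batch item.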

Expected behavior

I expect vocals.wav to contain just the vocals from the original audio, or at least for some file to be written.

Actual Behavior

No file is written because the script fails with this error:

(iso_vocals) C:\Users\ooo\tor\rvc-data-prep>python iso_simple.py
Traceback (most recent call last):
  File "C:\Users\ooo\tor\rvc-data-prep\iso_simple.py", line 35, in <module>
    separate_vocals(track_path, output_path)
  File "C:\Users\ooo\tor\rvc-data-prep\iso_simple.py", line 24, in separate_vocals
    sources = apply_model(model, audio[None].cuda(), shifts=0)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ooo\tor\iso_vocals\Lib\site-packages\demucs\apply.py", line 250, in apply_model
    chunk_out = future.result()
                ^^^^^^^^^^^^^^^
  File "C:\Users\ooo\tor\iso_vocals\Lib\site-packages\demucs\utils.py", line 129, in result
    return self.func(*self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ooo\tor\iso_vocals\Lib\site-packages\demucs\apply.py", line 271, in apply_model
    out = model(padded_mix)
          ^^^^^^^^^^^^^^^^^
  File "C:\Users\ooo\tor\iso_vocals\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ooo\tor\iso_vocals\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ooo\tor\iso_vocals\Lib\site-packages\demucs\htdemucs.py", line 538, in forward
    z = self._spec(mix)
        ^^^^^^^^^^^^^^^
  File "C:\Users\ooo\tor\iso_vocals\Lib\site-packages\demucs\htdemucs.py", line 435, in _spec
    x = pad1d(x, (pad, pad + le * hl - x.shape[-1]), mode="reflect")
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ooo\tor\iso_vocals\Lib\site-packages\demucs\hdemucs.py", line 39, in pad1d
    assert (out[..., padding_left: padding_left + length] == x0).all()
AssertionError
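
The assertion at the bottom of the traceback compares the unpadded middle of the padded tensor with the original input. One input condition that can make such a comparison fail regardless of padding is a NaN in the input, since NaN never compares equal to itself. In the script above, `audio.std(0, keepdim=True)` is taken across the channel dimension for each sample, so it is zero wherever the left and right channels are identical, and the division would then produce NaN. Whether that happened here is not confirmed in this thread. The sketch below illustrates the effect in plain Python, with zero padding standing in for the reflect padding; both function names are hypothetical.

```python
import math


def center_matches_input(samples, pad_left, pad_right):
    """Zero-pad, then compare the unpadded middle against the input,
    in the same spirit as the assertion shown in the traceback."""
    padded = [0.0] * pad_left + list(samples) + [0.0] * pad_right
    center = padded[pad_left:pad_left + len(samples)]
    return all(a == b for a, b in zip(center, samples))


def count_non_finite(samples):
    """Number of NaN or infinite values in the input."""
    return sum(1 for s in samples if not math.isfinite(s))


clean = [0.1, -0.2, 0.3]
with_nan = [0.1, math.nan, 0.3]

print(center_matches_input(clean, 2, 2))     # True
print(center_matches_input(with_nan, 2, 2))  # False, because NaN != NaN
print(count_non_finite(with_nan))            # 1
```

On the real tensor, a check such as `torch.isfinite(audio).all()` right before `apply_model` would show whether the normalization step introduced non-finite values.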

Your Environment

PyTorch on CUDA 12.4, with torch.cuda.is_available() returning True
The audio file is an MP3: 44100 Hz, 320 kbps, 2 channels

torch.__version__
'2.4.0.dev20240521+cu124'

CarlGao4 · 2024-05-23T03:05:29Z

Please use torch < 2.2
The latest version you can use is 2.1.2
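
A small sketch of checking an installed version string against that constraint before running the model. It only parses the leading major and minor numbers, and `torch_version_is_supported` is a hypothetical helper; the `(2, 1)` ceiling is taken from the advice above.

```python
import re


def torch_version_is_supported(version, max_supported=(2, 1)):
    """Return True if the major.minor part of `version` is at most `max_supported`."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) <= max_supported


print(torch_version_is_supported("2.4.0.dev20240521+cu124"))  # False
print(torch_version_is_supported("2.1.2"))                    # True
```

In a real environment the argument would be `torch.__version__`.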

MotorCityCobra · 2024-05-23T13:57:42Z

My Torch version, which I should have included in the original post:


>>> import torch
>>> torch.__version__
'2.4.0.dev20240521+cu124'
>>>

Somehow I was able to get it to work with the older model, but only by calling the module from the command line.

python -m demucs.separate -n mdx_extra_q c:/path/to/my/audio.mp3

But I think this is using the same version of torch. It has to be.
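
For anyone who wants to drive that working command from a script, here is a minimal sketch. It uses only the `-n` flag shown above, and it launches the module with `sys.executable` so the call runs in the same interpreter and environment (and therefore the same torch install) as the calling script. Both function names are hypothetical.

```python
import subprocess
import sys


def build_separate_command(audio_path, model_name="mdx_extra_q"):
    """Build the argument list for `python -m demucs.separate -n <model> <file>`."""
    return [sys.executable, "-m", "demucs.separate", "-n", model_name, str(audio_path)]


def run_separation(audio_path, model_name="mdx_extra_q"):
    """Run the separation and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_separate_command(audio_path, model_name), check=True)


# Only prints the command; call run_separation(...) to actually execute it.
print(build_separate_command("c:/path/to/my/audio.mp3"))
```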

MotorCityCobra added the bug Something isn't working label May 23, 2024
