mat1 and mat2 shapes cannot be multiplied (80x513 and 1x513) #162

Open
Huan-phonetic opened this issue Apr 11, 2024 · 2 comments

@Huan-phonetic

Dear authors,

This may be a novice question, but when I ran train.py with my own dataset (and also with the LJ dataset), I found that the y variable has only one frame, with shape (1, 513). Any idea why this happens? My audio files are all at least 2 s long.

Traceback (most recent call last):
File "F:\HiFiGAN\hifi-gan\train.py", line 271, in <module>
main()
File "F:\HiFiGAN\hifi-gan\train.py", line 267, in main
train(0, a, h)
File "F:\HiFiGAN\hifi-gan\train.py", line 113, in train
for i, batch in enumerate(train_loader):
File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\dataloader.py", line 633, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\dataloader.py", line 1345, in _next_data
return self._process_data(data)
^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\dataloader.py", line 1371, in _process_data
data.reraise()
File "G:\Conda\envs\pytorch\Lib\site-packages\torch\_utils.py", line 644, in reraise
raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\_utils\worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
^^^^^^^^^^^^^^^^^^^^
File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
~~~~~~~~~~~~^^^^^
File "F:\HiFiGAN\hifi-gan\meldataset.py", line 139, in __getitem__
mel = mel_spectrogram(audio, self.n_fft, self.num_mels,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\HiFiGAN\hifi-gan\meldataset.py", line 69, in mel_spectrogram
spec = torch.matmul(mel_basis[str(fmax)+'_'+str(y.device)], spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: mat1 and mat2 shapes cannot be multiplied (80x513 and 1x513)

@datouggg

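This is a version issue. In older versions, the tensor returned by torch.stft had a last dimension of size 2, i.e. it was a four-dimensional tensor. Current versions can return a complex tensor directly, which has three dimensions, so the magnitude can no longer be computed the way the author's code does it. You could try abs instead.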
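A minimal sketch of the shape change described above, run on dummy audio rather than on the repository's data. The value n_fft = 1024 is chosen only because it yields the 513 frequency bins seen in the error message; the hop length, the signal length, and center=False are illustrative assumptions, not values taken from the project's config.

```python
import torch

# Illustrative values (assumptions): n_fft = 1024 gives 1024 // 2 + 1 = 513 bins.
n_fft, hop_length = 1024, 256
y = torch.randn(1, 8192)            # dummy audio, shape (batch, samples)
window = torch.hann_window(n_fft)

# With return_complex=True the result is a 3-D complex tensor:
# (batch, frequency_bins, frames). There is no trailing real/imag axis.
spec = torch.stft(y, n_fft, hop_length=hop_length, win_length=n_fft,
                  window=window, center=False, return_complex=True)

# The old magnitude formula sums over the last axis. On a complex tensor
# that axis is the frame axis, so all frames are collapsed into one.
old_style = torch.sqrt(spec.pow(2).sum(-1) + 1e-9)

# Taking the absolute value keeps the frame axis intact.
magnitude = torch.abs(spec)

print(tuple(spec.shape))       # (1, 513, 29)
print(tuple(old_style.shape))  # (1, 513)
print(tuple(magnitude.shape))  # (1, 513, 29)
```

The (1, 513) result of the old formula is what then fails to multiply with the 80x513 mel filter bank.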

@chazarnik

chazarnik commented Jul 23, 2024


Hi, I found the error after fiddling around with the meldataset.py file. It's as @datouggg said, but it helps to pinpoint exactly where the issue is.

In the mel_spectrogram() function you need to change how the magnitude spectrogram is computed. The issue stems from using a newer version of torch, where torch.stft() requires return_complex=True to be set. The returned tensor then has shape (num_batches, frequency_bins, temporal_bins) and contains complex values, whereas previous versions returned the real and imaginary parts in a separate trailing dimension. As a result, the sum(-1) in the original line sums over the time frames instead of over the real and imaginary parts, which is why only a single (1, 513) frame is left.
Go to the line spec = torch.sqrt(spec.pow(2).sum(-1)+(1e-9)) and change it to spec = torch.abs(spec). This will solve the issue and you can run inference without problems.
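If the same code has to run on both old and new torch versions, one possible variant is sketched below. The helper name stft_magnitude is hypothetical and not part of the repository; the sketch assumes the input is either a complex tensor or a real tensor whose last dimension holds the real and imaginary parts. Unlike a bare torch.abs, it keeps the original 1e-9 epsilon.

```python
import torch


def stft_magnitude(spec: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Hypothetical helper: magnitude of an STFT in either output layout."""
    if torch.is_complex(spec):
        # Expose the complex values as a trailing (real, imag) axis of size 2,
        # which is the layout the original formula expects.
        spec = torch.view_as_real(spec)
    # Original formula: sqrt(real^2 + imag^2 + eps), summed over that axis.
    return torch.sqrt(spec.pow(2).sum(-1) + eps)


# Usage: both layouts of the value 3 + 4j give a magnitude close to 5.
complex_layout = torch.tensor([[[3.0 + 4.0j]]])   # shape (1, 1, 1), complex
real_layout = torch.tensor([[[[3.0, 4.0]]]])      # shape (1, 1, 1, 2), real
print(stft_magnitude(complex_layout))
print(stft_magnitude(real_layout))
```

Whether keeping the epsilon matters in practice depends on what is done with the magnitudes afterwards, so treat this as an option rather than a required change.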

Hope that helps and the issue can be closed.
