mat1 and mat2 shapes cannot be multiplied (80x513 and 1x513) #162

Huan-phonetic · 2024-04-11T10:52:03Z

Dear authors,

This may be a novice question, but when I ran train.py with my own dataset (and also with the LJ dataset), I found that the y variable has only one frame, with shape (1, 513). Any idea why this happens? My audio clips are all at least 2 s long.

Traceback (most recent call last):
  File "F:\HiFiGAN\hifi-gan\train.py", line 271, in <module>
    main()
  File "F:\HiFiGAN\hifi-gan\train.py", line 267, in main
    train(0, a, h)
  File "F:\HiFiGAN\hifi-gan\train.py", line 113, in train
    for i, batch in enumerate(train_loader):
  File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\dataloader.py", line 633, in __next__
    data = self._next_data()
    ^^^^^^^^^^^^^^^^^
  File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\dataloader.py", line 1345, in _next_data
    return self._process_data(data)
    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\dataloader.py", line 1371, in _process_data
    data.reraise()
  File "G:\Conda\envs\pytorch\Lib\site-packages\torch\_utils.py", line 644, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\_utils\worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
    ^^^^^^^^^^^^^^^^^^^^
  File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\Conda\envs\pytorch\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
    ~~~~~~~~~~~~^^^^^
  File "F:\HiFiGAN\hifi-gan\meldataset.py", line 139, in __getitem__
    mel = mel_spectrogram(audio, self.n_fft, self.num_mels,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\HiFiGAN\hifi-gan\meldataset.py", line 69, in mel_spectrogram
    spec = torch.matmul(mel_basis[str(fmax)+'_'+str(y.device)], spec)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: mat1 and mat2 shapes cannot be multiplied (80x513 and 1x513)

datouggg · 2024-07-22T03:05:05Z

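This is a version issue. In older versions, torch.stft returned a tensor whose last dimension had size 2, i.e. a four-dimensional tensor. Current versions can return a complex tensor directly, which has three dimensions, so the magnitude can no longer be computed the way the author's code does it. You could try using abs instead.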
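
The shape difference described above can be checked with a short sketch. The signal here is random noise standing in for audio, and n_fft, hop and the other values are illustrative assumptions, not the repository's configuration. torch.view_as_real is used only to show what the older layout looked like.

```python
import torch

# Illustrative values only; not taken from the repository's config.
n_fft, hop = 1024, 256
y = torch.randn(1, 8192)            # (batch, samples), random stand-in for audio
window = torch.hann_window(n_fft)

# Current PyTorch: complex tensor of shape (batch, n_fft // 2 + 1, frames)
spec_complex = torch.stft(
    y, n_fft, hop_length=hop, win_length=n_fft,
    window=window, center=False, return_complex=True,
)

# Older layout: real and imaginary parts in a trailing dimension of size 2
spec_legacy = torch.view_as_real(spec_complex)

print(spec_complex.shape, spec_complex.dtype)  # 3 dimensions, complex dtype
print(spec_legacy.shape)                       # 4 dimensions, last one has size 2
```

Because the trailing size-2 dimension is gone in the complex output, any code that reduces over dim -1 to combine real and imaginary parts now reduces over the time frames instead.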

chazarnik · 2024-07-23T12:19:35Z

Hi, I found the error after fiddling around with the meldataset.py file. It's as @datouggg said but we need to clarify where the exact issue is.

In the mel_spectrogram() function you need to change how the magnitude spectrogram is computed. The issue stems from using a newer version of torch, which requires return_complex=True in torch.stft(). The returned tensor then has shape (num_batches, frequency_bins, temporal_bins) and holds complex values, whereas in previous versions the real and imaginary parts were stored in a separate trailing dimension. The old code's sum(-1) therefore sums over the time frames instead of over the real/imaginary pair, which is why only a single (1, 513) frame is left.
Go to the line spec = torch.sqrt(spec.pow(2).sum(-1)+(1e-9)) and change it to spec = torch.abs(spec). This will solve the issue and you can run inference without problems.
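
As a rough sketch of the change, the snippet below reproduces the collapsed shape and then applies the fix. The input is random noise, mel_basis is a random placeholder rather than a real mel filterbank, and all sizes are assumptions chosen to match the 80x513 shapes in the error message. The second variant, using torch.view_as_real, is an alternative not mentioned in the thread that keeps the original 1e-9 epsilon.

```python
import torch

# Illustrative values only; mel_basis is a random placeholder, not a real filterbank.
n_fft, hop, num_mels = 1024, 256, 80
y = torch.randn(1, 8192)
window = torch.hann_window(n_fft)
mel_basis = torch.rand(num_mels, n_fft // 2 + 1)   # (80, 513)

spec = torch.stft(
    y, n_fft, hop_length=hop, win_length=n_fft,
    window=window, center=False, return_complex=True,
)                                                  # (1, 513, frames), complex

# Old magnitude code applied to the complex tensor: sum(-1) sums over frames
collapsed = spec.pow(2).sum(-1)                    # (1, 513), the shape in the error

# Fix from this thread: take the complex magnitude directly
mag_abs = torch.abs(spec)                          # (1, 513, frames), real

# Alternative (assumption, not from the thread): restore the old layout and
# keep the original formula, including the 1e-9 epsilon
mag_eps = torch.sqrt(torch.view_as_real(spec).pow(2).sum(-1) + 1e-9)

mel = torch.matmul(mel_basis, mag_abs)             # (1, 80, frames)
print(collapsed.shape, mag_abs.shape, mel.shape)
```

The two magnitude variants differ only by the small epsilon under the square root, so either should work as a drop-in replacement for the original line.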

Hope that helps and the issue can be closed.
