Is the generated sound visually aligned? #18

sukun1045 · 2021-11-09T17:46:43Z

Hi,

First of all, congrats on the really great work! While there are lots of audio examples, I haven't found any examples with videos, so it is hard to tell whether the generated sound is aligned with the visuals. Since you compared against RegNet, which claims to generate visually aligned sound from videos, I am curious whether this work can also achieve that. Thank you.

v-iashin · 2021-11-09T18:44:06Z

Maintainer

Hi. Thanks!

I think there is a little misunderstanding. We focused on generating visually relevant sounds, which does not imply alignment. We selected RegNet because it exhibits alignment and, therefore, relevance, making it state of the art for both. RegNet is a great idea and very relevant to our topic; leaving it out of the comparison would not be fair to its authors.

Regarding alignment with SpecVQGAN: in general, I think it does not support alignment, and I hope we never claimed that it does. Adding this ability to the model would make a great contribution, and I am looking forward to seeing it!

sukun1045 · 2021-11-09T18:56:12Z

Author

I see. No worries, I was just curious about that, and I am trying to identify my own project. Thanks a lot!
