Hi Shiwei, @Steven-SWZhang
Thank you for publicly making available the great work you have done.
I have been trying to reproduce the results for the task "video inpainting using a sequence of masks". More specifically, I have a 10-frame video and 10 masks, one per frame. I would like to feed the video, the sequence of masks, and a text prompt to the model, and I expect to get a temporally consistent output video that adheres to both the sequence of masks and the text prompt.
However, I could not find any argument for an input mask. I went through the code, and as far as I understand, the code itself generates a random mask on the input video. The code below (inference_single.py) illustrates this:
in which the function "make_masked_images" is:
As far as I can tell, the "mask" variable in line 564 of the first snapshot is taken from the "batch" variable (which comes from the dataloader), as shown in the picture below:
When I went through the "dataset.py" code, I found that the mask is generated randomly, as follows:
So my understanding is that the code only conditions the model on this randomly generated mask. If that is correct, does it mean that we cannot feed an external sequence of masks to the model? If it is not correct, I would appreciate it if you could explain how to feed the sequence of masks to the model, as I could not find anything in the code.
Thank you in advance for your time.
Kind Regards,
Amir
I have tried to support a customized mask sequence by modifying the implementation of __getitem__ in VideoDataset.
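For reference, here is a minimal sketch of the tensor layout a custom mask sequence would probably need so that it can pass through make_masked_images as quoted in this thread. It assumes the video batch is laid out as [b, f, c, h, w] and that a mask value of 1 marks a pixel to hide, which is what the `1 - mask` term suggests. `stack_mask_sequence` is a hypothetical helper, not part of the repository, and how the result is wired into VideoDataset is not shown.

```python
import torch


def stack_mask_sequence(frame_masks):
    """Stack per-frame [h, w] masks into one [1, f, 1, h, w] float tensor.

    Hypothetical helper: any nonzero value marks a pixel to hide.
    """
    if not frame_masks:
        raise ValueError("need at least one mask")
    shape = frame_masks[0].shape
    for m in frame_masks:
        if m.shape != shape:
            raise ValueError("all masks must share one resolution")
    # Binarize each frame mask, then stack along a new frame axis: [f, h, w].
    masks = torch.stack([(m > 0).float() for m in frame_masks], dim=0)
    # Add a batch axis in front and a channel axis after the frame axis.
    return masks[None, :, None]


# Ten 4x4 masks that hide the top-left 2x2 corner of every frame.
frame_masks = []
for _ in range(10):
    m = torch.zeros(4, 4)
    m[:2, :2] = 1
    frame_masks.append(m)

masks = stack_mask_sequence(frame_masks)   # [1, 10, 1, 4, 4]
video = torch.rand(1, 10, 3, 4, 4)         # stand-in for a [b, f, c, h, w] batch

# Same per-sample operation as the make_masked_images loop quoted in this thread.
masked = torch.stack(
    [torch.cat([video[i] * (1 - m), (1 - m)], dim=1) for i, m in enumerate(masks)],
    dim=0,
)                                          # [1, 10, 4, 4, 4]
```

The extra fourth channel is the `1 - mask` map that the quoted function concatenates onto the masked frames.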
However, I noticed that make_masked_images behaves somewhat oddly.
def make_masked_images(imgs, masks):
    masked_imgs = []
    for i, mask in enumerate(masks):
        # concatenation
        masked_imgs.append(torch.cat([imgs[i] * (1 - mask), (1 - mask)], dim=1))
    return torch.stack(masked_imgs, dim=0)

# line 562-564
if 'mask' in cfg.video_compositions:
    masked_video = make_masked_images(misc_data.sub(0.5).div_(0.5), mask)
    masked_video = rearrange(masked_video, 'b f c h w -> b c f h w')
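It first normalizes the video sequence to $[-1,1]$, and then uses make_masked_images to set the masked pixels to $0$. In the normalized range, $0$ corresponds to mid-gray rather than black.
Shouldn't we normally multiply by the mask first and then normalize, so that masked pixels end up at $-1$? Is this by design, or is it a bug?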
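To make the difference concrete, here is a small scalar sketch of the two orderings. It assumes pixel values start in $[0,1]$ (which is what the `sub(0.5).div_(0.5)` normalization implies) and that a mask value of 1 marks a hidden pixel. The function names are only illustrative and do not come from the repository.

```python
def normalize(x):
    """Map a pixel value from [0, 1] to [-1, 1]."""
    return (x - 0.5) / 0.5


def normalize_then_mask(pixel, mask):
    """Ordering used in the quoted code: normalize first, then zero out."""
    return normalize(pixel) * (1 - mask)


def mask_then_normalize(pixel, mask):
    """Alternative ordering: zero out first, then normalize."""
    return normalize(pixel * (1 - mask))


pixel = 0.75
hidden_current = normalize_then_mask(pixel, 1)   # 0.0, mid-gray in [-1, 1]
hidden_alt = mask_then_normalize(pixel, 1)       # -1.0, black in [-1, 1]
visible_current = normalize_then_mask(pixel, 0)  # 0.5
visible_alt = mask_then_normalize(pixel, 0)      # 0.5
```

Visible pixels are identical under both orderings; only the fill value of the hidden region changes. Which fill value the model was trained with is not something this sketch can tell.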