Reproduce features on TVSum and SumMe #66

mpalaourg · 2021-03-31T13:44:01Z

Hi,

I am trying to compute the features of each frame in the videos (on SumMe and TVSum). My features, when sampled every 15 frames, match the features provided here in dimension, but the values are different. I searched both the code and the other issues, and I found here that you mention preprocess(frame), but not the exact steps. I guess that is where we differ.

My preprocessing steps are:

Load the video with shape [frames, channels, height, width], using desired_fps=2 and desired_size=(224, 224).
Then apply this transformation:

   transform = transforms.Compose([
       transforms.Lambda(lambda x: x / 255),  # [0, 255] -> [0, 1]
       transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

and finally, after the forward pass through GoogLeNet, divide each vector by its norm to get a unit feature vector.
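The last two steps (keeping one feature every 15 frames and scaling each vector to unit length) can be sketched as follows. This is only an illustration under stated assumptions: the helper name `subsample_and_normalize` is hypothetical, sampling is assumed to start at frame 0, and the norm is assumed to be the L2 norm. It is not confirmed to be what the dataset authors did.

```python
import numpy as np


def subsample_and_normalize(features, step=15, eps=1e-12):
    """Keep every `step`-th row and scale each kept row to unit L2 norm.

    `features` is assumed to be a [num_frames, feature_dim] array;
    `eps` guards against division by zero for all-zero rows.
    """
    features = np.asarray(features, dtype=np.float64)
    sampled = features[::step]  # rows 0, step, 2 * step, ...
    norms = np.linalg.norm(sampled, axis=1, keepdims=True)
    return sampled / np.maximum(norms, eps)


# Usage with random stand-in data (300 frames, 1024-d features).
rng = np.random.default_rng(0)
frame_features = rng.standard_normal((300, 1024))
unit_features = subsample_and_normalize(frame_features)
print(unit_features.shape)
```

With 300 frames and a step of 15, the result has 20 rows, each with norm 1 up to floating-point error.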


HERIUN · 2022-12-26T08:52:48Z

Hi. I'm also trying to reproduce the features of the vsumm videos, but I failed.

My full processing steps are:

import cv2
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models import googlenet

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = googlenet(pretrained=True)
# Drop the final dropout and fc layers so the output is the 1024-d pooled vector.
extractor = torch.nn.Sequential(*list(model.children())[:-2]).to(device)
extractor.eval()  # inference mode: BatchNorm uses its running statistics

preprocess = transforms.Compose([  # https://pytorch.org/hub/pytorch_vision_googlenet/
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])

# `frame` is a single BGR frame as read by cv2.
im = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # BGR to RGB
im = Image.fromarray(im)  # cv2 array to PIL image
im = preprocess(im)
im = im.unsqueeze(0).to(device)  # shape should be (1, 3, 224, 224)

with torch.no_grad():
    feature = extractor(im).cpu().numpy().flatten()  # [1(N), 1024, 1, 1] -> [1024]

aosiddiqui · 2024-06-04T09:53:41Z

Are the features shared in the files eccv16_dataset_summe_google_pool5.h5 and eccv16_dataset_tvsum_google_pool5.h5 ResNet152 pool5 features?
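One quick sanity check, offered as a sketch rather than a confirmed answer: the two candidate networks produce pooled features of different sizes. GoogLeNet's final pooling layer yields 1024-dimensional vectors, while ResNet-152's yields 2048-dimensional ones, so the feature dimension alone can tell these two apart. The helper name `guess_backbone` is hypothetical, and the mapping only distinguishes these two candidates, since other networks share the same sizes.

```python
# Pooled feature sizes of the two candidate backbones.
POOL5_DIMS = {1024: "GoogLeNet", 2048: "ResNet-152"}


def guess_backbone(feature_dim):
    """Map a feature dimension to one of the two candidates, else 'unknown'."""
    return POOL5_DIMS.get(feature_dim, "unknown")


print(guess_backbone(1024))
```

The first two posts in this thread describe 1024-dimensional GoogLeNet features matching the provided ones in dimension, which is consistent with the "google" in the file names.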
