Integrate Audiovisual SlowFast Networks into the repo #219
Hi @fanyix! Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention. You currently have a record in our system, but we do not have a signature on file. In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
@takatosp1 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!
@fanyix has updated the pull request. You must reimport the pull request before landing.
@fanyix has updated the pull request. You must reimport the pull request before landing.
Hi, there is a problem with video visualization. Audio is not provided, so the following error is printed: Traceback (most recent call last):
Hey @fanyix, thanks for committing the code to PySlowFast. Could you remind me whether you have trained the AV SlowFast model in the PySlowFast codebase, and what performance you got? Thanks,
Hi Haoqi, I haven't trained from scratch using PySlowFast; however, I did try converting the Caffe2 models and fine-tuning them with a short schedule in PySlowFast. I got a gap of around 0.5 relative to the number achieved in the Caffe2 codebase. It's possible that my fine-tuning schedule is not optimized, so it would be great if you could try it on your end (or, even better, train from scratch).
self.t_relu = nn.ReLU(inplace=self._inplace_relu)

# 1x1x3, BN, ReLU.
self.f = nn.Conv3d(
Hi, I am trying to re-implement AVSlowFast and am confused here.
In the paper (Table 1), dim_inner should be the same as planes, but the code here sets dim_inner to 2 * planes. That is, for res2, the entry [3×1, 1×3], 32 would have to be [3×1, 1×3], 64 to match the code.
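To make the width question concrete, here is a minimal sketch of the arithmetic behind the comment above. The helper `inner_width` and its `multiplier` argument are hypothetical names used only for illustration, not part of PySlowFast; the value 32 is the res2 entry quoted from Table 1 in the comment.

```python
def inner_width(planes: int, multiplier: int) -> int:
    """Channel width of the bottleneck's inner convolutions.

    Hypothetical helper for illustration only; not a PySlowFast function.
    """
    return planes * multiplier


# res2 audio width as quoted from Table 1 in the comment above.
res2_planes = 32

# Reviewer's reading of the paper: dim_inner == planes.
paper_width = inner_width(res2_planes, multiplier=1)

# Reviewer's reading of the code: dim_inner == 2 * planes.
code_width = inner_width(res2_planes, multiplier=2)

print(f"paper reading: {paper_width}, code reading: {code_width}")
```

Under these assumptions the two readings differ by a factor of two at res2, which is the mismatch the comment reports.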
dim_inner // cfg.SLOWFAST.AU_BETA_INV
],
temp_kernel_sizes=temp_kernel[1],
stride=[1] * 3,
Besides, the stride here means that the stride of res1 of the audio pathway is 1. However, the paper states: "Downsampling in time-frequency space is performed by stride 2^2 convolution in the center ("bottleneck") filter of the first residual block in each stage from res2 to res5." In the code, it seems that res2 is not included.
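The effect of the stride question can be sketched with the standard convolution output-size formula. This is an illustration under stated assumptions, not a trace of the actual model: the 128×80 time-frequency input size is hypothetical, the stem is ignored, a 3-wide kernel with padding 1 is assumed for the downsampling convolution, and the stride of 2 for res3 to res5 in the "code" schedule follows the reviewer's reading rather than a check of the code.

```python
def conv_out(size: int, stride: int, kernel: int = 3, padding: int = 1) -> int:
    """Output length of a convolution along one axis (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(input_tf, stage_strides):
    """Return the (time, frequency) size after each stage's strided conv."""
    t, f = input_tf
    sizes = []
    for stride in stage_strides:
        t, f = conv_out(t, stride), conv_out(f, stride)
        sizes.append((t, f))
    return sizes


input_tf = (128, 80)  # hypothetical spectrogram size, for illustration only

# Reviewer's reading of the paper: stride 2 in every stage, res2 to res5.
paper_sizes = trace_sizes(input_tf, [2, 2, 2, 2])

# Reviewer's reading of the code: res2 keeps stride 1 (assumed 2 afterwards).
code_sizes = trace_sizes(input_tf, [1, 2, 2, 2])

for name, sizes in (("paper", paper_sizes), ("code", code_sizes)):
    print(name, sizes)
```

With these assumptions, the final feature map is twice as large along each axis when res2 keeps stride 1, so the two readings are not interchangeable when matching the paper's compute and output shapes.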
@theschnitz has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@theschnitz has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Please see projects/avslowfast/README.md for a starter for training and evaluation with an AVSlowFast 4x16 R50 model.