[BUG] Mismatched pretrained model architectures in efficientnetb4.py #79
After further investigation, I'm finding a few import issues caused by strict state_dict() loading, due to differences in how the models are initialized. So far I've found issues in EfficientNet and I3D. Examples of the error are below:
Let me know if any of you have run into this issue and know a quick solution.
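To make the failure mode concrete, here is a minimal sketch (plain Python, no torch required) of the check that `load_state_dict(strict=True)` effectively performs: it raises if the checkpoint and the model disagree on any parameter name. The example key names are assumptions for illustration, not the actual keys in the DeepfakeBench checkpoints.

```python
def strict_load_check(model_keys, checkpoint_keys):
    """Return the two key sets torch reports in its strict-load RuntimeError."""
    missing = sorted(set(model_keys) - set(checkpoint_keys))       # model wants, checkpoint lacks
    unexpected = sorted(set(checkpoint_keys) - set(model_keys))    # checkpoint has, model lacks
    return missing, unexpected

# Hypothetical example: the detector checkpoint prefixes backbone parameters
# (e.g. 'backbone._fc.weight') while a freshly built ImageNet model expects
# bare names ('_fc.weight'), so every key lands in one of the two error lists.
model_keys = ['_conv_stem.weight', '_fc.weight', '_fc.bias']
ckpt_keys = ['backbone._conv_stem.weight', 'backbone._fc.weight', 'backbone._fc.bias']

missing, unexpected = strict_load_check(model_keys, ckpt_keys)
print(missing)     # every model key is "missing" from the checkpoint
print(unexpected)  # every checkpoint key is "unexpected" to the model
```

With strict=True, a single entry in either list aborts the load, which is why even a small wrapper-naming difference breaks the import.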
You just need to set the weights path in the function that loads the weights to None, and it will automatically download them from the internet. No other modifications are necessary.
Which function are you referring to?
Please change this path to None. You will then be able to load the pre-trained weights correctly, without a mismatch.
If I set that path to None, efficientnet-pytorch loads ImageNet-trained weights. How do I load the weights that are provided by DeepfakeBench?
The original wording in the DeepfakeBench README is: "To run the training code, you should first download the pretrained weights for the corresponding backbones (These pre-trained weights are from ImageNet). You can download them from [Link]. After downloading, you need to put all the weights files into the folder ./training/pretrained." They only provide the pre-trained weights from ImageNet.
There are pre-trained weights for the detectors located at this link, which is what I am trying to load (also linked at the top of the README). These weights do not match the architecture for ImageNet classification, but they do match the detector architecture. For example, they contain […]
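The architecture mismatch shows up even for keys whose names do match, because the classifier head has a different shape. A small sketch, using assumed shapes for illustration (EfficientNet-B4's head is 1792-dimensional, ImageNet has 1000 classes, the detector head has 2):

```python
# Assumed parameter shapes: ImageNet-trained head vs. the 2-class detector head.
imagenet_head = {'_fc.weight': (1000, 1792), '_fc.bias': (1000,)}
detector_head = {'_fc.weight': (2, 1792), '_fc.bias': (2,)}

def shape_conflicts(model_shapes, ckpt_shapes):
    """Keys present in both dicts whose shapes disagree (strict load fails on these)."""
    return [k for k in model_shapes
            if k in ckpt_shapes and model_shapes[k] != ckpt_shapes[k]]

print(shape_conflicts(imagenet_head, detector_head))  # ['_fc.weight', '_fc.bias']
```

So setting the weights path to None downloads an ImageNet model whose head cannot accept the 2-class detector weights; the head must be reshaped first.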
Sorry, I didn't see the weights before.
Hi @bendoesai, thanks for highlighting this problem.
The model seems to load as expected, with two output classes on the last layer. The same solution might also be applied to the I3D model by changing Step 2 a bit (though I haven't tested it).
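The build-then-reshape-then-load pattern described in this thread can be sketched as follows. This is a minimal illustration using a tiny stand-in module, not the actual DeepfakeBench code: in the real fix you would build the backbone with efficientnet_pytorch's `EfficientNet.from_name('efficientnet-b4')` (assuming that package is installed), and the checkpoint would come from the DeepfakeBench download rather than the zero tensors used here.

```python
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Stand-in for the EfficientNet-B4 backbone; '_fc' mirrors the
    classifier-head attribute name used by efficientnet_pytorch."""
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Linear(8, 16)
        self._fc = nn.Linear(16, num_classes)

    def forward(self, x):
        return self._fc(torch.relu(self.features(x)))

# Step 1: build the architecture untrained (mirrors EfficientNet.from_name).
model = TinyBackbone(num_classes=1000)

# Step 2: reshape the head to the detector's two output classes
# BEFORE loading, so the checkpoint shapes match.
model._fc = nn.Linear(model._fc.in_features, 2)

# Step 3: load the detector checkpoint; strict=False tolerates any extra or
# absent keys instead of raising. (Fabricated zero-weight checkpoint here.)
ckpt = {'features.weight': torch.zeros(16, 8), 'features.bias': torch.zeros(16),
        '_fc.weight': torch.zeros(2, 16), '_fc.bias': torch.zeros(2)}
missing, unexpected = model.load_state_dict(ckpt, strict=False)
print(missing, unexpected)  # -> [] [] : the reshaped model matches the checkpoint
```

The key point is the ordering: reshaping the head before calling load_state_dict() is what lets the 2-class detector weights load without a shape conflict.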
Hi all, I'm doing deepfake detection using the EfficientNetB4 architecture. I ran into an issue when I tried to load the pretrained weights provided.
After some digging, I found the source of the problem in the initializer of the EfficientNetB4 object (\training\networks\efficientnetb4.py). Line 32 is where the class invokes the efficientnet_pytorch package to generate the EfficientNetB4 architecture, either empty or pretrained on ImageNet. The problem is that EfficientNet trained on ImageNet doesn't match the architecture of the pretrained weights provided for deepfake detection. There are two possible ways I can think of to resolve this issue:
1. Let efficientnet_pytorch do all the work by passing weights_path, as well as in_channels and num_classes, to the from_pretrained() call. This would require retraining EfficientNetB4 so that the naming conventions of the saved weights match those of efficientnet_pytorch.
2. (What I did) Load the weights after reshaping the model. This option would eliminate the from_pretrained() call and instead load the model architecture (using from_name()), reshape it as needed, and then call load_state_dict() at the end of __init__(). This would allow the current set of pretrained weights to work within DeepfakeBench and allow more control over model output reshaping.
I can provide the code I am currently using, which I have validated on both the training and test scripts, if necessary. This is my first time submitting an issue on GitHub, so apologies if anything is missing.