
Why batchnorm as final layer? #1

Open
Rasmuskh opened this issue Oct 29, 2021 · 2 comments

@Rasmuskh

Hi,
I noticed that you add a batchnorm layer as the final layer of your VGG-like network. Could you explain why this is necessary?

I am using your code to train a ResNet18 model with your BayesBiNN optimizer, and I noticed that adding batchnorm at the output layer is also necessary for this model in order to achieve good performance (with batchnorm at the output layer it performs very well).
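For concreteness, the modification I mean is roughly the following (just a sketch; the number of classes is illustrative):

```python
import torch.nn as nn
from torchvision.models import resnet18

# ResNet18 followed by an extra BatchNorm1d applied to the final logits
base = resnet18(num_classes=10)
model = nn.Sequential(base, nn.BatchNorm1d(10))
```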

@mengxiangming
Collaborator

Hi Rasmuskh,

Thank you for the question; this is an interesting observation.

For the VGG-like network, we simply use the same network structure as the following paper:
Alizadeh, Milad, et al. "An empirical study of binary neural networks' optimization." ICLR 2019.

The code of Alizadeh et al. can be found here.

We did not make a detailed analysis of the network structure and simply used the same one as Alizadeh et al. for ease of comparison. Intuitively, it might be that the final output value without normalization is not suitable for the loss function used, e.g., its absolute magnitude is too large due to the constraint of the binary weights. This can be checked by plotting the histogram of the output values without normalization and comparing it with that of the BN output. I hope this conjecture is helpful.
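For example, the check could be done roughly as follows (a sketch only; `model.final_bn` and `test_loader` are placeholders for the final BN layer and your evaluation data loader):

```python
import torch
import matplotlib.pyplot as plt

pre_bn, post_bn = [], []

def grab(module, inp, out):
    # inp[0] is the un-normalized output of the last layer,
    # out is the same activation after batch normalization
    pre_bn.append(inp[0].detach().flatten().cpu())
    post_bn.append(out.detach().flatten().cpu())

handle = model.final_bn.register_forward_hook(grab)

model.eval()
with torch.no_grad():
    for x, _ in test_loader:
        model(x)
handle.remove()

plt.hist(torch.cat(pre_bn).numpy(), bins=100, alpha=0.5, label="before BN")
plt.hist(torch.cat(post_bn).numpy(), bins=100, alpha=0.5, label="after BN")
plt.legend()
plt.xlabel("final-layer output value")
plt.show()
```

If the "before BN" histogram is much wider or shifted compared with the "after BN" one, that would support the conjecture above.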

Best regards,
Xiangming

@Rasmuskh
Author

Thank you,
That is very helpful :)
I will have a look at the Alizadeh paper.
