Thank you for sharing the code!

I have been analyzing both the PyTorch and TensorFlow repos, along with the paper, for a while, and I have come to my own conclusions (which could be wrong). I hope this helps surface potential issues (or, conversely, that you can shed light on things I misunderstood).
I have some concerns about the TensorFlow implementation from a theoretical perspective, and I would like to point out the differences between the two repos as well as implementation issues in the PyTorch code:
Theoretical questions:
The loss for the auxiliary classifiers (the CAM loss) is weighted very heavily compared to the other losses (~100x). As a result, the shared encoder of the generator essentially produces a CAM (class activation map), since it is dominated by the classification task. I understand that it is more efficient to share the encoder between the two tasks (generation, and localization of the most prominent differences between the two domains), but this can lead to a situation where the only information the generator has access to is a probability map of domain differences, with little color/texture information from the source image. As a result, the generator knows where and how much to adjust, but has less of a clue about how to adjust.
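To make the scale concrete, here is a minimal sketch of the weighted generator objective. The per-term losses are stand-in scalars, and the weight values are the repo defaults as I read them, so they may be off:

```python
import torch

# Stand-in scalars for the individual loss terms; weight values are my
# reading of the repos' default hyper-parameters and may be inaccurate.
loss_adv, loss_cycle, loss_identity, loss_cam = (torch.rand(()) for _ in range(4))

adv_weight, cycle_weight, identity_weight, cam_weight = 1, 10, 10, 1000
loss_G = (adv_weight * loss_adv
          + cycle_weight * loss_cycle
          + identity_weight * loss_identity
          + cam_weight * loss_cam)  # the CAM term is ~100x the cycle/identity terms
```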
A single output is used in each auxiliary classifier (actually two, one from global max pooling and one from global average pooling, but both predict the probability of the input being a source image). As a result, the CAM loss forces the heat map to be inverted for target images: the fc layers learn negative weights so that a target image produces large negative logits, i.e. a predicted source probability of approximately 0, as expected. I assume the pointwise (1x1) convolutional layer right afterwards is meant to counteract this, i.e. to flip back the maps that detect target features differing from the source. Wouldn't it be easier to use separate logits for source and target images (four in total instead of two)?
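For reference, here is a condensed sketch of how I read the auxiliary-classifier head in the PyTorch repo; the class and variable names are mine and simplified, so treat it as an illustration rather than the exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CAMHead(nn.Module):
    """Condensed sketch of the auxiliary-classifier head (names simplified)."""
    def __init__(self, channels):
        super().__init__()
        self.gap_fc = nn.Linear(channels, 1, bias=False)  # one logit: P(source)
        self.gmp_fc = nn.Linear(channels, 1, bias=False)  # one logit: P(source)
        self.conv1x1 = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):
        # Average-pooling branch: the fc weights re-weight the feature maps
        gap_logit = self.gap_fc(F.adaptive_avg_pool2d(x, 1).flatten(1))
        gap = x * self.gap_fc.weight.unsqueeze(2).unsqueeze(3)

        # Max-pooling branch, same idea
        gmp_logit = self.gmp_fc(F.adaptive_max_pool2d(x, 1).flatten(1))
        gmp = x * self.gmp_fc.weight.unsqueeze(2).unsqueeze(3)

        cam_logit = torch.cat([gap_logit, gmp_logit], dim=1)
        # The pointwise conv can flip back maps whose fc weights went negative
        out = F.relu(self.conv1x1(torch.cat([gap, gmp], dim=1)))
        return out, cam_logit

out, cam_logit = CAMHead(64)(torch.randn(2, 64, 32, 32))  # out: (2,64,32,32), cam_logit: (2,2)
```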
Repo differences/potential implementation issues:
In the PyTorch code there are no biases in the fully connected layers or in the convolutional layers, but since they are followed by instance normalization this is fine (except in the case of layer-instance normalization). Conversely, in the TensorFlow code, most of the biases could be removed from convolutional layers that are followed by instance normalization (which would reduce model size).
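A quick way to convince yourself that the bias is redundant here: with (non-affine) instance normalization right after a convolution, the conv bias is a per-channel constant that the mean subtraction removes, so zeroing it leaves the output unchanged. A minimal sketch:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)
conv = nn.Conv2d(3, 16, 3, padding=1, bias=True)
norm = nn.InstanceNorm2d(16)  # non-affine: subtracts the per-channel mean

y_with_bias = norm(conv(x))
with torch.no_grad():
    conv.bias.zero_()  # remove the bias...
y_without_bias = norm(conv(x))

print(torch.allclose(y_with_bias, y_without_bias, atol=1e-5))  # True
```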
The requires_grad flag is not set to False, nor is the generated image tensor detached, in the PyTorch code before backpropagating the adversarial loss of the discriminators. Although separate optimizers are used for the generator and the discriminators (so the generator is not updated), gradients are still computed for the generator, which is inefficient.
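A sketch of the detach pattern I mean; the tiny linear G and D below are placeholders for the repo's actual models, kept minimal so the snippet runs on its own:

```python
import torch
import torch.nn as nn

# Minimal stand-ins for the repo's generator and discriminator
G = nn.Linear(8, 8)
D = nn.Linear(8, 1)
opt_G = torch.optim.Adam(G.parameters())
opt_D = torch.optim.Adam(D.parameters())
bce = nn.BCEWithLogitsLoss()

real, z = torch.randn(4, 8), torch.randn(4, 8)
fake = G(z)

# --- Discriminator update ---
# detach() cuts the autograd graph at the generator output, so backward()
# no longer computes (unused) gradients for G's parameters.
loss_D = bce(D(real), torch.ones(4, 1)) + bce(D(fake.detach()), torch.zeros(4, 1))
opt_D.zero_grad()
loss_D.backward()
opt_D.step()

# --- Generator update ---
# Here gradients must flow through G, so fake is used without detach().
loss_G = bce(D(fake), torch.ones(4, 1))
opt_G.zero_grad()
loss_G.backward()
opt_G.step()
```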