Densenet canonizations #171
base: master
Conversation
- The returned handles are reversed. This way, when two canonizers change the same parameter, removing the handles in the returned order restores the original model.
- The epsilon parameter is set to 0 during canonization
- Parameter dimensions are checked before merging, to prevent attempts to merge incompatible layers, as in DenseNets.
- Minor change in MergeBatchNorm: set batch_norm.eps = 0 in the register method instead of merge_batch_norm
- Add MergeBatchNormtoRight canonizer: merges a BatchNorm into a linear layer to its right. If the convolution has padding, a bias feature map needs to be computed and added to the output of the convolution to account for the batch norm bias.
- The canonizer did not work correctly when the convolution has a bias; this has been handled.
- The hook function was made lighter by discarding unneeded overhead computation.
- Minor change in MergeBatchNormtoRight: remove an unused variable
- Add DenseNetAdaptiveAvgPoolCanonizer: makes the last adaptive average pooling of torchvision DenseNets an explicit nn.Module object
- Add ThreshReLUMergeBatchNorm: canonizer for BatchNorm -> ReLU -> Linear chains. Adds backward and forward hooks to the ReLU in order to turn it into a ThreshReLU as defined in https://github.com/AlexBinder/LRP_Pytorch_Resnets_Densenet/blob/master/canonization_doc.pdf
- Add SequentialThreshCanonizer: a composite canonizer that applies DenseNetAvgPoolCanonizer, SequentialMergeBatchNorm, ThreshReLUMergeBatchNorm
- Add ThreshSequentialCanonizer: a composite canonizer that applies DenseNetAvgPoolCanonizer, ThreshReLUMergeBatchNorm, SequentialMergeBatchNorm

The last two are the recommended canonizers for torchvision DenseNet implementations. The standard SequentialMergeBatchNorm is needed to remove the initial BN->Conv chain in the architecture. The two canonizers result in different implementations of the same function because, in practice, dense blocks contain BN->ReLU->Conv->BN->ReLU->Conv chains, which leaves the possibility of applying SequentialMergeBatchNorm inside the dense blocks if it runs first. In practice, both canonizations remove the artifacts in the attribution maps; SequentialThreshCanonizer appears to be better quantitatively. A usage sketch follows below.
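For context, a minimal usage sketch of how such a canonizer would plug into a zennit composite. The class name SequentialThreshCanonizer and its import path are taken from this PR and are assumptions that may change after the rebase:

```python
import torch
from torchvision.models import densenet121

from zennit.attribution import Gradient
from zennit.composites import EpsilonPlusFlat
# assumed import path for the canonizer proposed in this PR
from zennit.torchvision import SequentialThreshCanonizer

model = densenet121().eval()

# the canonizer rewrites the BatchNorm/ReLU structure of the DenseNet
# for the duration of the composite context
composite = EpsilonPlusFlat(canonizers=[SequentialThreshCanonizer()])

data = torch.randn(1, 3, 224, 224, requires_grad=True)
with Gradient(model=model, composite=composite) as attributor:
    output, relevance = attributor(data, torch.eye(1000)[[0]])
print(relevance.shape)  # (1, 3, 224, 224)
```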
…orrectCompositeCanonizer in the ThreshSequentialCanonizer and SequentialThreshCanonizer classes
Docs: Fix docstrings in MergeBatchNormtoRight and ThreshReLUMergeBatchNorm
Hey Galip,
sorry for the very long hold-up. Let's try to finalize this. Ultimately, we need to rebase this. Maybe you can first introduce the changes and then rebase.
module.canonization_params = {}
module.canonization_params["bias_kernel"] = bias_kernel
let's store these in the canonizer itself, similar to MergeBatchNorm.linear_params and .batch_norm_params
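A minimal sketch of what that bookkeeping could look like; only the storage pattern is shown, the actual matching and merging logic of MergeBatchNormtoRight is omitted, and the names are illustrative:

```python
class MergeBatchNormtoRight:
    '''Sketch: keep canonization state on the canonizer, not on the module.'''

    def __init__(self):
        # analogous to MergeBatchNorm.linear_params / .batch_norm_params
        self.module = None
        self.bias_kernel = None

    def register(self, module, bias_kernel):
        # remember the module and the computed bias kernel on the canonizer,
        # instead of attaching module.canonization_params to the module
        self.module = module
        self.bias_kernel = bias_kernel

    def remove(self):
        # nothing was attached to the module, so only the canonizer is reset
        self.module = None
        self.bias_kernel = None
```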
module.bias.data = (original_weight * shift).sum(dim=1) + original_bias

# change batch_norm parameters to produce identity
batch_norm.running_mean.data = torch.zeros_like(batch_norm.running_mean.data)
batch_norm.running_var.data = torch.ones_like(batch_norm.running_var.data)
batch_norm.bias.data = torch.zeros_like(batch_norm.bias.data)
batch_norm.weight.data = torch.ones_like(batch_norm.weight.data)
batch_norm.eps = 0.
these need to be adapted to the new approach (see the current version of MergeBatchNorm)
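A minimal sketch of the shadowing approach for the batch-norm identity parameters, assuming the same object.__setattr__ pattern: the registered Parameters and buffers stay untouched and are only shadowed by instance attributes.

```python
import torch

batch_norm = torch.nn.BatchNorm2d(8).eval()

# shadow the registered parameters/buffers with identity values; the originals
# remain untouched in _parameters / _buffers
object.__setattr__(batch_norm, 'running_mean', torch.zeros_like(batch_norm.running_mean))
object.__setattr__(batch_norm, 'running_var', torch.ones_like(batch_norm.running_var))
object.__setattr__(batch_norm, 'bias', torch.zeros_like(batch_norm.bias))
object.__setattr__(batch_norm, 'weight', torch.ones_like(batch_norm.weight))
batch_norm.eps = 0.

x = torch.randn(2, 8, 4, 4)
assert torch.allclose(batch_norm(x), x)  # batch norm now acts as the identity
```

One way to undo this in remove() is to delete the shadowing instance attributes with object.__delattr__(batch_norm, name), which exposes the registered originals again.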
module.canonization_params = {}
module.canonization_params["bias_kernel"] = bias_kernel
return_handles.append(module.register_forward_hook(MergeBatchNormtoRight.convhook))
For the sake of not using Hooks, maybe we can wrap and overwrite the forward function (similar to the ResNet Canonizer)?
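A rough sketch of that alternative: shadow the bound forward method with an instance attribute instead of registering a hook. Here bias_kernel is only a placeholder tensor standing in for the feature map the canonizer computes:

```python
import torch

conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
bias_kernel = torch.zeros(1, 8, 1, 1)  # placeholder for the canonizer's bias feature map

original_forward = conv.forward

def canonized_forward(x):
    # run the original convolution, then add the batch-norm bias contribution
    return original_forward(x) + bias_kernel

# assigning to the instance shadows nn.Module.forward, so nn.Module.__call__
# picks up the wrapped version without any hooks
conv.forward = canonized_forward

out = conv(torch.randn(1, 3, 16, 16))

# in remove(), deleting the instance attribute restores the original class method
del conv.forward
```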
temp_module = torch.nn.Conv2d(in_channels=module.in_channels, out_channels=module.out_channels,
                              kernel_size=module.kernel_size, padding=module.padding,
                              padding_mode=module.padding_mode, bias=False)
let's indent one line per kwarg
if isinstance(module, torch.nn.Conv2d):
    if module.padding == (0, 0):
        module.bias.data = (original_weight * shift[index]).sum(dim=[1, 2, 3]) + original_bias
this needs to be adapted to object.__setattr__(module, 'bias', (original_weight * shift[index]).sum(dim=[1, 2, 3]) + original_bias)
of instance, which is why deleting instance attributes with the same name reverts them to the original
function.
'''
self.module.features = Sequential(*list(self.module.features.children())[:-2])
If I remember correctly, you can slice Sequential, i.e. self.module.features = self.module.features[:-2]
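For reference, a quick check of the slicing behaviour (plain PyTorch, no zennit involved):

```python
import torch
from torch.nn import AdaptiveAvgPool2d, Conv2d, ReLU, Sequential

features = Sequential(Conv2d(3, 8, 3), ReLU(), AdaptiveAvgPool2d((1, 1)))

# slicing an nn.Sequential returns another nn.Sequential with the selected
# children, so the list()/unpacking roundtrip is not needed
trimmed = features[:-2]
print(type(trimmed).__name__, len(trimmed))  # Sequential 1
```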
'''
return DenseNetAdaptiveAvgPoolCanonizer()

def register(self, module, attributes):
missing docstring
for key in self.attribute_keys:
    delattr(self.module, key)

def forward(self, x):
missing docstring
return out


class DenseNetSeqThreshCanonizer(CompositeCanonizer):
missing docstring
))


class DenseNetThreshSeqCanonizer(CompositeCanonizer):
missing docstring
Hello,
Here is a summary of the contributions:
Furthermore, BN->ReLU->AvgPool->Linear chains are found and canonized using the same method, because batch normalization commutes with average pooling (a small numerical check of this is sketched at the end of this post).
6. Full proposed canonizers are added to torchvision.py. Another addition is DenseNetAdaptiveAvgPoolCanonizer, which is needed before applying other canonizers to DenseNets. It makes the final ReLU and AvgPooling layers of torchvision DenseNet objects explicit; by default, these are applied in the forward method of the model, not as nn.Module objects (see the sketch below).
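A sketch of what "making them explicit" amounts to: torchvision's DenseNet.forward applies the final ReLU and pooling functionally, so a canonizer has to append equivalent modules to model.features. The lines below (and the module names 'final_relu' / 'final_avgpool') only illustrate the idea, not the exact implementation:

```python
import torch
from torchvision.models import densenet121

model = densenet121()

# torchvision's DenseNet.forward roughly does:
#   out = F.relu(self.features(x), inplace=True)
#   out = F.adaptive_avg_pool2d(out, (1, 1))
#   out = torch.flatten(out, 1)
#   return self.classifier(out)
# Appending the functional steps as modules makes them visible to canonizers
# and LRP hooks (the forward method then has to be overwritten accordingly):
model.features.add_module('final_relu', torch.nn.ReLU(inplace=True))
model.features.add_module('final_avgpool', torch.nn.AdaptiveAvgPool2d((1, 1)))
```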
Thank you very much and I am looking forward to any kind of feedback!
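Regarding the claim above that batch normalization commutes with average pooling, here is a quick numerical check: in eval mode, BN is a fixed per-channel affine map, and averaging preserves affine maps.

```python
import torch

torch.manual_seed(0)
bn = torch.nn.BatchNorm2d(8).eval()
bn.weight.data.uniform_(0.5, 2.0)
bn.bias.data.uniform_(-1.0, 1.0)
bn.running_mean.uniform_(-1.0, 1.0)
bn.running_var.uniform_(0.5, 2.0)

pool = torch.nn.AvgPool2d(2)
x = torch.randn(4, 8, 16, 16)

# BN(AvgPool(x)) == AvgPool(BN(x)) up to floating point error
assert torch.allclose(bn(pool(x)), pool(bn(x)), atol=1e-5)
```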