
Attention Map example #7

Closed

fedegonzal opened this issue Oct 30, 2023 · 8 comments
fedegonzal commented Oct 30, 2023

Hi,

Since this repo is based on the paper "Vision Transformers Need Registers", it would be useful to publish an example that visualizes the attention maps.

Thanks for this great work!

kyegomez (Owner) commented Nov 1, 2023

@fedegonzal yes this would be a good idea, let's work together on this. Can you make an example?

aronvandepol commented

I'd also be very curious to see some examples! 🙌

github-actions bot commented: Stale issue message

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Feb 24, 2024

wylapp commented Jun 13, 2024

Hello everyone!

I managed to draw the attention map, and the artifacts do exist.

Other than the attention map, has anyone tried to draw the norms plot in Figure 3 (from "Vision Transformers Need Registers")?
I got the opposite result, where there are no outlier patches.

We observe that an important difference between “artifact” patches and other patches is the norm of their token embedding at the output of the model.

This is the only sentence in which the authors explain what an outlier patch/token is.

FYI, this is my implementation.
https://colab.research.google.com/drive/1gHDOi8RL8hHmfAJvF7IqyBF7tmosG0ko
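For anyone looking for a starting point before opening the notebook, here is a minimal sketch of pulling a CLS-to-patch attention map out of a ViT-style model. The shapes, the single leading CLS token, and the 14x14 patch grid are assumptions; if the model has register tokens, those would need to be sliced off along with the CLS token.

```python
import torch

def cls_attention_map(attn, grid_size):
    """Average CLS->patch attention over heads and reshape to a 2D grid.

    attn: [batch, heads, seq, seq] attention weights from the last block,
    where token 0 is assumed to be the CLS token (any register tokens
    would need to be excluded here as well).
    """
    cls_to_patches = attn[:, :, 0, 1:]       # [batch, heads, num_patches]
    mean_attn = cls_to_patches.mean(dim=1)   # average over heads
    return mean_attn.reshape(-1, grid_size, grid_size)

# toy example: batch of 1, 4 heads, 1 CLS token + 14x14 = 196 patches
attn = torch.softmax(torch.randn(1, 4, 197, 197), dim=-1)
amap = cls_attention_map(attn, grid_size=14)
print(amap.shape)  # [1, 14, 14], ready for plt.imshow(amap[0])
```

With a real model you would feed it the per-layer attention weights (e.g. the last element of an `attentions` tuple) instead of the random tensor above.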



zwyang6 commented Aug 17, 2024

+1 demos are necessary 🙌


TumVink commented Sep 17, 2024

(quoting @wylapp's comment above)

Heyy,
I am having a look at your script. Your attention map looks good, but why is the L2 norm based on outputs[0]? Where does it come from?


wylapp commented Sep 20, 2024

As aforementioned, the authors define the outlier patch as follows:

We observe that an important difference between “artifact” patches and other patches is the norm of their token embedding at the output of the model.

outputs[0] is the model's last hidden state, with shape [batch_size, sequence_length, embedding_size]. I think this can be taken as the "token embedding at the output of the model", so the L2 norm is computed over the last (embedding) dimension with torch.norm(outputs[0], 2, dim=-1).
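As a concrete toy version of that computation (the random tensor stands in for a real model's outputs[0]; the manually inflated token just mimics a high-norm "artifact" patch, and the 3-sigma threshold is an arbitrary choice for illustration):

```python
import torch

torch.manual_seed(0)

# stand-in for outputs[0]: [batch_size, seq_len, embed_dim]
last_hidden_state = torch.randn(1, 197, 768)
# inflate one patch token to mimic a high-norm "artifact"
last_hidden_state[0, 42] *= 20

# L2 norm over the embedding dimension -> [batch_size, seq_len]
token_norms = torch.norm(last_hidden_state, 2, dim=-1)
patch_norms = token_norms[:, 1:]  # drop the CLS token before plotting

# flag patches whose norm is far above the rest (3-sigma rule)
threshold = patch_norms.mean() + 3 * patch_norms.std()
outliers = (patch_norms > threshold).nonzero()
print(outliers)  # patch index 41, i.e. sequence position 42 minus the CLS token
```

Plotting patch_norms reshaped to the patch grid gives the norms map corresponding to Figure 3 of the paper.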


TumVink commented Sep 20, 2024

That makes sense!
However, I reproduced the high-norm outliers with the code from here on my own custom images.
