-
Hi, thanks for your interest! The implementation of PI-resize during training is here: https://github.com/google-research/big_vision/blob/main/big_vision/models/proj/flexi/vit.py#L30-L75

In words: PI-resize does not introduce any new trainable parameters. You define a learnable parameter for the patch embedding just like in regular ViT: pick any patch size (it doesn't really matter which; we use 32x32), so allocate a 32x32x3x[model-dim] buffer. Then, before passing that buffer to the conv operation for patch embedding, multiply it by the PI-resize matrix. That matrix can be computed analytically once at the start and is not trained; see the code pointer above.

I'm not sure what loss you mean - there is no need to change whatever loss you are using when "flexifying" your training loop.
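To make the "multiply by an analytically computed matrix" step concrete, here is a minimal NumPy sketch. The function names (`resize_matrix`, `pi_resize_patch_embed`) and the nearest-neighbor 2x upsampling used as the resize op are my own illustrative choices, not the big_vision API; the actual implementation is JAX and lives at the link above. The idea is: any linear resize op corresponds to a matrix B, and the PI-resize weights are obtained via its pseudo-inverse so that tokens computed from resized patches match tokens computed from the original patches.

```python
import numpy as np

def resize_matrix(resize_fn, old_shape, new_shape):
    """Build the matrix B with resize_fn(x).ravel() == B @ x.ravel(),
    by resizing each one-hot basis 'image' in turn."""
    n_in = old_shape[0] * old_shape[1]
    cols = []
    for i in range(n_in):
        basis = np.zeros(n_in)
        basis[i] = 1.0
        cols.append(resize_fn(basis.reshape(old_shape)).ravel())
    return np.stack(cols, axis=1)  # (new_h*new_w, old_h*old_w)

def pi_resize_patch_embed(w, new_size, resize_fn):
    """Resize ViT patch-embedding weights w of shape (h, w, c_in, c_out)
    with the pseudo-inverse of the resize map: w_hat = pinv(B^T) @ w."""
    h, ww, c_in, c_out = w.shape
    B = resize_matrix(resize_fn, (h, ww), new_size)
    P = np.linalg.pinv(B.T)  # applied once; nothing here is trained
    w_flat = w.reshape(h * ww, c_in * c_out)  # resize acts per channel pair
    w_hat = P @ w_flat
    return w_hat.reshape(*new_size, c_in, c_out)
```

Usage matches the description above: keep one learnable 32x32 buffer, and at each step apply `pi_resize_patch_embed` (with the target patch size for that step) before the conv. For upsampling, the token-matching property holds exactly: the inner product of a patch with the original weights equals the inner product of the resized patch with the PI-resized weights.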
-
FlexiViT is a very imaginative work.
I have also been puzzling over flexible patch sizes.
I'd like to know how the PI-resize of Section 3.4 is implemented in the code,
and how PI-resize is optimized during training.