Question About Listed ViT Models in the configs/proj/flexivit/README.md #69

Answered by lucasb-eyer
mhamzaerol asked this question in Q&A
Hi, thanks for your interest and the question!

You almost got it. For simplicity/uniformity of implementation, we also used the "underlying" patch and posemb sizes of 32 and 7 for the baseline models. Figures 17 (b) and (c) in the appendix show that this change has absolutely no effect on the results even for regular (not flexi) ViT models.

So, for the patch embeddings, you can just resize them to 16 and 30 at load time with PI-resize. For the position embeddings, resize them the usual way at load time, i.e. with (bi)linear interpolation. The code does both here: https://github.com/google-research/big_vision/blob/main/big_vision/models/proj/flexi/vit.py#L198-L206
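To make the two load-time resizes concrete, here is a minimal NumPy sketch. It is not the big_vision code linked above: the function names are hypothetical, and the hand-written linear interpolation kernel (half-pixel centers, clamped at the borders) is only a stand-in for whatever resize kernel the real implementation uses. It assumes a patch-embedding kernel of shape `(h, w, channels, dim)` and a position embedding already reshaped to a `(h, w, dim)` grid.

```python
import numpy as np


def linear_resize_matrix(old, new):
    """(new, old) matrix for 1-D linear interpolation (assumed kernel)."""
    # Sample positions in the old grid, using half-pixel centers.
    pos = (np.arange(new) + 0.5) * old / new - 0.5
    pos = np.clip(pos, 0, old - 1)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, old - 1)
    frac = pos - lo
    mat = np.zeros((new, old))
    mat[np.arange(new), lo] += 1.0 - frac
    mat[np.arange(new), hi] += frac
    return mat


def resize_posemb_grid(posemb, new_h, new_w):
    """Bilinearly resize a (h, w, dim) position-embedding grid."""
    rh = linear_resize_matrix(posemb.shape[0], new_h)
    rw = linear_resize_matrix(posemb.shape[1], new_w)
    # Bilinear interpolation is separable: apply the 1-D map along each axis.
    return np.einsum("ai,bj,ijd->abd", rh, rw, posemb)


def pi_resize_patch_embedding(kernel, new_size):
    """PI-resize a (p, p, channels, dim) patch-embedding kernel.

    If B is the linear map that resizes a patch, the new weights are
    pinv(B^T) applied to the old weights. With a separable B this
    factorizes into one small pseudoinverse per spatial axis.
    """
    old_size = kernel.shape[0]
    r = linear_resize_matrix(old_size, new_size)   # (new, old)
    p = np.linalg.pinv(r.T)                        # (new, old)
    return np.einsum("ai,bj,ijcd->abcd", p, p, kernel)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Sizes from the answer above: underlying patch size 32, posemb grid 7x7.
    kernel = rng.normal(size=(32, 32, 3, 8))
    posemb = rng.normal(size=(7, 7, 8))
    print(pi_resize_patch_embedding(kernel, 16).shape)
    print(resize_posemb_grid(posemb, 15, 15).shape)
```

With this construction, the token produced by the resized kernel on a resized patch matches the original token exactly when the patch is upsampled. When the patch size shrinks (as with 32 to 16 here), the pseudoinverse gives the least-squares best match instead.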

To be clear, I did not go a…

This discussion was converted from issue #31 on November 07, 2023 13:12.