Which layer of DINOv2 do you align with? #11
Hi, I'm trying to reproduce the training of VA-VAE. Which layer of DINOv2 do you align with?

Comments
@Luciennnnnnn We simply use the last layer.
@JingfengYao Is this right? `features = self.foundation_model.forward_features(rescale_inputs)["x_norm_patchtokens"]`
By the way, how do you align the resolution of the latent vector with the features of DINOv2?
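For reference, here is a minimal sketch (not from the repository) of extracting last-layer patch tokens from the torch.hub DINOv2 release, which is where the `x_norm_patchtokens` key in the snippet above comes from:

```python
# Minimal sketch: last-layer DINOv2 patch tokens via torch.hub.
# Input sides must be multiples of the patch size (14); 224 -> 16x16 grid.
import torch

dinov2 = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")
dinov2.eval()

x = torch.randn(1, 3, 224, 224)  # stand-in for an ImageNet-normalized batch
with torch.no_grad():
    feats = dinov2.forward_features(x)["x_norm_patchtokens"]
print(feats.shape)  # torch.Size([1, 256, 768]): 16*16 tokens, ViT-B width
```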
Here are my implementations: [code screenshot not preserved]
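As a rough sketch of the general idea (a hypothetical helper, not the authors' code): resize the image so the DINOv2 patch-token grid matches the VAE latent grid, then reshape the tokens into a spatial map. For example, a 16x-downsampling VAE on 256px images gives a 16x16 latent, and a 16 * 14 = 224 DINOv2 input gives the same 16x16 token grid:

```python
# Hypothetical helper (not the authors' code): return DINOv2 patch tokens
# reshaped to (B, C, latent_hw, latent_hw) so they align with the VAE latent.
import torch
import torch.nn.functional as F

def dino_features_like_latent(dinov2, images, latent_hw, patch_size=14):
    side = latent_hw * patch_size                  # e.g. 16 * 14 = 224
    x = F.interpolate(images, size=(side, side),
                      mode="bilinear", align_corners=False)
    tokens = dinov2.forward_features(x)["x_norm_patchtokens"]  # (B, hw*hw, C)
    b, n, c = tokens.shape
    return tokens.transpose(1, 2).reshape(b, c, latent_hw, latent_hw)
```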
Thanks! I want to use a VAE with 8x downsampling; what's your opinion on aligning the resolution?
@Luciennnnnnn DINOv2 should support resolutions between 224 and 518. In my case, I would likely begin by feeding a 448-sized image directly into DINOv2. That said, since this configuration remains untested, its efficacy cannot be ascertained at this stage.
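The arithmetic behind the 448 suggestion works out as follows (illustrative, assuming 256px training images):

```python
# Illustrative arithmetic (assuming 256px training images): an 8x VAE
# gives a 32x32 latent grid, and a 448px DINOv2 input gives the same
# 32x32 patch-token grid (448 / 14 = 32), so the two align one-to-one.
patch_size = 14
latent_hw = 256 // 8                 # 32x32 latent from an 8x VAE
dino_input = latent_hw * patch_size  # 32 * 14 = 448
assert dino_input == 448
```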
That sounds reasonable; I see REPA uses the same strategy.