Fine-tuning CLIP's Last Visual Projector: A Few-Shot Cornucopia

astra-vision/ProLIP

Mohammad Fahes¹, Tuan-Hung Vu¹,², Andrei Bursuc¹,², Patrick Pérez³, Raoul de Charette¹
¹ Inria, Paris, France.

² valeo.ai, Paris, France.

³ Kyutai, Paris, France.

TL;DR: CLIP projects visual embeddings into the shared latent space using a linear projection layer. We show that simply fine-tuning this guy (:p) can be a strong alternative to linear probing, prompt tuning, and CLIP adapters, and that it also performs well in test-time adaptation.

Stay tuned for the code!

Paper: https://arxiv.org/abs/2410.05270

ProLIP

We fine-tune only the pretrained linear projection layer of the vision encoder, with a regularization loss that keeps the fine-tuned weights close to the pretrained ones.
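Since the official code is not released yet, here is a minimal NumPy sketch of what such an objective could look like. It is not the authors' implementation. It assumes the regularizer is a squared Frobenius distance between the fine-tuned matrix `W` and the pretrained matrix `W0`, that the classification loss is a cross-entropy over scaled cosine similarities to frozen class text embeddings, and that the logit scale is 100. The function name, argument names, `lam`, and `scale` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def regularized_projection_loss(feats, W, W0, text_emb, labels, lam=1.0, scale=100.0):
    """Cross-entropy on CLIP-style logits plus a penalty pulling W towards W0.

    feats:    (N, D_in)     frozen pre-projection visual features
    W, W0:    (D_in, D_out) trainable and pretrained projection matrices
    text_emb: (K, D_out)    frozen, L2-normalized class text embeddings
    labels:   (N,)          integer class indices
    """
    img_emb = l2_normalize(feats @ W)            # project into the shared space
    logits = scale * img_emb @ text_emb.T        # scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()
    reg = np.sum((W - W0) ** 2)                  # assumed form: squared Frobenius distance
    return ce + lam * reg, ce, reg


# Toy usage with random stand-ins for CLIP features (no real model involved).
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
W0 = rng.normal(size=(16, 4))
text_emb = l2_normalize(rng.normal(size=(3, 4)))
labels = rng.integers(0, 3, size=8)

# At initialization W equals W0, so the penalty vanishes.
total_init, ce_init, reg_init = regularized_projection_loss(
    feats, W0.copy(), W0, text_emb, labels, lam=0.5
)

# Moving W away from W0 makes the penalty grow.
total_moved, ce_moved, reg_moved = regularized_projection_loss(
    feats, W0 + 0.1, W0, text_emb, labels, lam=0.5
)
```

In this sketch only `W` would receive gradient updates; the features before the projection and the text embeddings stay frozen. The weight `lam` trades off fitting the few labeled shots against staying near the pretrained projection.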

Citation

@article{fahes2024fine,
  title={Fine-Tuning CLIP's Last Visual Projector: A Few-Shot Cornucopia},
  author={Fahes, Mohammad and Vu, Tuan-Hung and Bursuc, Andrei and P{\'e}rez, Patrick and de Charette, Raoul},
  journal={arXiv preprint arXiv:2410.05270},
  year={2024}
}
