
Code for our Paper "All in an Aggregated Image for In-Image Learning"


AGI-Edgerunners/IIL


In-Image Learning

Code for the paper "All in an Aggregated Image for In-Image Learning".

Figure: IIL case

Requirements

pip install -r requirements.txt

Download Dataset

The processed dataset and demonstration examples are available from this link. After downloading, unzip the file and place the dataset directory in the project root:

----IIL
    |----dataset
    |----src
    ...
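
Before running any of the scripts, it can help to confirm that the layout above is in place. The following is a minimal sketch and not part of the repository: `check_layout` is a hypothetical helper, and it assumes only the `dataset` and `src` directory names shown in the tree.

```python
from pathlib import Path


def check_layout(root, required=("dataset", "src")):
    """Return the names of required subdirectories that are missing under root.

    `required` mirrors the directories shown in the tree above; this helper
    is illustrative and does not exist in the repository.
    """
    root = Path(root)
    return [name for name in required if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the project root (the IIL directory).
    missing = check_layout(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Layout looks fine.")
```

If `dataset` is reported missing, the archive was probably unzipped somewhere other than the project root.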

Run In-Image Learning and Baselines

In-Image Learning

python run_iil.py --exp_name exp_on_mv --dataset mathvista --lt few_shot

Visual-text interleaved in-context learning

python run_vticl.py --exp_name exp_on_mv --dataset mathvista --lt few_shot

Text-only in-context learning

python run_ticl.py --exp_name exp_on_mv --dataset mathvista --lt few_shot
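
The three commands above differ only in the script name, so they can be generated from one template when comparing methods on the same dataset. The sketch below is an assumption-laden convenience and not part of the repository: `build_command` and `SCRIPTS` are hypothetical names, and only the script names and the `--exp_name`, `--dataset`, and `--lt` flags are taken from the commands above.

```python
import shlex

# Script names as given in the commands above; the labels are descriptive only.
SCRIPTS = {
    "in-image learning": "run_iil.py",
    "visual-text interleaved in-context learning": "run_vticl.py",
    "text-only in-context learning": "run_ticl.py",
}


def build_command(script, exp_name="exp_on_mv", dataset="mathvista", lt="few_shot"):
    """Build the argument list for one run, using only the flags shown above."""
    return [
        "python", script,
        "--exp_name", exp_name,
        "--dataset", dataset,
        "--lt", lt,
    ]


if __name__ == "__main__":
    # Print the commands rather than executing them, so nothing runs by accident.
    for label, script in SCRIPTS.items():
        print(f"# {label}")
        print(shlex.join(build_command(script)))
```

To execute a run rather than print it, each list can be passed to `subprocess.run(cmd, check=True)` from the project root.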

Cite

If you find In-Image Learning useful for your research or applications, please cite it using this BibTeX entry:

@misc{wang2024single,
      title={All in a Single Image: Large Multimodal Models are In-Image Learners}, 
      author={Lei Wang and Wanyu Xu and Zhiqiang Hu and Yihuai Lan and Shan Dong and Hao Wang and Roy Ka-Wei Lee and Ee-Peng Lim},
      year={2024},
      eprint={2402.17971},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
