UMIC

This repository provides an implementation of the unreferenced image captioning metric presented in our ACL 2021 paper, "UMIC: An Unreferenced Metric for Image Captioning via Contrastive Learning".

Usage

There are 3 steps for running the code:

Download the pretrained checkpoint (about 220MB) of UMIC.
Download the pre-computed visual features (img_db) for the dataset you want to score.
Run the preprocessing code on your candidate captions to build the textual features (txt_db).

Then you can compute the scores for your image-caption pairs using compute_score.py.

1. Install Prerequisites

Create a Python 3.6 environment and then install the requirements from requirements.txt:

conda create --name umic python=3.6
conda activate umic
pip install -r requirements.txt

2. Download the Pretrained Model

Download umic.tar.gz and extract it (the default checkpoint directory in the code is ./ckpt).

3. Download the Precomputed Visual Features

Please refer to the official UNITER repository to compute the visual features for other datasets from raw images.

4. Pre-processing the Textual Features (Captions)

We provide the processed textual features for the four datasets used in the paper in the txt_db directory.
To process new captions, prepare the data as follows.

The textual feature file is a JSON file containing a list of dictionaries, each with the following keys:

'caption' : [candidate caption]
'imgid' : [image id for the caption in each dataset]

Please refer to sample.json for an example of the format.
Note that we treat each image file name as dataset_name_image_id.jpg, following the COCO dataset.
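
As a minimal sketch of the format described above, the following Python snippet builds such a list and writes it to a JSON file. The helper name write_caption_file, the output file name, and the example captions and image ids are hypothetical; only the 'caption' and 'imgid' keys come from the description above. Whether 'imgid' should be stored as an integer or a string is an assumption here, so please check sample.json before relying on it.

```python
import json
import os
import tempfile


def write_caption_file(pairs, path):
    """Write (imgid, caption) pairs as a JSON list of dictionaries.

    Hypothetical helper: only the 'caption' and 'imgid' keys come from
    the format described in this README.
    """
    records = []
    for imgid, caption in pairs:
        caption = caption.strip()
        if not caption:
            raise ValueError("empty caption for image id {!r}".format(imgid))
        records.append({"caption": caption, "imgid": imgid})
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return records


# Made-up example data; the image ids are placeholders, not real dataset ids.
example_pairs = [
    (1, "a man riding a wave on a surfboard"),
    (2, "two dogs playing in the snow "),
]
example_path = os.path.join(tempfile.gettempdir(), "my_captions.json")
example_records = write_caption_file(example_pairs, example_path)
```

The resulting file can then be passed to make_txt_db.py as the input JSON file.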

Given a '.json' file containing a list composed of these dictionaries, preprocess it with the following command. Set --img_type to the image dataset name (e.g. 'coco_val2014' for capeval1k).

python make_txt_db.py --input_file $INPUT_JSON_FILE \
                      --img_type $IMG_DATASET_NAME \
                      --out_dir $PATH_TO_OUTPUT_DIR

5. Running the Script

For each image-caption pair, compute the score using the following script. For example, to compute the scores for the COCO captioning test set, use the img_db for coco_val2014 and the txt_db built from your own prediction results. The output file is written in .json format, and --ckpt defaults to ckpt/umic.pt.

python compute_score.py --img_db $IMG_DB_DIR \
                        --txt_db $TXT_DB_DIR \
                        --out_file $OUT_FILE_NAME \
                        --ckpt $CKPT_DIR
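
To chain preprocessing and scoring from Python, the two commands can be assembled as argument lists, as in the sketch below. The helper name build_commands and all paths are hypothetical placeholders; the script names and flags are the ones shown in this README. The sketch also assumes that the directory written by make_txt_db.py (--out_dir) is the one to pass to compute_score.py as --txt_db.

```python
import shlex


def build_commands(input_json, img_type, txt_db_dir, img_db_dir, out_file,
                   ckpt="ckpt/umic.pt"):
    """Return the argument lists for preprocessing and scoring, in order."""
    make_txt_db = [
        "python", "make_txt_db.py",
        "--input_file", input_json,
        "--img_type", img_type,
        "--out_dir", txt_db_dir,
    ]
    compute_score = [
        "python", "compute_score.py",
        "--img_db", img_db_dir,
        # Assumption: scoring reads the directory written by make_txt_db.py.
        "--txt_db", txt_db_dir,
        "--out_file", out_file,
        "--ckpt", ckpt,
    ]
    return [make_txt_db, compute_score]


# Placeholder paths; replace them with your own files and directories.
commands = build_commands(
    input_json="my_captions.json",
    img_type="coco_val2014",
    txt_db_dir="txt_db/my_captions",
    img_db_dir="img_db/coco_val2014",
    out_file="scores.json",
)
for cmd in commands:
    # Print a shell-safe version of each command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually execute the steps, each list can be passed to subprocess.run(cmd, check=True) from the repository root, so that a failure in preprocessing stops the run before scoring.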

Reference

If you find this repo useful, please consider citing our ACL 2021 paper:

@inproceedings{lee-etal-2021-umic,
    title = "{UMIC}: An Unreferenced Metric for Image Captioning via Contrastive Learning",
    author = "Lee, Hwanhee  and
      Yoon, Seunghyun  and
      Dernoncourt, Franck  and
      Bui, Trung  and
      Jung, Kyomin",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-short.29",
    doi = "10.18653/v1/2021.acl-short.29",
    pages = "220--226",
}
