An unreferenced image captioning metric (ACL-21)

Last update: Nov 20, 2022

Overview

UMIC

This repository provides the code to compute UMIC, the unreferenced image captioning metric from our ACL 2021 paper UMIC: An Unreferenced Metric for Image Captioning via Contrastive Learning.

Usage (Updating the Descriptions)

Our code is based on UNITER, so please follow the UNITER installation guideline and use Docker to load UNITER. We plan to release a version that does not require Docker in the next few weeks.

1. Install Prerequisites

We used the Docker image provided by the official UNITER repo. Please install Docker by following the guideline in that repo.

2. Download the Visual Features

The COCO dataset is widely used for the image captioning task. To get the visual features for COCO captions, download the image features for the COCO validation split using the following command.

wget https://acvrpublicycchen.blob.core.windows.net/uniter/img_db/coco_val2014.tar

Please refer to the official UNITER repo to download other visual features.

3. Pre-processing the Textual Features (Captions)

The textual feature file is a Python dictionary stored in JSON format, with the following keys:
'cands' : [list of candidate captions]
'img_fs' : [list of image file names]
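
As a minimal sketch of producing such a file, the helper below writes the two keys to a JSON file. The function name `build_caption_file`, the output path, and the example caption and file name are hypothetical and are not part of this repository. It also assumes that `cands` and `img_fs` are parallel lists, with the i-th caption describing the i-th image; please check this against `compute_score.py` before relying on it.

```python
import json


def build_caption_file(cands, img_fs, out_path):
    """Write candidate captions and image file names to a JSON file.

    Assumption (not confirmed by the repository): the two lists are
    index-aligned, so cands[i] is the caption for img_fs[i].
    """
    if len(cands) != len(img_fs):
        raise ValueError("cands and img_fs must have the same length")

    data = {"cands": list(cands), "img_fs": list(img_fs)}

    # ensure_ascii=False keeps any non-ASCII caption text readable on disk.
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    return data


# Hypothetical usage (the caption and file name are made up):
# build_caption_file(
#     ["a dog runs across a grassy field"],
#     ["COCO_val2014_000000000042.jpg"],
#     "my_captions.json",
# )
```

Validating the list lengths up front is a design choice for this sketch: a mismatch would otherwise only surface later as a confusing error or as silently misaligned scores.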

4. Running the Script

Launching Docker

source launch_activate.sh $PATH_TO_STORAGE

Compute Score

python compute_score.py --data_type capeval1k \
                        --ckpt /storage/umic.pt \
                        --img_type coco_val2014

Reference

If you find this repo useful, please consider citing:

@inproceedings{lee-etal-2021-umic,
    title = "{UMIC}: An Unreferenced Metric for Image Captioning via Contrastive Learning",
    author = "Lee, Hwanhee  and
      Yoon, Seunghyun  and
      Dernoncourt, Franck  and
      Bui, Trung  and
      Jung, Kyomin",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-short.29",
    doi = "10.18653/v1/2021.acl-short.29",
    pages = "220--226",
}

Owner

hwanheelee
