PyTorch implementation of the paper "Improving Text-to-Image Synthesis Using Contrastive Learning"

Last update: Dec 31, 2022

Overview

T2I_CL

This is the official PyTorch implementation of the paper "Improving Text-to-Image Synthesis Using Contrastive Learning".

Requirements

Linux
Python ≥ 3.6
PyTorch ≥ 1.4.0
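
The requirements above can be checked before training with a short script. This is a minimal sketch, not part of the repository: the names `parse_version` and `check_environment` are hypothetical, and the only facts taken from this README are the three requirements themselves.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)    # from the Requirements list
MIN_TORCH = (1, 4, 0)  # from the Requirements list


def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into a tuple such as (1, 13, 1)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '1.4' compares equal to '1.4.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_environment():
    """Return a list of problems; an empty list means every requirement is met."""
    problems = []
    if not sys.platform.startswith("linux"):
        problems.append("platform is %r, but the README lists Linux" % sys.platform)
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python %d.%d is older than 3.6" % sys.version_info[:2])
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
    else:
        import torch
        if parse_version(torch.__version__) < MIN_TORCH:
            problems.append("PyTorch %s is older than 1.4.0" % torch.__version__)
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks OK"]:
        print(line)
```

The version parser deliberately ignores local build suffixes such as `+cu117`, so CUDA builds of PyTorch compare the same way as CPU builds.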

Prepare Data

Download the preprocessed datasets from the AttnGAN repository.

Alternatively, download them from the DM-GAN repository.

Training

Pretrain DAMSM+CL:
- For bird dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
- For coco dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 0
Train AttnGAN+CL:
- For bird dataset: python main.py --cfg cfg/bird_attn2.yml --gpu 0
- For coco dataset: python main.py --cfg cfg/coco_attn2.yml --gpu 0
Train DM-GAN+CL:
- For bird dataset: python main.py --cfg cfg/bird_DMGAN.yml --gpu 0
- For coco dataset: python main.py --cfg cfg/coco_DMGAN.yml --gpu 0
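
The commands above run in two stages: pretrain DAMSM+CL first, then train one of the two generators. A minimal sketch of a wrapper that chains them is shown here. The helper names `build_command` and `run_pipeline` are hypothetical and not part of the repository; the script names, config paths, and the `--cfg` and `--gpu` flags are copied from the commands above. The sketch does not edit the YAML configs, so any path to the pretrained encoder that a config expects still has to be set by hand.

```python
import subprocess
import sys

# Script and config path per stage, copied from the training commands above.
STAGES = {
    "damsm": ("pretrain_DAMSM.py", "cfg/DAMSM/{dataset}.yml"),
    "attngan": ("main.py", "cfg/{dataset}_attn2.yml"),
    "dmgan": ("main.py", "cfg/{dataset}_DMGAN.yml"),
}
DATASETS = ("bird", "coco")


def build_command(stage, dataset, gpu=0):
    """Return the argument list for one training stage."""
    if dataset not in DATASETS:
        raise ValueError("dataset must be one of %s" % (DATASETS,))
    script, cfg_template = STAGES[stage]
    cfg = cfg_template.format(dataset=dataset)
    return [sys.executable, script, "--cfg", cfg, "--gpu", str(gpu)]


def run_pipeline(dataset, generator="attngan", gpu=0, dry_run=True):
    """Pretrain DAMSM+CL, then train the chosen generator.

    With dry_run=True the commands are only printed, not executed.
    """
    commands = [
        build_command("damsm", dataset, gpu),
        build_command(generator, dataset, gpu),
    ]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a stage exits with an error.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline("bird", generator="dmgan", dry_run=True)
```

Keeping `dry_run=True` as the default makes it easy to inspect the exact commands before committing GPU time to them.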

Pretrained Models

DAMSM+CL for bird. Download and save it to DAMSMencoders/
DAMSM+CL for coco. Download and save it to DAMSMencoders/
AttnGAN+CL for bird. Download and save it to models/
AttnGAN+CL for coco. Download and save it to models/
DM-GAN+CL for bird. Download and save it to models/
DM-GAN+CL for coco. Download and save it to models/
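
Before evaluation it can help to confirm that the downloaded checkpoints landed in the two folders named above. The following is a small sketch under that assumption only: `prepare_checkpoint_dirs` is a hypothetical helper, and it makes no assumption about checkpoint file names because this README does not list them.

```python
from pathlib import Path

# Folder names taken from the list above.
CHECKPOINT_DIRS = ("DAMSMencoders", "models")


def prepare_checkpoint_dirs(root="."):
    """Create the checkpoint folders if needed and list the files in each."""
    report = {}
    for name in CHECKPOINT_DIRS:
        folder = Path(root) / name
        folder.mkdir(parents=True, exist_ok=True)
        report[name] = sorted(p.name for p in folder.iterdir() if p.is_file())
    return report


if __name__ == "__main__":
    for folder, files in prepare_checkpoint_dirs().items():
        print(folder, "->", files if files else "empty")
```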

Evaluation

Sample images and compute the R-precision:
- python main.py --cfg cfg/eval_bird.yml --gpu 0
- python main.py --cfg cfg/eval_coco.yml --gpu 0
Inception score:
- python inception_score_bird.py --image_folder fake_images_bird
- python inception_score_coco.py fake_images_coco
FID:
- python fid_score.py --gpu 0 --batch-size 50 --path1 real_images_bird --path2 fake_images_bird
- python fid_score.py --gpu 0 --batch-size 50 --path1 real_images_coco --path2 fake_images_coco
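
To make two of the metrics above concrete, here is a minimal pure-Python sketch of R-precision and the Inception score. It is an illustration of the standard definitions, not the code in this repository: the real scripts obtain embeddings and class probabilities from trained networks, whereas this sketch assumes they are already given as plain lists. The function names are hypothetical. R-precision is computed with R = 1, meaning an image counts as a hit when its true caption scores the highest cosine similarity among the candidates; the number of mismatched captions per image is left to the caller.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def r_precision(image_embs, true_text_embs, distractor_sets):
    """Fraction of images whose true caption outranks every distractor caption."""
    hits = 0
    for img, true_txt, distractors in zip(image_embs, true_text_embs, distractor_sets):
        true_score = cosine(img, true_txt)
        if all(true_score > cosine(img, d) for d in distractors):
            hits += 1
    return hits / len(image_embs)


def inception_score(class_probs):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y).

    class_probs holds one probability vector per generated image.
    """
    n = len(class_probs)
    k = len(class_probs[0])
    marginal = [sum(row[j] for row in class_probs) / n for j in range(k)]
    kl_sum = 0.0
    for row in class_probs:
        # Terms with p == 0 contribute nothing to the KL divergence.
        kl_sum += sum(p * math.log(p / m) for p, m in zip(row, marginal) if p > 0)
    return math.exp(kl_sum / n)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    captions = [[1.0, 0.1], [0.1, 1.0]]
    distractors = [[[0.0, 1.0]], [[1.0, 0.0]]]
    print("R-precision:", r_precision(images, captions, distractors))
    print("Inception score:", inception_score([[1.0, 0.0], [0.0, 1.0]]))
```

The Inception score is highest when each image gets a confident prediction and the predictions are spread evenly over classes, which is why the two one-hot rows in the usage example reach the maximum possible value for two classes.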

Citation

If you find this work useful in your research, please consider citing:

@article{ye2021improving,
  title={Improving Text-to-Image Synthesis Using Contrastive Learning},
  author={Ye, Hui and Yang, Xiulong and Takac, Martin and Sunderraman, Rajshekhar and Ji, Shihao},
  journal={arXiv preprint arXiv:2107.02423},
  year={2021}
}

Acknowledgements

Our work is based on AttnGAN and DM-GAN.
