MDMM - Learning multi-domain multi-modality I2I translation

Last update: Nov 04, 2022

Multi-Domain Multi-Modality I2I translation

PyTorch implementation of multi-modality image-to-image (I2I) translation across multiple domains. The project extends "Diverse Image-to-Image Translation via Disentangled Representations" (https://arxiv.org/abs/1808.00948), ECCV 2018 [DRIT]. With the disentangled representation framework, the model learns diverse image-to-image translation among multiple domains.

Contact: Hsin-Ying Lee ([email protected]) and Hung-Yu Tseng ([email protected])

Prerequisites

Python 3.5 or Python 3.6
PyTorch 0.4.0 and torchvision (https://pytorch.org/)
tensorboardX
TensorFlow (for TensorBoard usage)
A Dockerfile based on CUDA 9.0, cuDNN 7.1, and Ubuntu 16.04 is provided on the [DRIT] GitHub page.

Usage

Training

python train.py --dataroot DATAROOT --name NAME --num_domains NUM_DOMAINS --display_dir DISPLAY_DIR --result_dir RESULT_DIR --isDcontent

Testing

python test.py --dataroot DATAROOT --name NAME --num_domains NUM_DOMAINS --out_dir OUT_DIR --resume MODEL_DIR --num NUM_PER_IMG
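
As a worked example, the two commands above can be assembled with concrete values. The sketch below uses only the flags listed in this section; the paths, the run name, and the helper function names are hypothetical placeholders chosen for illustration, not values taken from the repository.

```python
import shlex


def build_train_command(dataroot, name, num_domains, display_dir, result_dir):
    """Assemble the training command from the Usage section as an argv list."""
    return [
        "python", "train.py",
        "--dataroot", str(dataroot),
        "--name", str(name),
        "--num_domains", str(num_domains),
        "--display_dir", str(display_dir),
        "--result_dir", str(result_dir),
        "--isDcontent",
    ]


def build_test_command(dataroot, name, num_domains, out_dir, model_dir, num_per_img):
    """Assemble the testing command from the Usage section as an argv list."""
    return [
        "python", "test.py",
        "--dataroot", str(dataroot),
        "--name", str(name),
        "--num_domains", str(num_domains),
        "--out_dir", str(out_dir),
        "--resume", str(model_dir),
        "--num", str(num_per_img),
    ]


def to_shell(argv):
    """Render an argv list as a single shell-quoted command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


# Hypothetical values for a three-domain dataset; adjust every path to your setup.
train_cmd = build_train_command("./datasets/art", "art_run", 3, "./logs", "./results")
test_cmd = build_test_command("./datasets/art", "art_run", 3, "./outputs", "./models/art", 5)
print(to_shell(train_cmd))
print(to_shell(test_cmd))
```

The argv lists can be passed to subprocess.run from the repository root, or the printed lines can be pasted into a shell.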

Datasets

We validate our model on two datasets:

art: three domains (real images, Monet images, and ukiyo-e images). Data can be downloaded from the CycleGAN website.
weather: four domains (sunny, cloudy, snowy, and foggy). Data is randomly selected from the Image2Weather dataset website.

The domains of a dataset should be placed in folders named "trainA, trainB, ..." in alphabetical order.
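
The folder convention can be checked before training. The following is a minimal sketch, assuming the domain folders sit directly under the dataset root and are named with a single uppercase letter suffix (trainA, trainB, ...); the function name list_domain_folders is hypothetical and not part of the repository.

```python
import string
from pathlib import Path


def list_domain_folders(dataroot, phase="train"):
    """Return the domain folders under dataroot in alphabetical order.

    Assumes the layout <dataroot>/trainA, <dataroot>/trainB, ...
    Raises ValueError if a letter is skipped (for example trainA and trainC only).
    """
    root = Path(dataroot)
    found = sorted(
        p.name
        for p in root.iterdir()
        if p.is_dir()
        and len(p.name) == len(phase) + 1
        and p.name.startswith(phase)
        and p.name[-1] in string.ascii_uppercase
    )
    # With N folders present, the expected names use the first N letters.
    expected = [phase + letter for letter in string.ascii_uppercase[:len(found)]]
    if found != expected:
        raise ValueError("expected folders {}, found {}".format(expected, found))
    return found
```

The length of the returned list is the number of domain folders found, which can be compared with the value passed as --num_domains.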

Models

The pretrained model on the art dataset

bash ./models/download_model.sh art

The pretrained model on the weather dataset

bash ./models/download_model.sh weather

Note

The feature transformation (i.e. concat 0) is not fully tested, since neither the art nor the weather dataset requires shape variations.
The hyper-parameters matter and are task-dependent. They have not been carefully tuned yet.
Feel free to contact the authors with any suggested improvements to the code.

Paper

Diverse Image-to-Image Translation via Disentangled Representations
Hsin-Ying Lee*, Hung-Yu Tseng*, Jia-Bin Huang, Maneesh Kumar Singh, and Ming-Hsuan Yang
European Conference on Computer Vision (ECCV), 2018 (oral) (* equal contribution)

Please cite our paper if you find the code or dataset useful for your research.

@inproceedings{DRIT,
  author = {Lee, Hsin-Ying and Tseng, Hung-Yu and Huang, Jia-Bin and Singh, Maneesh Kumar and Yang, Ming-Hsuan},
  booktitle = {European Conference on Computer Vision},
  title = {Diverse Image-to-Image Translation via Disentangled Representations},
  year = {2018}
}
