Official PyTorch implementation of the paper: DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample

Overview

DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample (ICCV 2021 Oral)

Project | Paper

Official PyTorch implementation of the paper: "DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample".

DeepSIM: Given a single real training image (b) and a corresponding primitive representation (a), our model learns to map between the primitive (a) to the target image (b). At inference, the original primitive (a) is manipulated by the user. Then, the manipulated primitive is passed through the network which outputs a corresponding manipulated image (e) in the real image domain.


DeepSIM was trained on a single training pair, shown to the left of each sample. First row "face" output- (left) flipping eyebrows, (right) lifting nose. Second row "dog" output- changing shape of dog's hat, removing ribbon, and making face longer. Second row "car" output- (top) adding wheel, (bottom) conversion to sports car.


DeepSIM: Image Shape Manipulation from a Single Augmented Training Sample
Yael Vinker*, Eliahu Horwitz*, Nir Zabari, Yedid Hoshen
*Equal contribution
https://arxiv.org/pdf/2007.01289

Abstract: We present DeepSIM, a generative model for conditional image manipulation based on a single image. We find that extensive augmentation is key for enabling single image training, and incorporate the use of thin-plate-spline (TPS) as an effective augmentation. Our network learns to map between a primitive representation of the image to the image itself. The choice of a primitive representation has an impact on the ease and expressiveness of the manipulations and can be automatic (e.g. edges), manual (e.g. segmentation) or hybrid such as edges on top of segmentations. At manipulation time, our generator allows for making complex image changes by modifying the primitive input representation and mapping it through the network. Our method is shown to achieve remarkable performance on image manipulation tasks.

Getting Started

Setup

  1. Clone the repo:
git clone https://github.com/eliahuhorwitz/DeepSIM.git
cd DeepSIM
  1. Create a new environment and install the libraries:
python3.7 -m venv deepsim_venv
source deepsim_venv/bin/activate
pip install -r requirements.txt


Training

The input primitive used for training should be specified using --primitive and can be one of the following:

  1. "seg" - train using segmentation only
  2. "edges" - train using edges only
  3. "seg_edges" - train using a combination of edges and segmentation
  4. "manual" - could be anything (for example, a painting)

For the chosen option, a suitable input file should be provided under /"train_" (e.g. ./datasets/car/train_seg). For automatic edges, you can leave the "train_edges" folder empty, and an edge map will be generated automatically. Note that for the segmentation primitive option, you must verify that the input at test time fits exactly the input at train time in terms of colors.

To train on CPU please specify --gpu_ids '-1'.

  • Train DeepSIM on the "face" video using both edges and segmentations (bash ./scripts/train_face_vid_seg_edges.sh):
#!./scripts/train_face_vid_seg_edges.sh
python3.7 ./train.py --dataroot ./datasets/face_video --primitive seg_edges --no_instance --tps_aug 1 --name DeepSIMFaceVideo
  • Train DeepSIM on the "car" image using segmentation only (bash ./scripts/train_car_seg.sh):
#!./scripts/train_car_seg.sh
python3.7 ./train.py --dataroot ./datasets/car --primitive seg --no_instance --tps_aug 1 --name DeepSIMCar
  • Train DeepSIM on the "face" image using edges only (bash ./scripts/train_face_edges.sh):
#!./scripts/train_face_edges.sh
python3.7 ./train.py --dataroot ./datasets/face --primitive edges --no_instance --tps_aug 1 --name DeepSIMFace

Testing

  • Test DeepSIM on the "face" video using both edges and segmentations (bash ./scripts/test_face_vid_seg_edges.sh):
#!./scripts/test_face_vid_seg_edges.sh
python3.7 ./test.py --dataroot ./datasets/face_video --primitive seg_edges --phase "test" --no_instance --name DeepSIMFaceVideo --vid_mode 1 --test_canny_sigma 0.5
  • Test DeepSIM on the "car" image using segmentation only (bash ./scripts/test_car_seg.sh):
#!./scripts/test_car_seg.sh
python3.7 ./test.py --dataroot ./datasets/car --primitive seg --phase "test" --no_instance --name DeepSIMCar
  • Test DeepSIM on the "face" image using edges only (bash ./scripts/test_face_edges.sh):
#!./scripts/test_face_edges.sh
python3.7 ./test.py --dataroot ./datasets/face --primitive edges --phase "test" --no_instance --name DeepSIMFace

Additional Augmentations

As shown in the supplementary, adding augmentations on top of TPS may lead to better results

  • Train DeepSIM on the "face" video using both edges and segmentations with sheer, rotations, "cutmix", and canny sigma augmentations (bash ./scripts/train_face_vid_seg_edges_all_augmentations.sh):
#!./scripts/train_face_vid_seg_edges_all_augmentations.sh
python3.7 ./train.py --dataroot ./datasets/face_video --primitive seg_edges --no_instance --tps_aug 1 --name DeepSIMFaceVideoAugmentations --cutmix_aug 1 --affine_aug "shearx_sheary_rotation" --canny_aug 1
  • When using edges or seg_edges, it may be beneficial to have white edges instead of black ones, to do so add the --canny_color 1 option
  • Check ./options/base_options.py for more augmentation related settings
  • When using edges or seg_edges and adding edges manually at test time, it may be beneficial to apply "skeletonize" (e.g skimage skeletonize )on the edges in order for them to resemble the canny edges

More Results

Top row - primitive images. Left - original pair used for training. Center- switching the positions between the two rightmost cars. Right- removing the leftmost car and inpainting the background.


The leftmost column shows the source image, then each column demonstrate the result of our model when trained on the specified primitive. We manipulated the image primitives, adding a right eye, changing the point of view and shortening the beak. Our results are presented next to each manipulated primitive. The combined primitive performed best on high-level changes (e.g. the eye), and low-level changes (e.g. the background).


On the left is the training image pair, in the middle are the manipulated primitives and on the right are the manipulated outputs- left to right: dress length, strapless, wrap around the neck.

Single Image Animation

Animation to Video

Video to Animation

Citation

If you find this useful for your research, please use the following.

@InProceedings{Vinker_2021_ICCV,
    author    = {Vinker, Yael and Horwitz, Eliahu and Zabari, Nir and Hoshen, Yedid},
    title     = {Image Shape Manipulation From a Single Augmented Training Sample},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {13769-13778}
}

Acknowledgments

Codes for our IJCAI21 paper: Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization

DDAMS This is the pytorch code for our IJCAI 2021 paper Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization [Arxiv Pr

xcfeng 55 Dec 27, 2022
A best practice for tensorflow project template architecture.

A best practice for tensorflow project template architecture.

Mahmoud Gamal Salem 3.6k Dec 22, 2022
Anomaly Localization in Model Gradients Under Backdoor Attacks Against Federated Learning

Federated_Learning This repo provides a federated learning framework that allows to carry out backdoor attacks under varying conditions. This is a ker

Arçelik ARGE Açık Kaynak Yazılım Organizasyonu 0 Nov 30, 2021
details on efforts to dump the Watermelon Games Paprium cart

Reminder, if you like these repos, fork them so they don't disappear https://github.com/ArcadeHustle/WatermelonPapriumDump/fork Big thanks to Fonzie f

Hustle Arcade 29 Dec 11, 2022
This is the replication package for paper submission: Towards Training Reproducible Deep Learning Models.

This is the replication package for paper submission: Towards Training Reproducible Deep Learning Models.

0 Feb 02, 2022
Byzantine-robust decentralized learning via self-centered clipping

Byzantine-robust decentralized learning via self-centered clipping In this paper, we study the challenging task of Byzantine-robust decentralized trai

EPFL Machine Learning and Optimization Laboratory 4 Aug 27, 2022
Tooling for converting STAC metadata to ODC data model

手语识别 0、使用到的模型 (1). openpose,作者:CMU-Perceptual-Computing-Lab https://github.com/CMU-Perceptual-Computing-Lab/openpose (2). 图像分类classification,作者:Bubbl

Open Data Cube 65 Dec 20, 2022
FACIAL: Synthesizing Dynamic Talking Face With Implicit Attribute Learning. ICCV, 2021.

FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning PyTorch implementation for the paper: FACIAL: Synthesizing Dynamic Talking

226 Jan 08, 2023
ESL: Event-based Structured Light

ESL: Event-based Structured Light Video (click on the image) This is the code for the 2021 3DV paper ESL: Event-based Structured Light by Manasi Mugli

Robotics and Perception Group 29 Oct 24, 2022
9th place solution in "Santa 2020 - The Candy Cane Contest"

Santa 2020 - The Candy Cane Contest My solution in this Kaggle competition "Santa 2020 - The Candy Cane Contest", 9th place. Basic Strategy In this co

toshi_k 22 Nov 26, 2021
A script that trains a model to recognize handwritten digits using the MNIST data set.

handwritten-digits-recognition A script that trains a model to recognize handwritten digits using the MNIST data set. Then it loads external files and

Hamza Sayih 1 Oct 30, 2021
Implements Stacked-RNN in numpy and torch with manual forward and backward functions

Recurrent Neural Networks Implements simple recurrent network and a stacked recurrent network in numpy and torch respectively. Both flavours implement

Vishal R 1 Nov 16, 2021
Reproduction process of AlexNet

PaddlePaddle论文复现杂谈 背景 注:该repo基于PaddlePaddle,对AlexNet进行复现。时间仓促,难免有所疏漏,如果问题或者想法,欢迎随时提issue一块交流。 飞桨论文复现赛地址:https://aistudio.baidu.com/aistudio/competitio

19 Nov 29, 2022
CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer

CSAW-M This repository contains code for CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer. Source code for tr

Yue Liu 7 Oct 11, 2022
Open source code for Paper "A Co-Interactive Transformer for Joint Slot Filling and Intent Detection"

A Co-Interactive Transformer for Joint Slot Filling and Intent Detection This repository contains the PyTorch implementation of the paper: A Co-Intera

67 Dec 05, 2022
Code and experiments for "Deep Neural Networks for Rank Consistent Ordinal Regression based on Conditional Probabilities"

corn-ordinal-neuralnet This repository contains the orginal model code and experiment logs for the paper "Deep Neural Networks for Rank Consistent Ord

Raschka Research Group 14 Dec 27, 2022
This solves the autonomous driving issue which is supported by deep learning technology. Given a video, it splits into images and predicts the angle of turning for each frame.

Self Driving Car An autonomous car (also known as a driverless car, self-driving car, and robotic car) is a vehicle that is capable of sensing its env

Sagor Saha 4 Sep 04, 2021
Relative Human dataset, CVPR 2022

Relative Human (RH) contains multi-person in-the-wild RGB images with rich human annotations, including: Depth layers (DLs): relative depth relationsh

Yu Sun 112 Dec 02, 2022
Chinese Mandarin tts text-to-speech 中文 (普通话) 语音 合成 , by fastspeech 2 , implemented in pytorch, using waveglow as vocoder,

Chinese mandarin text to speech based on Fastspeech2 and Unet This is a modification and adpation of fastspeech2 to mandrin(普通话). Many modifications t

291 Jan 02, 2023
Using Hotel Data to predict High Value And Potential VIP Guests

Description Using hotel data and AI to predict high value guests and potential VIP guests. Hotel can leverage on prediction resutls to run more effect

HCG 12 Feb 14, 2022