Official implementation for paper: A Latent Transformer for Disentangled Face Editing in Images and Videos.

Last update: Dec 09, 2022

Overview

A Latent Transformer for Disentangled Face Editing in Images and Videos

[Video Editing Results]

Requirements

Dependencies

Python 3.6
PyTorch 1.8
OpenCV
tensorboard_logger

You can create a new environment for this repo by running:

```
conda env create -f environment.yml
conda activate lattrans
```
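
Once the environment is active, a quick check such as the sketch below can confirm that the listed dependencies are importable. It is not part of this repo; the import names `torch`, `cv2` and `tensorboard_logger` are assumed to be the usual module names for the packages above.

```python
import importlib.util

# Assumed import names for the dependencies listed above.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "OpenCV": "cv2",
    "tensorboard_logger": "tensorboard_logger",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the display names of packages whose module cannot be found."""
    return [name for name, module in modules.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = find_missing()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only looks the modules up without importing them, so it stays fast even when PyTorch is installed.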

Prepare StyleGAN2 encoder and generator

We use the pretrained StyleGAN2 encoder and generator released with the paper Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation. Download the official implementation into the pixel2style2pixel/ directory and save the pretrained model in pixel2style2pixel/pretrained_models/.

To save the latent codes to the desired path, we slightly modify pixel2style2pixel/scripts/inference.py:

```
# modify run_on_batch(): also return the latent codes
if opts.latent_mask is None:
    result_batch = net(inputs, randomize_noise=False, resize=opts.resize_outputs, return_latents=True)

# modify run(): save the latent codes of each batch as a .npy file
tic = time.time()
result_batch, latent_batch = run_on_batch(input_cuda, net, opts)
latent_save_path = os.path.join(test_opts.exp_dir, 'latent_code_%05d.npy' % global_i)
np.save(latent_save_path, latent_batch.cpu().numpy())
toc = time.time()
```
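
The saved files can then be gathered into a single array. The following is a minimal sketch, not code from this repo: `load_latent_codes` is a hypothetical helper, it relies only on the `latent_code_%05d.npy` naming used above, and the `(1, 18, 512)` shape of the dummy codes in the demo is an assumption about the encoder output.

```python
import glob
import os
import tempfile

import numpy as np


def load_latent_codes(latent_dir):
    """Load every latent_code_*.npy file in index order and join them on the batch axis."""
    # Zero-padded indices mean a plain lexicographic sort matches the numeric order.
    paths = sorted(glob.glob(os.path.join(latent_dir, "latent_code_*.npy")))
    if not paths:
        raise FileNotFoundError(f"no latent codes found in {latent_dir}")
    return np.concatenate([np.load(path) for path in paths], axis=0)


if __name__ == "__main__":
    # Demo on dummy codes; the (1, 18, 512) shape is an assumption, not read from the repo.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            dummy = np.full((1, 18, 512), i, dtype=np.float32)
            np.save(os.path.join(tmp, "latent_code_%05d.npy" % i), dummy)
        codes = load_latent_codes(tmp)
        print(codes.shape)
```

Concatenating on axis 0 also works if the encoder was run with a batch size larger than 1, since each file then simply contributes several rows.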

Training

Prepare the training data

To train the latent transformers, you can download our prepared dataset to the directory data/ and the pretrained latent classifier to the directory models/.
```
sh download.sh
```
You can also prepare your own training data. To do so, map your dataset to latent codes using the StyleGAN2 encoder; the corresponding label file is also required. You can keep using our pretrained latent classifier, or train your own latent classifier on new labels with pretraining/latent_classifier.py.

Training

You can modify the training options of the config file in the directory configs/.
```
python train.py --config 001 
```

Testing

Single Attribute Manipulation

Make sure that the latent classifier is downloaded to the directory models/ and that the StyleGAN2 encoder is prepared as required. After training your latent transformers, you can use test.py to run them on the images in the test directory data/test/. We also provide several pretrained models (run download.sh to download them). The output images are saved in the folder outputs/. You can change the desired attribute with --attr.

```
python test.py --config 001 --attr Eyeglasses --out_path ./outputs/
```

If you want to test the model on your own images, you first need to encode them into the latent space of StyleGAN using the pretrained encoder.

```
cd pixel2style2pixel/
python scripts/inference.py \
--checkpoint_path=pretrained_models/psp_ffhq_encode.pt \
--data_path=../data/test/ \
--exp_dir=../data/test/ \
--test_batch_size=1
```
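
If you prefer to launch the encoding step from Python, for example to process several image folders, the command above could be wrapped as in the sketch below. The helpers `build_inference_command` and `encode_images` are hypothetical names and not part of this repo; they only reuse the flags shown above.

```python
import subprocess


def build_inference_command(data_dir,
                            checkpoint="pretrained_models/psp_ffhq_encode.pt",
                            batch_size=1):
    """Build the inference.py call; images are read from and latents written to data_dir."""
    return [
        "python", "scripts/inference.py",
        f"--checkpoint_path={checkpoint}",
        f"--data_path={data_dir}",
        f"--exp_dir={data_dir}",
        f"--test_batch_size={batch_size}",
    ]


def encode_images(data_dir, psp_root="pixel2style2pixel"):
    """Run the encoder from inside the pSp directory; raises if the script fails."""
    # Relative paths in data_dir are resolved from psp_root, as with `cd` in the shell version.
    subprocess.run(build_inference_command(data_dir), cwd=psp_root, check=True)
```

Calling `encode_images("../data/test/")` from the repo root would then mirror the shell command, assuming the encoder has been set up as described earlier.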

Sequential Attribute Manipulation

You can reproduce the sequential editing results in the paper using notebooks/figure_sequential_edit.ipynb and the results in the supplementary material using notebooks/figure_supplementary.ipynb.

We also provide an interactive visualization notebooks/visu_manipulation.ipynb, where the user can choose the desired attributes for manipulation and define the magnitude of edit for each attribute.
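
To give an intuition for sequential editing with per-attribute magnitudes, here is a toy sketch. It assumes each edit has the form w + alpha * f(w), uses random linear maps as stand-ins for trained latent transformers, and the attribute name "Smiling" is only an illustrative placeholder. It is not the code used in the notebooks.

```python
import numpy as np


def edit_sequentially(w, transformers, edits):
    """Apply (attribute, magnitude) edits in order; each step sees the already-edited code."""
    for attr, alpha in edits:
        w = w + alpha * transformers[attr](w)
    return w


def make_linear(weight):
    """Toy stand-in for a trained latent transformer: a single linear map."""
    return lambda w: w @ weight


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 8  # toy size; real latent codes are much larger
    transformers = {
        "Eyeglasses": make_linear(rng.normal(scale=0.1, size=(dim, dim))),
        "Smiling": make_linear(rng.normal(scale=0.1, size=(dim, dim))),  # hypothetical attribute
    }
    w = rng.normal(size=dim)
    edited = edit_sequentially(w, transformers, [("Eyeglasses", 1.0), ("Smiling", -0.5)])
    print(np.linalg.norm(edited - w))
```

Because each step takes the current code as input, the order of the edits can matter, and a magnitude of 0 leaves the code unchanged.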

Video Manipulation

We provide a script to perform attribute manipulation on the videos in the test directory data/video/. Please ensure that the StyleGAN2 encoder is prepared as required. You can add your own video and modify the options in run_video_manip.sh. Our video editing results are presented in the paper.

```
sh run_video_manip.sh
```

Citation

@article{yao2021latent,
  title={A Latent Transformer for Disentangled Face Editing in Images and Videos},
  author={Yao, Xu and Newson, Alasdair and Gousseau, Yann and Hellier, Pierre},
  journal={2021 International Conference on Computer Vision},
  year={2021}
}

License

This source code is made available under the license found in the LICENSE.txt in the root directory of this source tree.

Owner

InterDigital
