GAN encoders in PyTorch that could match PGGAN, StyleGAN v1/v2, and BigGAN. Code also integrates the implementation of these GANs.

Overview

MTV-TSA: Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions.

Python 3.7.3 PyTorch 1.8.1 Apache-2.0

cxx1 cxx2 msk dy zy

This is the official code release for "Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions".

The code contains a set of encoders that match pre-trained GANs (PGGAN, StyleGANv1, StyleGANv2, BigGAN) via multi-scale vectors with two-scale attentions.

Usage

  • training encoder with center attentions (align image)

python E_align.py

  • training encoder with Gram-based attentions (misalign image)

python E_mis_align.py

  • embedding real images to latent space (using StyleGANv1 and w).

    a. You can put real images at './checkpoint/realimg_file/' (default file as args.img_dir)

    b. You should load pre-trained Encoder at './checkpoint/E/E_blur(case2)_styleganv1_FFHQ_state_dict.pth'

    c. Then run:

python embedding_img.py

  • discovering attribute directions with latent space : embedded_img_processing.py

Note: Pre-trained Model should be download first , and default save to './chechpoint/'

Metric

  • validate performance (Pre-trained GANs and baseline)

    1. using generations.py to generate reconstructed images (generate GANs images if needed)
    2. Files in the directory "./baseline/" could help you to quickly format images and latent vectors (w).
    3. Put comparing images to different files, and run comparing-baseline.py
  • ablation study : look at ''./ablations-study/''

Setup

Encoders

  • Case 1: Training most pre-trained GANs with encoders. at './model/E/E.py' (quickly converge for reconstructed GANs' image)
  • Case 2: Training StyleGANv1 on FFHQ for ablation study and real face image process at './model/E/E_Blur.py' (margin blur and more GPU memory)

Pre-Trained GANs

note: put pre-trained GANs weight file at ''./checkpoint/' directory

  • StyleGAN_V1 (should contain 3 files: Gm, Gs, center-tensor):
    • Cat 256:
      • ./checkpoint/stylegan_V1/cat/cat256_Gs_dict.pth
      • ./checkpoint/stylegan_V1/cat/cat256_Gm_dict.pth
      • ./checkpoint/stylegan_V1/cat/cat256_tensor.pt
    • Car 256: same above
    • Bedroom 256:
  • StyleGAN_V2 (Only one files : pth):
    • FFHQ 1024:
      • ./checkpoint/stylegan_V2/stylegan2_ffhq1024.pth
  • PGGAN ((Only one files : pth)):
    • Horse 256:
      • ./checkpoint/PGGAN/
  • BigGAN (Two files : model as .pt and config as .json ):
    • Image-Net 256:
      • ./checkpoint/biggan/256/G-256.pt
      • ./checkpoint/biggan/256/biggan-deep-256-config.json

Options and Setting

note: different GANs should set different parameters carefully.

  • choose --mtype for StyleGANv1=1, StyleGANv2=2, PGGAN=3, BIGGAN=4
  • choose Encoder start_features (--z_dim) carefully, the value are: 16->1024x1024, 32->512x512, 64->256x256
  • if go on training, set --checkpoint_dir_E which path save pre-trained Encoder model
  • --checkpoint_dir_GAN is needed, StyleGANv1 is a directory(contains 3 filers: Gm, Gs, center-tensor) , others are file path (.pth or .pt)
    parser = argparse.ArgumentParser(description='the training args')
    parser.add_argument('--iterations', type=int, default=210000) # epoch = iterations//30000
    parser.add_argument('--lr', type=float, default=0.0015)
    parser.add_argument('--beta_1', type=float, default=0.0)
    parser.add_argument('--batch_size', type=int, default=2)
    parser.add_argument('--experiment_dir', default=None) #None
    parser.add_argument('--checkpoint_dir_GAN', default='./checkpoint/stylegan_v2/stylegan2_ffhq1024.pth') #None  ./checkpoint/stylegan_v1/ffhq1024/ or ./checkpoint/stylegan_v2/stylegan2_ffhq1024.pth or ./checkpoint/biggan/256/G-256.pt
    parser.add_argument('--config_dir', default='./checkpoint/biggan/256/biggan-deep-256-config.json') # BigGAN needs it
    parser.add_argument('--checkpoint_dir_E', default=None)
    parser.add_argument('--img_size',type=int, default=1024)
    parser.add_argument('--img_channels', type=int, default=3)# RGB:3 ,L:1
    parser.add_argument('--z_dim', type=int, default=512) # PGGAN , StyleGANs are 512. BIGGAN is 128
    parser.add_argument('--mtype', type=int, default=2) # StyleGANv1=1, StyleGANv2=2, PGGAN=3, BigGAN=4
    parser.add_argument('--start_features', type=int, default=16)  # 16->1024 32->512 64->256

Pre-trained Model

We offered pre-trainned GANs and their corresponding encoders here: models (default setting is the case1 ).

GANs:

  • StyleGANv1-(FFHQ1024, Car512, Cat256) models which contain 3 files Gm, Gs and center-tensor.
  • PGGAN and StyleGANv2. A single .pth file gets Gm, Gs and center-tensor together.
  • BigGAN 128x128 ,256x256, and 512x512: each type contain a config file and model (.pt)

Encoders:

  • StyleGANv1 FFHQ (case 2) for real-image embedding and process.
  • StyleGANv2 LSUN Cat 256, they are one models from case 1 (Grad-CAM based attentions) and both models from case 2 (Grad-Cam based and Center-aligned Attentions for ablation study):
  • StyleGANv2 FFHQ (case 1)
  • Biggan-256 (case 1)

If you want to try more GANs, cite more pre-trained GANs below:

Acknowledgements

Pre-trained GANs:

StyleGANv1: https://github.com/podgorskiy/StyleGan.git, ( Converting code for official pre-trained model is here: https://github.com/podgorskiy/StyleGAN_Blobless.git) StyleGANv2 and PGGAN: https://github.com/genforce/genforce.git BigGAN: https://github.com/huggingface/pytorch-pretrained-BigGAN

Comparing Works:

In-Domain GAN: https://github.com/genforce/idinvert_pytorch pSp: https://github.com/eladrich/pixel2style2pixel ALAE: https://github.com/podgorskiy/ALAE.git

Related Works:

Grad-CAM & Grad-CAM++: https://github.com/yizt/Grad-CAM.pytorch SSIM Index: https://github.com/Po-Hsun-Su/pytorch-ssim

Our method implementation partly borrow from the above works (ALAE and Related Works). We would like to thank those authors.

If you have any questions, please contact us by E-mail ( [email protected]). Pull request or any comment is also welcome.

License

The code of this repository is released under the Apache 2.0 license.
The directories models/biggan and models/stylegan2 are provided under the MIT license.

Cite

@misc{yu2021adaptable,
      title={Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions}, 
      author={Cheng Yu and Wenmin Wang},
      year={2021},
      eprint={2108.10201},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

简体中文:

如何应用于编辑人脸

Owner
owl
Be a strong man & Try to be a great man
owl
Highly comparative time-series analysis

〰️ hctsa 〰️ : highly comparative time-series analysis hctsa is a software package for running highly comparative time-series analysis using Matlab (fu

Ben Fulcher 569 Dec 21, 2022
Tensorflow Implementation of the paper "Spectral Normalization for Generative Adversarial Networks" (ICML 2017 workshop)

tf-SNDCGAN Tensorflow implementation of the paper "Spectral Normalization for Generative Adversarial Networks" (https://www.researchgate.net/publicati

Nhat M. Nguyen 248 Nov 25, 2022
Learnable Motion Coherence for Correspondence Pruning

Learnable Motion Coherence for Correspondence Pruning Yuan Liu, Lingjie Liu, Cheng Lin, Zhen Dong, Wenping Wang Project Page Any questions or discussi

liuyuan 41 Nov 30, 2022
repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments

repro_eval repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments. The measures were d

IR Group at Technische Hochschule Köln 9 May 25, 2022
High dimensional black-box optimizer using Latent Action Monte Carlo Tree Search algorithm

LA-MCTS The code is based of paper Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search. Component LA-MCTS has thr

Meta Research 18 Oct 24, 2022
Flybirds - BDD-driven natural language automated testing framework, present by Trip Flight

Flybird | English Version 行为驱动开发(Behavior-driven development,缩写BDD),是一种软件过程的思想或者

Ctrip, Inc. 706 Dec 30, 2022
Single Image Super-Resolution (SISR) with SRResNet, EDSR and SRGAN

Single Image Super-Resolution (SISR) with SRResNet, EDSR and SRGAN Introduction Image super-resolution (SR) is the process of recovering high-resoluti

8 Apr 15, 2022
RoIAlign & crop_and_resize for PyTorch

RoIAlign for PyTorch This is a PyTorch version of RoIAlign. This implementation is based on crop_and_resize and supports both forward and backward on

Long Chen 530 Jan 07, 2023
This is the solution for 2nd rank in Kaggle competition: Feedback Prize - Evaluating Student Writing.

Feedback Prize - Evaluating Student Writing This is the solution for 2nd rank in Kaggle competition: Feedback Prize - Evaluating Student Writing. The

Udbhav Bamba 41 Dec 14, 2022
Tutoriais publicados nas nossas redes sociais para obtenção de dados, análises simples e outras tarefas relevantes no mercado financeiro.

Tutoriais Públicos Tutoriais publicados nas nossas redes sociais para obtenção de dados, análises simples e outras tarefas relevantes no mercado finan

Trading com Dados 68 Oct 15, 2022
Additional functionality for use with fastai’s medical imaging module

fmi Adding additional functionality to fastai's medical imaging module To learn more about medical imaging using Fastai you can view my blog Install g

14 Oct 31, 2022
Vision Transformer for 3D medical image registration (Pytorch).

ViT-V-Net: Vision Transformer for Volumetric Medical Image Registration keywords: vision transformer, convolutional neural networks, image registratio

Junyu Chen 192 Dec 20, 2022
A semismooth Newton method for elliptic PDE-constrained optimization

sNewton4PDEOpt The Python module implements a semismooth Newton method for solving finite-element discretizations of the strongly convex, linear ellip

2 Dec 08, 2022
Semi-supervised Video Deraining with Dynamical Rain Generator (CVPR, 2021, Pytorch)

S2VD Semi-supervised Video Deraining with Dynamical Rain Generator (CVPR, 2021) Requirements and Dependencies Ubuntu 16.04, cuda 10.0 Python 3.6.10, P

Zongsheng Yue 53 Nov 23, 2022
The first dataset of composite images with rationality score indicating whether the object placement in a composite image is reasonable.

Object-Placement-Assessment-Dataset-OPA Object-Placement-Assessment (OPA) is to verify whether a composite image is plausible in terms of the object p

BCMI 53 Nov 15, 2022
Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

ConSERT Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer Requirements torch==1.6.0

Yan Yuanmeng 478 Dec 25, 2022
Baselines for TrajNet++

TrajNet++ : The Trajectory Forecasting Framework PyTorch implementation of Human Trajectory Forecasting in Crowds: A Deep Learning Perspective TrajNet

VITA lab at EPFL 183 Jan 05, 2023
The modify PyTorch version of Siam-trackers which are speed-up by TensorRT.

SiamTracker-with-TensorRT The modify PyTorch version of Siam-trackers which are speed-up by TensorRT or ONNX. [Updating...] Examples demonstrating how

9 Dec 13, 2022
Implementation of Learning Gradient Fields for Molecular Conformation Generation (ICML 2021).

[PDF] | [Slides] The official implementation of Learning Gradient Fields for Molecular Conformation Generation (ICML 2021 Long talk) Installation Inst

MilaGraph 117 Dec 09, 2022
Sequence to Sequence Models with PyTorch

Sequence to Sequence models with PyTorch This repository contains implementations of Sequence to Sequence (Seq2Seq) models in PyTorch At present it ha

Sandeep Subramanian 708 Dec 19, 2022