A denoising autoencoder + adversarial losses and attention mechanisms for face swapping.

Overview

faceswap-GAN

Adding Adversarial loss and perceptual loss (VGGface) to deepfakes'(reddit user) auto-encoder architecture.

Updates

Date    Update
2018-08-27     Colab support: A colab notebook for faceswap-GAN v2.2 is provided.
2018-07-25     Data preparation: Add a new notebook for video pre-processing in which MTCNN is used for face detection as well as face alignment.
2018-06-29     Model architecture: faceswap-GAN v2.2 now supports different output resolutions: 64x64, 128x128, and 256x256. Default RESOLUTION = 64 can be changed in the config cell of v2.2 notebook.
2018-06-25     New version: faceswap-GAN v2.2 has been released. The main improvements of v2.2 model are its capability of generating realistic and consistent eye movements (results are shown below, or Ctrl+F for eyes), as well as higher video quality with face alignment.
2018-06-06     Model architecture: Add a self-attention mechanism proposed in SAGAN into V2 GAN model. (Note: There is still no official code release for SAGAN, the implementation in this repo. could be wrong. We'll keep an eye on it.)

Google Colab support

Here is a playground notebook for faceswap-GAN v2.2 on Google Colab. Users can train their own model in the browser.

[Update 2019/10/04] There seems to be import errors in the latest Colab environment due to inconsistent version of packages. Please make sure that the Keras and TensorFlow follow the version number shown in the requirement section below.

Descriptions

faceswap-GAN v2.2

  • FaceSwap_GAN_v2.2_train_test.ipynb

    • Notebook for model training of faceswap-GAN model version 2.2.
    • This notebook also provides code for still image transformation at the bottom.
    • Require additional training images generated through prep_binary_masks.ipynb.
  • FaceSwap_GAN_v2.2_video_conversion.ipynb

    • Notebook for video conversion of faceswap-GAN model version 2.2.
    • Face alignment using 5-points landmarks is introduced to video conversion.
  • prep_binary_masks.ipynb

    • Notebook for training data preprocessing. Output binary masks are save in ./binary_masks/faceA_eyes and ./binary_masks/faceB_eyes folders.
    • Require face_alignment package. (An alternative method for generating binary masks (not requiring face_alignment and dlib packages) can be found in MTCNN_video_face_detection_alignment.ipynb.)
  • MTCNN_video_face_detection_alignment.ipynb

    • This notebook performs face detection/alignment on the input video.
    • Detected faces are saved in ./faces/raw_faces and ./faces/aligned_faces for non-aligned/aligned results respectively.
    • Crude eyes binary masks are also generated and saved in ./faces/binary_masks_eyes. These binary masks can serve as a suboptimal alternative to masks generated through prep_binary_masks.ipynb.

Usage

  1. Run MTCNN_video_face_detection_alignment.ipynb to extract faces from videos. Manually move/rename the aligned face images into ./faceA/ or ./faceB/ folders.
  2. Run prep_binary_masks.ipynb to generate binary masks of training images.
    • You can skip this pre-processing step by (1) setting use_bm_eyes=False in the config cell of the train_test notebook, or (2) use low-quality binary masks generated in step 1.
  3. Run FaceSwap_GAN_v2.2_train_test.ipynb to train models.
  4. Run FaceSwap_GAN_v2.2_video_conversion.ipynb to create videos using the trained models in step 3.

Miscellaneous

Training data format

  • Face images are supposed to be in ./faceA/ or ./faceB/ folder for each taeget respectively.
  • Images will be resized to 256x256 during training.

Generative adversarial networks for face swapping

1. Architecture

enc_arch3d

dec_arch3d

dis_arch3d

2. Results

  • Improved output quality: Adversarial loss improves reconstruction quality of generated images. trump_cage

  • Additional results: This image shows 160 random results generated by v2 GAN with self-attention mechanism (image format: source -> mask -> transformed).

  • Evaluations: Evaluations of the output quality on Trump/Cage dataset can be found here.

The Trump/Cage images are obtained from the reddit user deepfakes' project on pastebin.com.

3. Features

  • VGGFace perceptual loss: Perceptual loss improves direction of eyeballs to be more realistic and consistent with input face. It also smoothes out artifacts in the segmentation mask, resulting higher output quality.

  • Attention mask: Model predicts an attention mask that helps on handling occlusion, eliminating artifacts, and producing natrual skin tone.

  • Configurable input/output resolution (v2.2): The model supports 64x64, 128x128, and 256x256 outupt resolutions.

  • Face tracking/alignment using MTCNN and Kalman filter in video conversion:

    • MTCNN is introduced for more stable detections and reliable face alignment (FA).
    • Kalman filter smoothen the bounding box positions over frames and eliminate jitter on the swapped face. comp_FA
  • Eyes-aware training: Introduce high reconstruction loss and edge loss in eyes area, which guides the model to generate realistic eyes.

Frequently asked questions and troubleshooting

1. How does it work?

  • The following illustration shows a very high-level and abstract (but not exactly the same) flowchart of the denoising autoencoder algorithm. The objective functions look like this. flow_chart

2. Previews look good, but it does not transform to the output videos?

  • Model performs its full potential when the input images are preprocessed with face alignment methods.
    • readme_note001

Requirements

Acknowledgments

Code borrows from tjwei, eriklindernoren, fchollet, keras-contrib and reddit user deepfakes' project. The generative network is adopted from CycleGAN. Weights and scripts of MTCNN are from FaceNet. Illustrations are from irasutoya.

The source code of the paper "SHGNN: Structure-Aware Heterogeneous Graph Neural Network"

SHGNN: Structure-Aware Heterogeneous Graph Neural Network The source code and dataset of the paper: SHGNN: Structure-Aware Heterogeneous Graph Neural

Wentao Xu 7 Nov 13, 2022
CONditionals for Ordinal Regression and classification in PyTorch

CONDOR pytorch implementation for ordinal regression with deep neural networks. Documentation: https://GarrettJenkinson.github.io/condor_pytorch About

7 Jul 25, 2022
PlenOctree Extraction algorithm

PlenOctrees_NeRF-SH This is an implementation of the Paper PlenOctrees for Real-time Rendering of Neural Radiance Fields. Not only the code provides t

49 Nov 05, 2022
[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning

Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning [CVPR'21, Oral] By Zhicheng Huang*, Zhaoyang Zeng*, Yupan H

Multimedia Research 196 Dec 13, 2022
Applying PVT to Semantic Segmentation

Applying PVT to Semantic Segmentation Here, we take MMSegmentation v0.13.0 as an example, applying PVTv2 to SemanticFPN. For details see Pyramid Visio

35 Nov 30, 2022
an implementation of softmax splatting for differentiable forward warping using PyTorch

softmax-splatting This is a reference implementation of the softmax splatting operator, which has been proposed in Softmax Splatting for Video Frame I

Simon Niklaus 338 Dec 28, 2022
Project to create an open-source 6 DoF input device

6DInputs A Project to create open-source 3D printed 6 DoF input devices Note the plural ('6DInputs' and 'devices') in the headings. We would like seve

RepRap Ltd 47 Jul 28, 2022
Code for "Unsupervised Layered Image Decomposition into Object Prototypes" paper

DTI-Sprites Pytorch implementation of "Unsupervised Layered Image Decomposition into Object Prototypes" paper Check out our paper and webpage for deta

40 Dec 22, 2022
Code to run experiments in SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression.

Code to run experiments in SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression. Not an official Google product. Me

Google Research 27 Dec 12, 2022
Breast cancer is been classified into benign tumour and malignant tumour.

Breast cancer is been classified into benign tumour and malignant tumour. Logistic regression is applied in this model.

1 Feb 04, 2022
Train a deep learning net with OpenStreetMap features and satellite imagery.

DeepOSM Classify roads and features in satellite imagery, by training neural networks with OpenStreetMap (OSM) data. DeepOSM can: Download a chunk of

TrailBehind, Inc. 1.3k Nov 24, 2022
Unofficial TensorFlow implementation of Protein Interface Prediction using Graph Convolutional Networks.

[TensorFlow] Protein Interface Prediction using Graph Convolutional Networks Unofficial TensorFlow implementation of Protein Interface Prediction usin

YeongHyeon Park 9 Oct 25, 2022
Neon-erc20-example - Example of creating SPL token and wrapping it with ERC20 interface in Neon EVM

Example of wrapping SPL token by ERC2-20 interface in Neon Requirements Install

7 Mar 28, 2022
Weakly Supervised Text-to-SQL Parsing through Question Decomposition

Weakly Supervised Text-to-SQL Parsing through Question Decomposition The official repository for the paper "Weakly Supervised Text-to-SQL Parsing thro

14 Dec 19, 2022
A python module for configuration of block devices

Blivet is a python module for system storage configuration. CI status Licence See COPYING Installation From Fedora repositories Blivet is available in

78 Dec 14, 2022
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN in PyTorch PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. * All samples in READM

Taehoon Kim 1k Jan 04, 2023
Voxel Transformer for 3D object detection

Voxel Transformer This is a reproduced repo of Voxel Transformer for 3D object detection. The code is mainly based on OpenPCDet. Introduction We provi

173 Dec 25, 2022
Pytorch implementation of the paper Time-series Generative Adversarial Networks

TimeGAN-pytorch Pytorch implementation of the paper Time-series Generative Adversarial Networks presented at NeurIPS'19. Jinsung Yoon, Daniel Jarrett

Zhiwei ZHANG 21 Nov 24, 2022
RoadMap and preparation material for Machine Learning and Data Science - From beginner to expert.

ML-and-DataScience-preparation This repository has the goal to create a learning and preparation roadMap for Machine Learning Engineers and Data Scien

33 Dec 29, 2022
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

FullSubNet This Git repository for the official PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech E

郝翔 357 Jan 04, 2023