Photo2cartoon - 人像卡通化探索项目 (photo-to-cartoon translation project)

Overview

人像卡通化 (Photo to Cartoon)

中文版 | English Version

该项目为小视科技卡通肖像探索项目。您可使用微信扫描下方二维码或搜索“AI卡通秀”小程序体验卡通化效果。

也可以前往我们的ai开放平台进行在线体验:https://ai.minivision.cn/#/coreability/cartoon

技术交流QQ群:937627932

Updates

简介

人像卡通风格渲染的目标是,在保持原图像ID信息和纹理细节的同时,将真实照片转换为卡通风格的非真实感图像。我们的思路是,从大量照片/卡通数据中习得照片到卡通画的映射。一般而言,基于成对数据的pix2pix方法能达到较好的图像转换效果,但本任务的输入输出轮廓并非一一对应,例如卡通风格的眼睛更大、下巴更瘦;且成对的数据绘制难度大、成本较高,因此我们采用unpaired image translation方法来实现。

Unpaired image translation流派最经典方法是CycleGAN,但原始CycleGAN的生成结果往往存在较为明显的伪影且不稳定。近期的论文U-GAT-IT提出了一种归一化方法——AdaLIN,能够自动调节Instance Norm和Layer Norm的比重,再结合attention机制能够实现精美的人像日漫风格转换。

与夸张的日漫风不同,我们的卡通风格更偏写实,要求既有卡通画的简洁Q萌,又有明确的身份信息。为此我们增加了Face ID Loss,使用预训练的人脸识别模型提取照片和卡通画的ID特征,通过余弦距离来约束生成的卡通画。

此外,我们提出了一种Soft-AdaLIN(Soft Adaptive Layer-Instance Normalization)归一化方法,在反规范化时将编码器的均值方差(照片特征)与解码器的均值方差(卡通特征)相融合。

模型结构方面,在U-GAT-IT的基础上,我们在编码器之前和解码器之后各增加了2个hourglass模块,渐进地提升模型特征抽象和重建能力。

由于实验数据较为匮乏,为了降低训练难度,我们将数据处理成固定的模式。首先检测图像中的人脸及关键点,根据人脸关键点旋转校正图像,并按统一标准裁剪,再将裁剪后的头像输入人像分割模型去除背景。

Start

安装依赖库

项目所需的主要依赖库如下:

  • python 3.6
  • pytorch 1.4
  • tensorflow-gpu 1.14
  • face-alignment
  • dlib
  • onnxruntime

Clone:

git clone https://github.com/minivision-ai/photo2cartoon.git
cd ./photo2cartoon

下载资源

谷歌网盘 | 百度网盘 提取码:y2ch

  1. 人像卡通化预训练模型:photo2cartoon_weights.pt(20200504更新),存放在models路径下。
  2. 头像分割模型:seg_model_384.pb,存放在utils路径下。
  3. 人脸识别预训练模型:model_mobilefacenet.pth,存放在models路径下。(From: InsightFace_Pytorch
  4. 卡通画开源数据:cartoon_data,包含trainBtestB
  5. 人像卡通化onnx模型:photo2cartoon_weights.onnx 谷歌网盘,存放在models路径下。

测试

将一张测试照片(亚洲年轻女性)转换为卡通风格:

python test.py --photo_path ./images/photo_test.jpg --save_path ./images/cartoon_result.png

测试onnx模型

python test_onnx.py --photo_path ./images/photo_test.jpg --save_path ./images/cartoon_result.png

训练

1.数据准备

训练数据包括真实照片和卡通画像,为降低训练复杂度,我们对两类数据进行了如下预处理:

  • 检测人脸及关键点。
  • 根据关键点旋转校正人脸。
  • 将关键点边界框按固定的比例扩张并裁剪出人脸区域。
  • 使用人像分割模型将背景置白。

我们开源了204张处理后的卡通画数据,您还需准备约1000张人像照片(为匹配卡通数据,尽量使用亚洲年轻女性照片,人脸大小最好超过200x200像素),使用以下命令进行预处理:

python data_process.py --data_path YourPhotoFolderPath --save_path YourSaveFolderPath

将处理后的数据按照以下层级存放,trainAtestA中存放照片头像数据,trainBtestB中存放卡通头像数据。

├── dataset
    └── photo2cartoon
        ├── trainA
            ├── xxx.jpg
            ├── yyy.png
            └── ...
        ├── trainB
            ├── zzz.jpg
            ├── www.png
            └── ...
        ├── testA
            ├── aaa.jpg 
            ├── bbb.png
            └── ...
        └── testB
            ├── ccc.jpg 
            ├── ddd.png
            └── ...

2.训练

重新训练:

python train.py --dataset photo2cartoon

加载预训练参数:

python train.py --dataset photo2cartoon --pretrained_weights models/photo2cartoon_weights.pt

多GPU训练(仍建议使用batch_size=1,单卡训练):

python train.py --dataset photo2cartoon --batch_size 4 --gpu_ids 0 1 2 3

Q&A

Q:为什么开源的卡通化模型与小程序中的效果有差异?

A:开源模型的训练数据收集自互联网,为了得到更加精美的效果,我们在训练小程序中卡通化模型时,采用了定制的卡通画数据(200多张),且增大了输入分辨率。此外,小程序中的人脸特征提取器采用自研的识别模型,效果优于本项目使用的开源识别模型。

Q:如何选取效果最好的模型?

A:首先训练模型200k iterations,然后使用FID指标挑选出最优模型,最终挑选出的模型为迭代90k iterations时的模型。

Q:关于人脸特征提取模型。

A:实验中我们发现,使用自研的识别模型计算Face ID Loss训练效果远好于使用开源识别模型,若训练效果出现鲁棒性问题,可尝试将Face ID Loss权重置零。

Q:人像分割模型是否能用与分割半身像?

A:不能。该模型是针对本项目训练的专用模型,需先裁剪出人脸区域再输入。

Tips

我们开源的模型是基于亚洲年轻女性训练的,对于其他人群覆盖不足,您可根据使用场景自行收集相应人群的数据进行训练。我们的开放平台提供了能够覆盖各类人群的卡通化服务,您可前往体验。如有定制卡通风格需求请联系商务:18852075216。

参考

U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation [Paper][Code]

InsightFace_Pytorch

Owner
Minivision_AI
Minivision_AI
Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)"

Official PyTorch implementation of the paper "Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory (SB-FBSDE)" which introduces a new class of deep generative models that gene

Guan-Horng Liu 43 Jan 03, 2023
Ludwig is a toolbox that allows to train and evaluate deep learning models without the need to write code.

Translated in 🇰🇷 Korean/ Ludwig is a toolbox that allows users to train and test deep learning models without the need to write code. It is built on

Ludwig 8.7k Jan 05, 2023
[ICCV '21] In this repository you find the code to our paper Keypoint Communities

Keypoint Communities In this repository you will find the code to our ICCV '21 paper: Keypoint Communities Duncan Zauss, Sven Kreiss, Alexandre Alahi,

Duncan Zauss 262 Dec 13, 2022
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning This is the official repository for Conservative and Adaptive Penalty fo

7 Nov 22, 2022
Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy" (ICLR 2022 Spotlight)

About Code release for Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy (ICLR 2022 Spotlight)

THUML @ Tsinghua University 221 Dec 31, 2022
Acute ischemic stroke dataset

AISD Acute ischemic stroke dataset contains 397 Non-Contrast-enhanced CT (NCCT) scans of acute ischemic stroke with the interval from symptom onset to

Kongming Liang 21 Sep 06, 2022
Train Scene Graph Generation for Visual Genome and GQA in PyTorch >= 1.2 with improved zero and few-shot generalization.

Scene Graph Generation Object Detections Ground truth Scene Graph Generated Scene Graph In this visualization, woman sitting on rock is a zero-shot tr

Boris Knyazev 93 Dec 28, 2022
CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning

CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning This repository contains the code and relevant instructions

XiaoMing 5 Aug 19, 2022
ReAct: Out-of-distribution Detection With Rectified Activations

ReAct: Out-of-distribution Detection With Rectified Activations This is the source code for paper ReAct: Out-of-distribution Detection With Rectified

38 Dec 05, 2022
Code for Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks Under construction. Description Code for Phase diagram of S

Rodrigo Veiga 3 Nov 24, 2022
The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track.

ISC21-Descriptor-Track-1st The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track. You can check our solution

lyakaap 73 Dec 24, 2022
A tf.keras implementation of Facebook AI's MadGrad optimization algorithm

MADGRAD Optimization Algorithm For Tensorflow This package implements the MadGrad Algorithm proposed in Adaptivity without Compromise: A Momentumized,

20 Aug 18, 2022
Turn based roguelike in python

pyTB Turn based roguelike in python Documentation can be found here: http://mcgillij.github.io/pyTB/index.html Screenshot Dependencies Written in Pyth

Jason McGillivray 4 Sep 29, 2022
Code for the Shortformer model, from the paper by Ofir Press, Noah A. Smith and Mike Lewis.

Shortformer This repository contains the code and the final checkpoint of the Shortformer model. This file explains how to run our experiments on the

Ofir Press 138 Apr 15, 2022
A pytorch implementation of Detectron. Both training from scratch and inferring directly from pretrained Detectron weights are available.

Use this instead: https://github.com/facebookresearch/maskrcnn-benchmark A Pytorch Implementation of Detectron Example output of e2e_mask_rcnn-R-101-F

Roy 2.8k Dec 29, 2022
Code Release for Learning to Adapt to Evolving Domains

EAML Code release for "Learning to Adapt to Evolving Domains" (NeurIPS 2020) Prerequisites PyTorch = 0.4.0 (with suitable CUDA and CuDNN version) tor

23 Dec 07, 2022
Fast, flexible and fun neural networks.

Brainstorm Discontinuation Notice Brainstorm is no longer being maintained, so we recommend using one of the many other,available frameworks, such as

IDSIA 1.3k Nov 21, 2022
Deeply Supervised, Layer-wise Prediction-aware (DSLP) Transformer for Non-autoregressive Neural Machine Translation

Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision Training Efficiency We show the training efficiency of our DSLP model b

Chenyang Huang 36 Oct 31, 2022
My take on a practical implementation of Linformer for Pytorch.

Linformer Pytorch Implementation A practical implementation of the Linformer paper. This is attention with only linear complexity in n, allowing for v

Peter 349 Dec 25, 2022
TorchXRayVision: A library of chest X-ray datasets and models.

torchxrayvision A library for chest X-ray datasets and models. Including pre-trained models. ( 🎬 promo video about the project) Motivation: While the

Machine Learning and Medicine Lab 575 Jan 08, 2023