Adversarial Framework for (non-) Parametric Image Stylisation Mosaics

Overview

Fully Adversarial Mosaics (FAMOS)

PyTorch implementation of the paper "Copy the Old or Paint Anew? An Adversarial Framework for (non-) Parametric Image Stylization", available at http://arxiv.org/abs/1811.09236.

This code generates stylized images using an adversarial approach that combines parametric and non-parametric elements. It was tested on Ubuntu 16.04 with PyTorch 0.4 and Python 3.6, on an Nvidia P100 GPU. A GPU with 12 GB or 16 GB of VRAM, or more, is recommended.

Parameters

Our method has many possible settings, which you specify with command-line parameters. The options parser that defines and parses these parameters is in the config.py file. You are free to explore them and discover the functionality of FAMOS, which covers a very broad range of image stylization settings.

There are 5 groups of parameter types:

  • data path and loading parameters
  • neural network parameters
  • regularization and loss criteria weighting parameters
  • optimization parameters
  • parameters of the stochastic noise -- see PSGAN
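
As a rough illustration of how such grouped options can be declared, here is a minimal argparse sketch. It is not the repository's config.py: the flag names are taken from the example calls in this README, while the assignment of flags to groups and all default values are assumptions made for the example.

```python
import argparse

def build_parser():
    """Hypothetical options parser; NOT the actual config.py of this repository."""
    p = argparse.ArgumentParser(description="Illustrative FAMOS-style options")

    # Grouping and defaults below are assumptions for illustration only.
    data = p.add_argument_group("data path and loading")
    data.add_argument("--texturePath", default="samples/milano/")
    data.add_argument("--contentPath", default="samples/archimboldo/")
    data.add_argument("--imageSize", type=int, default=160)
    data.add_argument("--batchSize", type=int, default=8)

    net = p.add_argument_group("neural network")
    net.add_argument("--ngf", type=int, default=80)    # generator channels
    net.add_argument("--ndf", type=int, default=80)    # discriminator channels
    net.add_argument("--nDep", type=int, default=4)    # generator depth
    net.add_argument("--nDepD", type=int, default=5)   # discriminator depth

    loss = p.add_argument_group("regularization and loss weighting")
    loss.add_argument("--fContent", type=float, default=0.6)
    loss.add_argument("--cLoss", type=int, default=101)

    opt = p.add_argument_group("optimization")
    opt.add_argument("--dIter", type=int, default=1)

    noise = p.add_argument_group("stochastic noise")
    noise.add_argument("--zLoc", type=int, default=50)
    return p

# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
opts = build_parser().parse_args(["--imageSize=192", "--fContent=.6"])
print(opts.imageSize, opts.fContent)  # 192 0.6
```

Check config.py for the real option names, groups, and defaults before relying on any of the values above.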

Update Feb. 2019: video frame-by-frame rendering supported

mosaicGAN.py can now render a whole folder of test images with the trained model. Example videos: lion video with Munich and Berlin

Just specify

python mosaicGAN.py --texturePath=samples/milano/ --contentPath=myFolder/ --testImage=myFolder/ 

with your own myFolder, and all images in that folder will be rendered by the generator of the GAN. It is best to use the same folder for testing as for the training content. To use FAMOS in a video editing pipeline, save all video frames as images with a tool like AVIDEMUX, train FAMOS and save the rendered frames, then assemble them into a video again. Note: rendering thousands of images may take some time. You can edit VIDEO_SAVE_FREQ in the code to render the test image folder less frequently.
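
The folder-based call can be scripted when preparing many clips. The sketch below only assembles the command shown above and lists the frames in name order; the helper names `list_frames` and `render_command` and the set of image suffixes are hypothetical, not part of this repository.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed frame formats

def list_frames(folder):
    """Return the image files in `folder`, sorted by file name."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.suffix.lower() in IMAGE_SUFFIXES)

def render_command(texture_path, frames_folder):
    """Build the mosaicGAN.py call, using one folder for both content and test images."""
    folder = str(frames_folder).rstrip("/") + "/"
    return ["python", "mosaicGAN.py",
            f"--texturePath={texture_path}",
            f"--contentPath={folder}",
            f"--testImage={folder}"]

cmd = render_command("samples/milano/", "myFolder")
print(" ".join(cmd))
```

Sorting by name only preserves frame order if the exported frame names are zero-padded (for example frame_0001.png), so it is worth checking how your export tool names the files.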

Update Jan. 2019: new functionality for texture synthesis

Due to interest in a new PyTorch implementation of our last paper, "Texture Synthesis with Spatial Generative Adversarial Networks" (PSGAN), we added a script reimplementing it in the current repository. It shares many components with the texture mosaic stylization approach. The main difference is that PSGAN has no content image and no content loss; the generator is conditioned only on noise. Example call for texture synthesis:

python PSGAN.py --texturePath=samples/milano/ --ngf=120 --zLoc=50 --ndf=120 --nDep=5 --nDepD=5 --batchSize=16

In general, texture synthesis is much faster than the other methods in this repository, so feel free to add more channels and increase the batch size. For more details and inspiration on how to play with texture synthesis, see our old repository with Lasagne code for PSGAN.

Usage: parametric convolutional adversarial mosaic

We provide scripts that have a main loop in which we (i) train an adversarial stylization model and (ii) save images (inference mode). If you need it, you can easily modify the code to save a trained model and load it later to do inference on many other images, see comments at the end of mosaicGAN.py.

In the simplest case, let us start an adversarial mosaic using convolutional networks. All you need is to specify the texture and content folders:

python mosaicGAN.py --texturePath=samples/milano/ --contentPath=samples/archimboldo/

This repository includes sample style files (4 satellite views of Milano, from Google Maps) and a portrait of Archimboldo (from the Google Art Project). Our GAN method will start running and training, occasionally saving results in "results/milano/archimboldo/" and printing the loss values to the terminal. Note that we use the first image found in contentPath as the default full-size output image stylization from FAMOS. You can also specify another image file name testImage to do out-of-sample stylization (inference).

This version uses the DCGAN loss by default, which works well for the convolutional GAN we have here. Add the parameter LS for a least squares loss, which also works well. Interestingly, WGAN-GP performs worse for our model, which we did not investigate in detail.
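
For readers unfamiliar with the two criteria, the sketch below writes out the textbook discriminator losses for a single real/fake pair of scalar outputs. It is a generic illustration of the standard DCGAN (binary cross-entropy) and least squares formulations, not the loss code of this repository, and the function names are hypothetical.

```python
import math

def dcgan_d_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy discriminator loss.

    d_real and d_fake are discriminator outputs as probabilities in (0, 1);
    eps guards against log(0).
    """
    return -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))

def lsgan_d_loss(d_real, d_fake):
    """Least squares discriminator loss with target 1 for real and 0 for fake.

    Here the discriminator outputs are unbounded scores, not probabilities.
    """
    return 0.5 * ((d_real - 1.0) ** 2 + d_fake ** 2)

# A confident, correct discriminator has a lower loss under both criteria.
print(dcgan_d_loss(0.9, 0.1) < dcgan_d_loss(0.6, 0.4))
print(lsgan_d_loss(0.9, 0.1) < lsgan_d_loss(0.6, 0.4))
```

The least squares loss stays bounded in its gradient when the discriminator is very confident, which is one commonly cited reason it can train stably.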

If you want to tune the optimisation and model, you can adjust the layers and channels of the Generator and Discriminator, and also choose imageSize and batchSize. All of these affect the speed and performance of the model. You can also tweak the correspondence map cLoss and the content loss weighting fContent:

python mosaicGAN.py --texturePath=samples/milano/ --contentPath=samples/archimboldo/ --imageSize=192 --batchSize=8 --ngf=80 --ndf=80  --nDepD=5  --nDep=4 --cLoss=101 --fContent=.6

Other interesting options are skipConnections and Ubottleneck. By disabling the skip connections of the Unet and defining a smaller bottleneck, we can reduce the effect of the content image and put more emphasis on the texture style of the output.

Usage: the full FAMOS approach with parametric and non-parametric aspects

The full method can copy pixels from template images, in combination with the convolutional generation of the previous section.

python mosaicFAMOS.py  --texturePath=samples/milano/ --contentPath=samples/archimboldo/ --N=80 --mirror=True --dIter=2 --WGAN=True

Here we specify N=80 memory templates to copy from. In addition, we use mirror augmentation to get nice kaleidoscope-like effects in the template (and texture distribution). We use the WGAN criterion, which works better for the combined parametric/non-parametric case (we advise experimenting with DCGAN and WGAN depending on the architecture). With dIter=2, we run two D steps for each G step.
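
The effect of dIter on the update order can be pictured with a small schedule generator. This is a sketch only: it assumes the D steps run before each G step, which is the usual arrangement for WGAN-style training but is not confirmed here for mosaicFAMOS.py, and `update_schedule` is a hypothetical name.

```python
def update_schedule(n_g_steps, d_iter=2):
    """Yield 'D' d_iter times before each 'G', for n_g_steps generator updates."""
    for _ in range(n_g_steps):
        for _ in range(d_iter):
            yield "D"  # one discriminator (critic) update
        yield "G"      # one generator update

print("".join(update_schedule(3, d_iter=2)))  # DDGDDGDDG
```

Raising dIter gives the discriminator more updates per generator update, at the cost of a proportionally longer training iteration.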

The code also supports a slightly more complicated implementation than the one described in the paper. By setting multiScale=True a mixed template of images I_M on multiple levels of the Unet is used. In addition, by setting nBlocks=2 we will add residual layers to the decoder of the Unet, for a model with even higher capacity. Finally, you can also set refine=True and add a second Unet to refine the results of the first one. Of course, all these additional layers come at a computational cost -- selecting the layer depth, channel width, and the use of all these additional modules is a matter of trade-off and experimenting.

python mosaicFAMOS.py  --texturePath=samples/milano/ --contentPath=samples/archimboldo/ --N=80 --mirror=True --multiScale=True --nBlocks=1 --dIter=2 --WGAN=True

The method will save mosaics occasionally, and optionally you can specify a testImage (smaller than the initial content image) to check out-of-sample performance. To see how the patch-based training is proceeding, check the patches image, which is saved regularly. The file has one column per batch instance and 6 rows showing the quantities from the paper:

  • I_C content patch
  • I_M mixed template patch on highest scale
  • I_G parametric generation component
  • I blended patch
  • \alpha blending mask
  • A mixing matrix
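
To make the relation between these six quantities concrete, here is a per-pixel sketch of one plausible reading: A mixes the N templates into I_M, and \alpha blends I_M with the generated I_G to give I. The convention that \alpha weights the template (rather than I_G) and that the mixing weights sum to one are assumptions of this example; the paper gives the exact definitions. The function names are hypothetical.

```python
def mix_templates(a_weights, templates):
    """I_M for one pixel: weighted sum of the N template values.

    Assumes the weights in a_weights are non-negative and sum to 1.
    """
    return sum(a * m for a, m in zip(a_weights, templates))

def blend(alpha, i_m, i_g):
    """I for one pixel: alpha selects the copied template value,
    (1 - alpha) the parametrically generated value (assumed convention)."""
    return alpha * i_m + (1.0 - alpha) * i_g

i_m = mix_templates([0.5, 0.5], [0.0, 1.0])  # equal mix of two templates
print(blend(1.0, i_m, 0.9))  # pure copy: equals i_m
print(blend(0.0, i_m, 0.9))  # pure generation: equals i_g
```

Under this reading, a mask close to 1 in the \alpha row marks regions copied from the templates, and a mask close to 0 marks regions painted by the convolutional generator.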

License

Please make sure to cite/acknowledge our paper if you use any of the contained code in your own projects or publications.

The MIT License (MIT)

Copyright © 2018 Zalando SE, https://tech.zalando.com

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Owner
Zalando Research
Repositories of the research branch of Zalando SE