EdiBERT, a generative model for image editing

Last update: Dec 07, 2022

Related tags

Overview

EdiBERT, a generative model for image editing

EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation. The same EdiBERT model, derived from a single training, can be used on a wide variety of tasks.

We follow the implementation of Taming-Transformers (https://github.com/CompVis/taming-transformers). Main modifications can be found in: taming/models/bert_transformer.py ; scripts/sample_mask_likelihood_maximization.py.

Requirements

A suitable conda environment named edibert can be created and activated with:

conda env create -f environment.yaml
conda activate edibert

FFHQ

Download FFHQ dataset (https://github.com/NVlabs/ffhq-dataset) and put it into data/ffhq/.

Training BERT

In the logs/ folder, download and extract the FFHQ VQGAN:

gdown --id '1P_wHLRfdzf1DjsAH_tG10GXk9NKEZqTg'
tar -xvzf 2021-04-23T18-19-01_ffhq_vqgan.tar.gz

Training on 1 GPUs:

python main.py --base configs/ffhq_transformer_bert_2D.yaml -t True --gpus 0,

Training on 2 GPUs:

python main.py --base configs/ffhq_transformer_bert_2D.yaml -t True --gpus 0,1

Running pre-trained BERT on composite/scribble-edited images

In the logs/ folder, download and extract the FFHQ VQGAN:

gdown --id '1P_wHLRfdzf1DjsAH_tG10GXk9NKEZqTg'
tar -xvzf 2021-04-23T18-19-01_ffhq_vqgan.tar.gz

In the logs/ folder, download and extract the FFHQ BERT:

gdown --id '1YGDd8XyycKgBp_whs9v1rkYdYe4Oxfb3'
tar -xvzf 2021-10-14T16-32-28_ffhq_transformer_bert_2D.tar.gz

folders and place them into logs.

Then, launch the following script for composite images:

python scripts/sample_mask_likelihood_maximization.py -r logs/2021-10-14T16-32-28_ffhq_transformer_bert_2D/checkpoints/epoch=000019.ckpt \
--image_folder data/ffhq_collages/ --mask_folder data/ffhq_collages_masks/ --image_list data/ffhq_collages.txt --keep_img \
--dilation_sampling 1 -k 100 -t 1.0 --batch_size 5 --bert --epochs 2  \
--device 0 --random_order \
--mask_collage --collage_frequency 3 --gaussian_smoothing_collage

Then, launch the following script for edits images:

python scripts/sample_mask_likelihood_maximization.py -r logs/2021-10-14T16-32-28_ffhq_transformer_bert_2D/checkpoints/epoch=000019.ckpt \
--image_folder data/ffhq_edits/ --mask_folder data/ffhq_edits_masks/ --image_list data/ffhq_edits.txt --keep_img \
--dilation_sampling 1 -k 100 -t 1.0 --batch_size 5 --bert --epochs 2  \
--device 0 --random_order \
--mask_collage --collage_frequency 3 --gaussian_smoothing_collage

The samples can then be found in logs/my_model/samples/. Here, the --batch_size argument corresponds to the number of EdiBERT generations per image.

Notebooks for playing with completion/denoising with BERT

Notebooks for image denoising and image inpainting can also be found in the main folder.

EdiBERT, a generative model for image editing

Related tags

Overview

EdiBERT, a generative model for image editing

Requirements

FFHQ

Training BERT

Running pre-trained BERT on composite/scribble-edited images

Notebooks for playing with completion/denoising with BERT

Owner

DeepProbLog is an extension of ProbLog that integrates Probabilistic Logic Programming with deep learning by introducing the neural predicate.

3.8% and 18.3% on CIFAR-10 and CIFAR-100

Lua-parser-lark - An out-of-box Lua parser written in Lark

Asymmetric Bilateral Motion Estimation for Video Frame Interpolation, ICCV2021

DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes with Biharmonic Coordinates

Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes, ICCV 2017

SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images

Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective

Learning to Draw: Emergent Communication through Sketching

A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

Offical code for the paper: "Growing 3D Artefacts and Functional Machines with Neural Cellular Automata" https://arxiv.org/abs/2103.08737

Differentiable rasterization applied to 3D model simplification tasks

My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs (GNN, GAT, GraphSAGE, GCN)

NLP made easy

Super-BPD: Super Boundary-to-Pixel Direction for Fast Image Segmentation (CVPR 2020)

Official PyTorch implementation of the ICRA 2021 paper: Adversarial Differentiable Data Augmentation for Autonomous Systems.

SiamMOT is a region-based Siamese Multi-Object Tracking network that detects and associates object instances simultaneously.

Keras + Hyperopt: A very simple wrapper for convenient hyperparameter optimization

[CVPR 2020] GAN Compression: Efficient Architectures for Interactive Conditional GANs

This repository holds code and data for our PETS'22 article 'From "Onion Not Found" to Guard Discovery'.