EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation

Last update: Dec 07, 2022

Related tags

Overview

EdiBERT, a generative model for image editing

EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation. The same EdiBERT model, derived from a single training, can be used on a wide variety of tasks.

We follow the implementation of Taming-Transformers (https://github.com/CompVis/taming-transformers). Main modifications can be found in: taming/models/bert_transformer.py ; scripts/sample_mask_likelihood_maximization.py.

Requirements

A suitable conda environment named edibert can be created and activated with:

conda env create -f environment.yaml
conda activate edibert

FFHQ

Download FFHQ dataset (https://github.com/NVlabs/ffhq-dataset) and put it into data/ffhq/.

Training BERT

In the logs/ folder, download and extract the FFHQ VQGAN:

gdown --id '1P_wHLRfdzf1DjsAH_tG10GXk9NKEZqTg'
tar -xvzf 2021-04-23T18-19-01_ffhq_vqgan.tar.gz

Training on 1 GPUs:

python main.py --base configs/ffhq_transformer_bert_2D.yaml -t True --gpus 0,

Training on 2 GPUs:

python main.py --base configs/ffhq_transformer_bert_2D.yaml -t True --gpus 0,1

Running pre-trained BERT on composite/scribble-edited images

In the logs/ folder, download and extract the FFHQ VQGAN:

gdown --id '1P_wHLRfdzf1DjsAH_tG10GXk9NKEZqTg'
tar -xvzf 2021-04-23T18-19-01_ffhq_vqgan.tar.gz

In the logs/ folder, download and extract the FFHQ BERT:

gdown --id '1YGDd8XyycKgBp_whs9v1rkYdYe4Oxfb3'
tar -xvzf 2021-10-14T16-32-28_ffhq_transformer_bert_2D.tar.gz

folders and place them into logs.

Then, launch the following script for composite images:

python scripts/sample_mask_likelihood_maximization.py -r logs/2021-10-14T16-32-28_ffhq_transformer_bert_2D/checkpoints/epoch=000019.ckpt \
--image_folder data/ffhq_collages/ --mask_folder data/ffhq_collages_masks/ --image_list data/ffhq_collages.txt --keep_img \
--dilation_sampling 1 -k 100 -t 1.0 --batch_size 5 --bert --epochs 2  \
--device 0 --random_order \
--mask_collage --collage_frequency 3 --gaussian_smoothing_collage

Then, launch the following script for edits images:

python scripts/sample_mask_likelihood_maximization.py -r logs/2021-10-14T16-32-28_ffhq_transformer_bert_2D/checkpoints/epoch=000019.ckpt \
--image_folder data/ffhq_edits/ --mask_folder data/ffhq_edits_masks/ --image_list data/ffhq_edits.txt --keep_img \
--dilation_sampling 1 -k 100 -t 1.0 --batch_size 5 --bert --epochs 2  \
--device 0 --random_order \
--mask_collage --collage_frequency 3 --gaussian_smoothing_collage

The samples can then be found in logs/my_model/samples/. Here, the --batch_size argument corresponds to the number of EdiBERT generations per image.

Notebooks for playing with completion/denoising with BERT

Notebooks for image denoising and image inpainting can also be found in the main folder.

EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation

Related tags

Overview

EdiBERT, a generative model for image editing

Requirements

FFHQ

Training BERT

Running pre-trained BERT on composite/scribble-edited images

Notebooks for playing with completion/denoising with BERT

Owner

PyTorch implementation of Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction (ICCV 2021).

Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park, Rares Ambrus, Vitor Guizilini, Jie Li, and Adrien Gaidon.

Make your first PR. A beginner friendly repository made specifically for open source beginners. Add any program under any language (it can be anything from a simple program to a complex data structure algorithm). Happy coding...

Attentive Implicit Representation Networks (AIR-Nets)

Code for KHGT model, AAAI2021

The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction".

An LSTM for time-series classification

Code for GNMR in ICDE 2021

Python implementation of "Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation"

Hand Gesture Volume Control is AIML based project which uses image processing to control the volume of your Computer.

use tensorflow 2.0 to tell a dog and cat from a specified picture

An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

Code for the paper: Hierarchical Reinforcement Learning With Timed Subgoals, published at NeurIPS 2021

D²Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos

This repo contains the pytorch implementation for Dynamic Concept Learner (accepted by ICLR 2021).

This git repo contains the implementation of my ML project on Heart Disease Prediction

PyTorch implementations of Top-N recommendation, collaborative filtering recommenders.

An Artificial Intelligence trying to drive a car by itself on a user created map

[NIPS 2021] UOTA: Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration.

Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network."

EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation

Related tags

Overview

EdiBERT, a generative model for image editing

Requirements

FFHQ

Training BERT

Running pre-trained BERT on composite/scribble-edited images

Notebooks for playing with completion/denoising with BERT

Owner

PyTorch implementation of Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction (ICCV 2021).

Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park*, Rares Ambrus*, Vitor Guizilini, Jie Li, and Adrien Gaidon.

Make your first PR. A beginner friendly repository made specifically for open source beginners. Add any program under any language (it can be anything from a simple program to a complex data structure algorithm). Happy coding...

Attentive Implicit Representation Networks (AIR-Nets)

Code for KHGT model, AAAI2021

The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction".

An LSTM for time-series classification

Code for GNMR in ICDE 2021

Python implementation of "Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation"

Hand Gesture Volume Control is AIML based project which uses image processing to control the volume of your Computer.

use tensorflow 2.0 to tell a dog and cat from a specified picture

An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

Code for the paper: Hierarchical Reinforcement Learning With Timed Subgoals, published at NeurIPS 2021

D²Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos

This repo contains the pytorch implementation for Dynamic Concept Learner (accepted by ICLR 2021).

This git repo contains the implementation of my ML project on Heart Disease Prediction

PyTorch implementations of Top-N recommendation, collaborative filtering recommenders.

An Artificial Intelligence trying to drive a car by itself on a user created map

[NIPS 2021] UOTA: Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration.

Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network."

Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park, Rares Ambrus, Vitor Guizilini, Jie Li, and Adrien Gaidon.