Official implementation of the paper Label-Efficient Semantic Segmentation with Diffusion Models

Last update: Jan 06, 2023

Label-Efficient Semantic Segmentation with Diffusion Models

This code is based on datasetGAN and guided-diffusion.

Note: pass --recurse-submodules when cloning the repository.

Overview

The paper investigates the representations learned by state-of-the-art DDPMs and shows that they capture high-level semantic information valuable for downstream vision tasks. We design a simple segmentation approach that exploits these representations and outperforms the alternatives on few-shot semantic segmentation.

Dependencies

Python >= 3.7
Packages: see requirements.txt

Datasets

The evaluation is performed on 6 collected datasets with a few annotated images in the training set: Bedroom-28, FFHQ-34, Cat-15, Horse-21, CelebA-19 and ADE-Bedroom-30. The number in each name is the number of semantic classes.

datasets.tar.gz (~47 MB)

DDPM

Pretrained DDPMs

The models trained on LSUN are adopted from guided-diffusion. We trained the FFHQ-256 model ourselves, using the same model parameters as for the LSUN models.

LSUN-Bedroom: lsun_bedroom.pt
FFHQ-256: ffhq.pt
LSUN-Cat: lsun_cat.pt
LSUN-Horse: lsun_horse.pt

Run

Download the datasets:
bash datasets/download_datasets.sh
Download the DDPM checkpoint:
bash checkpoints/ddpm/download_checkpoint.sh
Check paths in experiments/ /ddpm.json
Run: bash scripts/ddpm/train_interpreter.sh
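
The "check paths" step can be partly automated. The following is a minimal sketch, not part of the repository: it makes no assumption about the key names in the config and simply treats every string value containing a path separator as a path (a heuristic), then reports the ones that do not exist on disk. The function names and the config_file argument are hypothetical.

```python
import json
import os


def find_missing_paths(config, _prefix=""):
    """Collect (key, value) pairs whose string value looks like a path but is missing on disk."""
    missing = []
    if isinstance(config, dict):
        items = config.items()
    elif isinstance(config, list):
        items = enumerate(config)
    else:
        items = []
    for key, value in items:
        name = f"{_prefix}{key}"
        if isinstance(value, (dict, list)):
            # Descend into nested objects and arrays, extending the dotted key name.
            missing.extend(find_missing_paths(value, name + "."))
        elif isinstance(value, str) and ("/" in value or os.sep in value):
            # Heuristic: a string with a separator is assumed to be a path.
            if not os.path.exists(value):
                missing.append((name, value))
    return missing


def check_config(config_file):
    """Load a JSON config (path supplied by the caller) and list its missing paths."""
    with open(config_file) as f:
        return find_missing_paths(json.load(f))
```

An empty list from check_config means every path-like value in the file resolved. Relative paths are resolved against the current working directory, so run the check from the directory the scripts are launched from.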

Available checkpoint names: lsun_bedroom, ffhq, lsun_cat, lsun_horse
Available dataset names: bedroom_28, ffhq_34, cat_15, horse_21, celeba_19, ade_bedroom_30
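
Each dataset name above ends in its number of semantic classes, so the class count can be recovered from the name. A small illustrative helper (the function name is hypothetical and not part of the repository):

```python
def num_classes(dataset_name):
    """Return the class count encoded as the trailing number of a dataset name."""
    prefix, _, suffix = dataset_name.rpartition("_")
    if not prefix or not suffix.isdigit():
        raise ValueError(f"no class count in dataset name: {dataset_name!r}")
    return int(suffix)


# Dataset names as listed in this README.
DATASET_NAMES = [
    "bedroom_28", "ffhq_34", "cat_15",
    "horse_21", "celeba_19", "ade_bedroom_30",
]
CLASS_COUNTS = {name: num_classes(name) for name in DATASET_NAMES}
```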

How to improve the performance

Set input_activations=true in experiments/ /ddpm.json.
In this case, the feature dimension is 18432.
Tune which diffusion steps and UNet blocks to use for the particular task.
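
If you prefer to flip the flag from a script rather than by hand, a sketch along these lines may help. It assumes that input_activations is a top-level key of the JSON config, which this README does not state explicitly; the function names are hypothetical.

```python
import json


def enable_input_activations(config_text):
    """Return the JSON config text with input_activations switched on."""
    config = json.loads(config_text)
    # Assumption: the flag is a top-level key of the config object.
    config["input_activations"] = True
    return json.dumps(config, indent=4)


def enable_in_file(config_file):
    """Rewrite a config file in place (config_file is supplied by the caller)."""
    with open(config_file) as f:
        updated = enable_input_activations(f.read())
    with open(config_file, "w") as f:
        f.write(updated + "\n")
```

Python's True is serialized as JSON true, so the resulting file matches the setting described above.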

DatasetDDPM

Synthetic datasets

To download DDPM-produced synthetic datasets (50000 samples, ~7 GB):
bash synthetic-datasets/ddpm/download_synthetic_dataset.sh

Run | Option #1

Download the synthetic dataset:
bash synthetic-datasets/ddpm/download_synthetic_dataset.sh
Check paths in experiments/ /datasetDDPM.json
Run: bash scripts/datasetDDPM/train_deeplab.sh

Run | Option #2

Download the datasets:
bash datasets/download_datasets.sh
Download the DDPM checkpoint:
bash checkpoints/ddpm/download_checkpoint.sh
Check paths in experiments/ /datasetDDPM.json
Train an interpreter on a few DDPM-produced annotated samples:
bash scripts/datasetDDPM/train_interpreter.sh
Generate a synthetic dataset:
bash scripts/datasetDDPM/generate_dataset.sh
Please specify the hyperparameters in this script for the available resources.
On 8xA100 80GB, it takes about 12 hours to generate 10000 samples.
Run: bash scripts/datasetDDPM/train_deeplab.sh
One needs to specify the path to the generated data. See comments in the script.
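
To budget the generation step, the reference figure above (about 12 hours for 10000 samples on 8 A100 GPUs) can be extrapolated. This is a rough sketch that assumes generation time scales linearly with the number of samples and inversely with the number of GPUs of the same type; real throughput depends on the hyperparameters set in the script.

```python
# Reference figures quoted in this README.
REFERENCE_GPUS = 8
REFERENCE_HOURS = 12.0
REFERENCE_SAMPLES = 10_000


def estimate_hours(num_samples, num_gpus=REFERENCE_GPUS):
    """Estimate wall-clock hours under the linear-scaling assumption."""
    gpu_hours_per_sample = REFERENCE_GPUS * REFERENCE_HOURS / REFERENCE_SAMPLES
    return num_samples * gpu_hours_per_sample / num_gpus


# A 50000-sample dataset on the same 8 GPUs: roughly 60 hours.
hours_for_full_dataset = estimate_hours(50_000)
```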

Available checkpoint names: lsun_bedroom, ffhq, lsun_cat, lsun_horse
Available dataset names: bedroom_28, ffhq_34, cat_15, horse_21

SwAV

Pretrained SwAVs

We pretrain SwAV models using the official implementation on the LSUN and FFHQ-256 datasets:

LSUN-Bedroom: lsun_bedroom.pth
FFHQ-256: ffhq.pth
LSUN-Cat: lsun_cat.pth
LSUN-Horse: lsun_horse.pth

Training setup:

Dataset	epochs	batch-size	multi-crop	num-prototypes
LSUN	200	1792	2x256 + 6x108	1000
FFHQ-256	400	2048	2x224 + 6x96	200
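
The multi-crop column uses the "count x resolution" notation, for example 2x256 + 6x108 means two crops at resolution 256 and six at resolution 108 per image. The helper below is an illustrative sketch, not code from the SwAV implementation, that parses this notation and counts the crops processed per batch, assuming batch-size counts source images.

```python
def parse_multi_crop(spec):
    """Parse a spec such as '2x256 + 6x108' into [(count, size), ...]."""
    crops = []
    for part in spec.split("+"):
        count, size = part.strip().split("x")
        crops.append((int(count), int(size)))
    return crops


def crops_per_batch(spec, batch_size):
    """Total number of crops in one batch, assuming batch_size counts images."""
    return batch_size * sum(count for count, _ in parse_multi_crop(spec))


lsun_crops = crops_per_batch("2x256 + 6x108", 1792)
ffhq_crops = crops_per_batch("2x224 + 6x96", 2048)
```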

Run

Download the datasets:
bash datasets/download_datasets.sh
Download the SwAV checkpoint:
bash checkpoints/swav/download_checkpoint.sh
Check paths in experiments/ /swav.json
Run: bash scripts/swav/train_interpreter.sh

Available checkpoint names: lsun_bedroom, ffhq, lsun_cat, lsun_horse
Available dataset names: bedroom_28, ffhq_34, cat_15, horse_21, celeba_19, ade_bedroom_30

DatasetGAN

In contrast to the official implementation, more recent StyleGAN2(-ADA) models are used.

Synthetic datasets

To download GAN-produced synthetic datasets (50000 samples):

bash synthetic-datasets/gan/download_synthetic_dataset.sh

Run

Since we almost fully adopt the official implementation, we don't provide our reimplementation here. However, one can still reproduce our results:

Download the synthetic dataset:
bash synthetic-datasets/gan/download_synthetic_dataset.sh
Change paths in experiments/ /datasetDDPM.json
Change paths and run: bash scripts/datasetDDPM/train_deeplab.sh

Available dataset names: bedroom_28, ffhq_34, cat_15, horse_21

Results

Performance in terms of mean IoU:

Method	Bedroom-28	FFHQ-34	Cat-15	Horse-21	CelebA-19	ADE-Bedroom-30
ALAE	20.0 ± 1.0	48.1 ± 1.3	--	--	49.7 ± 0.7	15.0 ± 0.5
VDVAE	--	57.3 ± 1.1	--	--	54.1 ± 1.0	--
GAN Inversion	13.9 ± 0.6	51.7 ± 0.8	21.4 ± 1.7	17.7 ± 0.4	51.5 ± 2.3	11.1 ± 0.2
GAN Encoder	22.4 ± 1.6	53.9 ± 1.3	32.0 ± 1.8	26.7 ± 0.7	53.9 ± 0.8	15.7 ± 0.3
SwAV	41.0 ± 2.3	54.7 ± 1.4	44.1 ± 2.1	51.7 ± 0.5	53.2 ± 1.0	30.3 ± 1.5
DatasetGAN	31.3 ± 2.7	57.0 ± 1.0	36.5 ± 2.3	45.4 ± 1.4	--	--
DatasetDDPM	46.9 ± 2.8	56.0 ± 0.9	45.4 ± 2.8	60.4 ± 1.2	--	--
DDPM	46.1 ± 1.9	57.0 ± 1.4	52.3 ± 3.0	63.1 ± 0.9	57.0 ± 1.0	32.3 ± 1.5
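
For reference, mean IoU averages the per-class intersection over union between predicted and ground-truth label maps. The sketch below shows the standard definition in plain Python on flattened label sequences. It is not the repository's evaluation code, which may differ in details such as ignored labels; here, classes absent from both prediction and target are skipped (an assumption).

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes, for flat label sequences."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            # Skip classes that appear in neither prediction nor target.
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Class 0: IoU 1/2, class 1: IoU 2/3, so the mean is 7/12.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

The table reports this quantity as a percentage, so a score of 0.5 here corresponds to 50.0 there.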

[Figure: examples of segmentation masks predicted by the DDPM-based method]

Cite

@misc{baranchuk2021labelefficient,
      title={Label-Efficient Semantic Segmentation with Diffusion Models}, 
      author={Dmitry Baranchuk and Ivan Rubachev and Andrey Voynov and Valentin Khrulkov and Artem Babenko},
      year={2021},
      eprint={2112.03126},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Owner

Yandex Research
