Exploring Cross-Image Pixel Contrast for Semantic Segmentation

Last update: Jan 02, 2023

Overview

Exploring Cross-Image Pixel Contrast for Semantic Segmentation,
Wenguan Wang, Tianfei Zhou, Fisher Yu, Jifeng Dai, Ender Konukoglu and Luc Van Gool
arXiv technical report (arXiv 2101.11939)

Abstract

Current semantic segmentation methods focus only on mining “local” context, i.e., dependencies between pixels within individual images, by context-aggregation modules (e.g., dilated convolution, neural attention) or structure-aware optimization criteria (e.g., IoU-like loss). However, they ignore the “global” context of the training data, i.e., rich semantic relations between pixels across different images. Inspired by recent advances in unsupervised contrastive representation learning, we propose a pixel-wise contrastive framework for semantic segmentation in the fully supervised setting. The core idea is to enforce pixel embeddings belonging to the same semantic class to be more similar than embeddings from different classes. This gives rise to a pixel-wise metric learning paradigm for semantic segmentation that explicitly explores the structures of labeled pixels, which have long been ignored in the field. Our method can be effortlessly incorporated into existing segmentation frameworks without extra overhead during testing.

We experimentally show that, with well-known segmentation models (i.e., DeepLabV3, HRNet, OCR) and backbones (i.e., ResNet, HRNet), our method brings consistent performance improvements across diverse datasets (i.e., Cityscapes, PASCAL-Context, COCO-Stuff).
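The core idea in the abstract (pull embeddings of same-class pixels together, push different-class pixels apart, with pixels possibly drawn from different images) can be illustrated with a small supervised InfoNCE-style loss. This is only a minimal sketch in plain Python, not the code in this repository: the function name `pixel_contrast_loss`, the temperature value of 0.1, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math


def pixel_contrast_loss(embeddings, labels, temperature=0.1):
    """Illustrative supervised InfoNCE-style loss over sampled pixel embeddings.

    embeddings: list of equal-length float vectors, one per sampled pixel
                (the pixels may come from different images).
    labels:     list of integer class ids, one per pixel.
    temperature: assumed value; not taken from the repository.
    """
    # L2-normalise so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    def sim(a, b):
        return sum(x * y for x, y in zip(a, b)) / temperature

    total, count = 0.0, 0
    for i, (anchor, cls) in enumerate(zip(normed, labels)):
        positives = [j for j, c in enumerate(labels) if c == cls and j != i]
        negatives = [j for j, c in enumerate(labels) if c != cls]
        if not positives or not negatives:
            continue  # an anchor needs at least one positive and one negative
        neg_sum = sum(math.exp(sim(anchor, normed[j])) for j in negatives)
        anchor_loss = 0.0
        for j in positives:
            pos = math.exp(sim(anchor, normed[j]))
            # Each positive competes against all negatives of this anchor.
            anchor_loss += -math.log(pos / (pos + neg_sum))
        total += anchor_loss / len(positives)
        count += 1
    return total / count if count else 0.0


# Toy example: four pixels, two classes.
labels = [0, 0, 1, 1]
clustered = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
scrambled = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
print(pixel_contrast_loss(clustered, labels))
print(pixel_contrast_loss(scrambled, labels))
```

When embeddings of the same class already point in similar directions (`clustered`), the loss is close to zero; when same-class embeddings are far apart (`scrambled`), it is much larger. Because such a term only shapes the embedding space during training, it adds nothing to the inference path, which is consistent with the abstract's claim of no extra overhead during testing.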

Installation

This implementation is built on openseg.pytorch. Many thanks to the authors for their efforts.

Please follow the Getting Started guide for installation and dataset preparation.

Running

Cityscapes

Train DeepLabV3

bash scripts/cityscapes/deeplab/run_r_101_d_8_deeplabv3_train_contrast.sh train 'resnet101-deeplabv3-contrast'

Features (in progress)

t-SNE Visualization

Pixel-wise Cross-Entropy Loss

Pixel-wise Contrastive Learning Objective

Citation

@article{wang2021exploring,
  title   = {Exploring Cross-Image Pixel Contrast for Semantic Segmentation},
  author  = {Wang, Wenguan and Zhou, Tianfei and Yu, Fisher and Dai, Jifeng and Konukoglu, Ender and Van Gool, Luc},
  journal = {arXiv preprint arXiv:2101.11939},
  year    = {2021}
}

Owner

Tianfei Zhou