CondNet: Conditional Classifier for Scene Segmentation

Related tags

Deep LearningCondNet
Overview

CondNet: Conditional Classifier for Scene Segmentation

PWC

PWC

Introduction

The fully convolutional network (FCN) has achieved tremendous success in dense visual recognition tasks, such as scene segmentation. The last layer of FCN is typically a global classifier (1×1 convolution) to recognize each pixel to a semantic label. We empirically show that this global classifier, ignoring the intra-class distinction, may lead to sub-optimal results.

In this work, we present a conditional classifier to replace the traditional global classifier, where the kernels of the classifier are generated dynamically conditioned on the input. The main advantages of the new classifier consist of: (i) it attends on the intra-class distinction, leading to stronger dense recognition capability; (ii) the conditional classifier is simple and flexible to be integrated into almost arbitrary FCN architectures to improve the prediction. Extensive experiments demonstrate that the proposed classifier performs favourably against the traditional classifier on the FCN architecture. The framework equipped with the conditional classifier (called CondNet) achieves new state-of-the-art performances on two datasets.


Major Features

  • Simple and Flexible
  • Incorporated with almost arbitrary FCN architectures
  • Attending on the sample-specific distinction of each category

Results and Models

ADE20K

Method Backbone Crop Size Lr schd mIoU mIoU(ms+flip) config download
CondNet R-50-D8 512x512 160000 43.68 44.30 config model
CondNet R-101-D8 512x512 160000 45.64 47.12 config model

Pascal Context 59

Method Backbone Crop Size Lr schd mIoU mIoU(ms+flip) config download
CondNet R-101-D8 480x480 80000 54.29 55.74 config model

Environments

The code is developed using python 3.7 on Ubuntu 16.04. NVIDIA GPUs are needed. The code is developed and tested using 8 NVIDIA V100 GPU cards. Other platforms or GPU cards are not fully tested.

Quick Start

Prerequisites

  • Linux or macOS (Windows is in experimental support)
  • Python 3.6+
  • PyTorch 1.3+
  • CUDA 9.2+ (If you build PyTorch from source, CUDA 9.0 is also compatible)
  • GCC 5+
  • MMCV
  • MMSegmentation

Please refer to the guide for the information about he compatible MMSegmentation and MMCV versions. Please install the correct version of MMCV to avoid installation issues.

Note: You need to run pip uninstall mmcv first if you have mmcv installed. If mmcv and mmcv-full are both installed, there will be ModuleNotFoundError.

Installation

a. Create a conda virtual environment and activate it.

conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab

b. Install PyTorch and torchvision following the official instructions. Here we use PyTorch 1.6.0 and CUDA 10.1. You may also switch to other version by specifying the version number.

conda install pytorch=1.6.0 torchvision cudatoolkit=10.1 -c pytorch

c. Install MMCV following the official instructions. Either mmcv or mmcv-full is compatible with MMSegmentation, but for methods like CCNet and PSANet, CUDA ops in mmcv-full is required.

The pre-build mmcv-full (with PyTorch 1.6 and CUDA 10.1) can be installed by running: (other available versions could be found here)

pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html

Or you should download the cl compiler from web and then set up the path.

Then, clone mmcv from github and install mmcv via pip:

git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
pip install -e .

Or simply:

pip install mmcv

d. Install build requirements

pip install -r requirements.txt

Prepare datasets

It is recommended to symlink the dataset root to $CONDNET/data. If your folder structure is different, you may need to change the corresponding paths in config files.

condnet
├── models
├── tools
├── configs
├── data
│   ├── VOCdevkit
│   │   ├── VOC2012
│   │   │   ├── JPEGImages
│   │   │   ├── SegmentationClass
│   │   │   ├── ImageSets
│   │   │   │   ├── Segmentation
│   │   ├── VOC2010
│   │   │   ├── JPEGImages
│   │   │   ├── SegmentationClassContext
│   │   │   ├── ImageSets
│   │   │   │   ├── SegmentationContext
│   │   │   │   │   ├── train.txt
│   │   │   │   │   ├── val.txt
│   │   │   ├── trainval_merged.json
│   │   ├── VOCaug
│   │   │   ├── dataset
│   │   │   │   ├── cls
│   ├── ade
│   │   ├── ADEChallengeData2016
│   │   │   ├── annotations
│   │   │   │   ├── training
│   │   │   │   ├── validation
│   │   │   ├── images
│   │   │   │   ├── training
│   │   │   │   ├── validation

ADE20K

The training and validation set of ADE20K could be download from this link. We may also download test set from here.

Pascal Context

The training and validation set of Pascal Context could be download from here. You may also download test set from here after registration.

To split the training and validation set from original dataset, you may download trainval_merged.json from here.

If you would like to use Pascal Context dataset, please install Detail and then run the following command to convert annotations into proper format.

python tools/convert_datasets/pascal_context.py data/VOCdevkit data/VOCdevkit/VOC2010/trainval_merged.json

More datasets please refer to MMSegmentation.

Training and Testing

All outputs (log files and checkpoints) will be saved to the working directory, which is specified by work_dir in the config file.

By default we evaluate the model on the validation set after some iterations, you can change the evaluation interval by adding the interval argument in the training config.

evaluation = dict(interval=4000)  # This evaluate the model per 4000 iterations.

*Important*: The default learning rate in config files is for 4 GPUs and 2 img/gpu (batch size = 4x2 = 8). Equivalently, you may also use 8 GPUs and 1 imgs/gpu since all models using cross-GPU SyncBN.

To trade speed with GPU memory, you may pass in --options model.backbone.with_cp=True to enable checkpoint in backbone.

Training

Train with a single GPU

python tools/train.py ${CONFIG_FILE} [optional arguments]

If you want to specify the working directory in the command, you can add an argument --work-dir ${YOUR_WORK_DIR}.

Train with multiple GPUs

./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

Optional arguments are:

  • --no-validate (not suggested): By default, the codebase will perform evaluation at every k iterations during the training. To disable this behavior, use --no-validate.
  • --work-dir ${WORK_DIR}: Override the working directory specified in the config file.
  • --resume-from ${CHECKPOINT_FILE}: Resume from a previous checkpoint file (to continue the training process).
  • --load-from ${CHECKPOINT_FILE}: Load weights from a checkpoint file (to start finetuning for another task).

Difference between resume-from and load-from:

  • resume-from loads both the model weights and optimizer state including the iteration number.
  • load-from loads only the model weights, starts the training from iteration 0.

Launch multiple jobs on a single machine

If you launch multiple jobs on a single machine, e.g., 2 jobs of 4-GPU training on a machine with 8 GPUs, you need to specify different ports (29500 by default) for each job to avoid communication conflict. Otherwise, there will be error message saying RuntimeError: Address already in use.

If you use dist_train.sh to launch training jobs, you can set the port in commands with environment variable PORT.

CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4

If you use slurm_train.sh to launch training jobs, you can set the port in commands with environment variable MASTER_PORT.

MASTER_PORT=29500 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE}
MASTER_PORT=29501 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE}

Testing

  • single GPU
  • single node multiple GPU

You can use the following commands to test a dataset.

# single-gpu testing
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]

# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

Optional arguments:

  • RESULT_FILE: Filename of the output results in pickle format. If not specified, the results will not be saved to a file. (After mmseg v0.17, the output results become pre-evaluation results or format result paths)
  • EVAL_METRICS: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., mIoU is available for all dataset. Cityscapes could be evaluated by cityscapes as well as standard mIoU metrics.
  • --show: If specified, segmentation results will be plotted on the images and shown in a new window. It is only applicable to single GPU testing and used for debugging and visualization. Please make sure that GUI is available in your environment, otherwise you may encounter the error like cannot connect to X server.
  • --show-dir: If specified, segmentation results will be plotted on the images and saved to the specified directory. It is only applicable to single GPU testing and used for debugging and visualization. You do NOT need a GUI available in your environment for using this option.
  • --eval-options: Optional parameters for dataset.format_results and dataset.evaluate during evaluation. When efficient_test=True, it will save intermediate results to local files to save CPU memory. Make sure that you have enough local storage space (more than 20GB). (efficient_test argument does not have effect after mmseg v0.17, we use a progressive mode to evaluation and format results which can largely save memory cost and evaluation time.)

Examples:

Assume that you have already downloaded the checkpoints to the directory checkpoints/.

Test CondNet with 4 GPUs, and evaluate the standard mIoU metric.

```shell
./tools/dist_test.sh configs/condnet/condnet_r101-d8_512x512_160k_ade20k.py \
    checkpoints/condnet_r101-d8_512x512_160k_ade20k.pth \
    4 --out results.pkl --eval mIoU
```

Citation

If you find this project useful in your research, please consider cite:

@ARTICLE{Yucondnet21,
  author={Yu, Changqian and Shao, Yuanjie and Gao, Changxin and Sang, Nong},
  journal={IEEE Signal Processing Letters}, 
  title={CondNet: Conditional Classifier for Scene Segmentation}, 
  year={2021},
  volume={28},
  number={},
  pages={758-762},
  doi={10.1109/LSP.2021.3070472}}

Acknowledgement

Thanks to:

Owner
ycszen
ycszen
LowRankModels.jl is a julia package for modeling and fitting generalized low rank models.

LowRankModels.jl LowRankModels.jl is a Julia package for modeling and fitting generalized low rank models (GLRMs). GLRMs model a data array by a low r

Madeleine Udell 183 Dec 17, 2022
A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility

Tensorpack is a neural network training interface based on TensorFlow. Features: It's Yet Another TF high-level API, with speed, and flexibility built

Tensorpack 6.2k Jan 09, 2023
Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows

Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows This is the official implementation of the ICCV 2021 Paper "Probabilistic Mono

62 Nov 23, 2022
Deep-learning-roadmap - All You Need to Know About Deep Learning - A kick-starter

Deep Learning - All You Need to Know Sponsorship To support maintaining and upgrading this project, please kindly consider Sponsoring the project deve

Instill AI 4.4k Dec 26, 2022
Veri Setinizi Yolov5 Formatına Dönüştürün

Veri Setinizi Yolov5 Formatına Dönüştürün! Bu Repo da Neler Var? Xml Formatındaki Veri Setini .Txt Formatına Çevirme Xml Formatındaki Dosyaları Silme

Kadir Nar 4 Aug 22, 2022
A unet implementation for Image semantic segmentation

Unet-pytorch a unet implementation for Image semantic segmentation 参考网上的Unet做分割的代码,做了一个针对kaggle地盐识别的,请去以下地址获取数据集: https://www.kaggle.com/c/tgs-salt-id

Rabbit 3 Jun 29, 2022
MacroTools provides a library of tools for working with Julia code and expressions.

MacroTools.jl MacroTools provides a library of tools for working with Julia code and expressions. This includes a powerful template-matching system an

FluxML 278 Dec 11, 2022
Locationinfo - A script helps the user to show network information such as ip address

Description This script helps the user to show network information such as ip ad

Roxcoder 1 Dec 30, 2021
An official implementation of "Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation" (ICCV 2021) in PyTorch.

Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation This is an official implementation of the paper "Exploiting a Joint

CV Lab @ Yonsei University 35 Oct 26, 2022
NeRF Meta-Learning with PyTorch

NeRF Meta Learning With PyTorch nerf-meta is a PyTorch re-implementation of NeRF experiments from the paper "Learned Initializations for Optimizing Co

Sanowar Raihan 78 Dec 18, 2022
1st Solution For NeurIPS 2021 Competition on ML4CO Dual Task

KIDA: Knowledge Inheritance in Data Aggregation This project releases our 1st place solution on NeurIPS2021 ML4CO Dual Task. Slide and model weights a

MEGVII Research 24 Sep 08, 2022
Deep learning algorithms for muon momentum estimation in the CMS Trigger System

Deep learning algorithms for muon momentum estimation in the CMS Trigger System The Compact Muon Solenoid (CMS) is a general-purpose detector at the L

anuragB 2 Oct 06, 2021
[Preprint] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

Chasing Sparsity in Vision Transformers: An End-to-End Exploration Codes for [Preprint] Chasing Sparsity in Vision Transformers: An End-to-End Explora

VITA 64 Dec 08, 2022
Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central gateway to assessments created in the open source community.

Lens by Credo AI - Responsible AI Assessment Framework Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data a

Credo AI 27 Dec 14, 2022
Does MAML Only Work via Feature Re-use? A Data Set Centric Perspective

Does-MAML-Only-Work-via-Feature-Re-use-A-Data-Set-Centric-Perspective Does MAML Only Work via Feature Re-use? A Data Set Centric Perspective Installin

2 Nov 07, 2022
The Generic Manipulation Driver Package - Implements a ROS Interface over the robotics toolbox for Python

Armer Driver Armer aims to provide an interface layer between the hardware drivers of a robotic arm giving the user control in several ways: Joint vel

QUT Centre for Robotics (QCR) 13 Nov 26, 2022
PyElastica is the Python implementation of Elastica, an open-source software for the simulation of assemblies of slender, one-dimensional structures using Cosserat Rod theory.

PyElastica PyElastica is the python implementation of Elastica: an open-source project for simulating assemblies of slender, one-dimensional structure

Gazzola Lab 105 Jan 09, 2023
Shitty gaze mouse controller

demo.mp4 shitty_gaze_mouse_cotroller install tensofflow, cv2 run the main.py and as it starts it will collect data so first raise your left eyebrow(bo

16 Aug 30, 2022
Geometric Vector Perceptrons --- a rotation-equivariant GNN for learning from biomolecular structure

Geometric Vector Perceptron Implementation of equivariant GVP-GNNs as described in Learning from Protein Structure with Geometric Vector Perceptrons b

Dror Lab 142 Dec 29, 2022
RNN Predict Street Commercial Vitality

RNN-for-Predicting-Street-Vitality Code and dataset for Predicting the Vitality of Stores along the Street based on Business Type Sequence via Recurre

Zidong LIU 1 Dec 15, 2021