"SOLQ: Segmenting Objects by Learning Queries", SOLQ is an end-to-end instance segmentation framework with Transformer.

Last update: Jan 02, 2023

Overview

SOLQ: Segmenting Objects by Learning Queries

This repository is an official implementation of the paper SOLQ: Segmenting Objects by Learning Queries.

Introduction

TL; DR. SOLQ is an end-to-end instance segmentation framework with Transformer. It directly outputs the instance masks without any box dependency.

Abstract. In this paper, we propose an end-to-end framework for instance segmentation. Based on the recently introduced DETR, our method, termed SOLQ, segments objects by learning unified queries. In SOLQ, each query represents one object and has multiple representations: class, location and mask. The learned object queries perform classification, box regression and mask encoding simultaneously in a unified vector form. During training, the encoded mask vectors are supervised by the compression coding of raw spatial masks. At inference time, the produced mask vectors can be directly transformed to spatial masks by the inverse process of compression coding. Experimental results show that SOLQ can achieve state-of-the-art performance, surpassing most existing approaches. Moreover, the joint learning of the unified query representation can greatly improve the detection performance of the original DETR. We hope SOLQ can serve as a strong baseline for Transformer-based instance segmentation.
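To make the "compression coding" idea concrete, here is a minimal NumPy sketch of encoding a binary mask into a short vector and decoding it back. It assumes a 2D DCT as the compression coding and keeps a square block of low-frequency coefficients; the abstract does not name the coding or the coefficient selection, and the function names (`dct_matrix`, `encode_mask`, `decode_mask`) are hypothetical and not taken from this repository.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale
    return basis


def encode_mask(mask, keep):
    """Compress a square binary mask into a vector of keep*keep coefficients.

    Assumption: the low-frequency block is the top-left keep x keep corner.
    """
    size = mask.shape[0]
    d = dct_matrix(size)
    coeffs = d @ mask.astype(np.float64) @ d.T  # forward 2D DCT
    return coeffs[:keep, :keep].reshape(-1)


def decode_mask(vector, size, keep):
    """Invert encode_mask: zero-pad the coefficients and apply the inverse DCT."""
    coeffs = np.zeros((size, size))
    coeffs[:keep, :keep] = vector.reshape(keep, keep)
    d = dct_matrix(size)
    return (d.T @ coeffs @ d) > 0.5  # threshold back to a binary mask


# Example: a 32x32 mask containing one rectangle, compressed to 256 values.
mask = np.zeros((32, 32), dtype=bool)
mask[8:24, 10:20] = True
vector = encode_mask(mask, keep=16)
recovered = decode_mask(vector, size=32, keep=16)
iou = (mask & recovered).sum() / (mask | recovered).sum()
print("vector length:", vector.size, "IoU after round trip:", iou)
```

In this sketch the training target would be `vector` and the inference step would be `decode_mask`; keeping all coefficients (`keep == size`) reproduces the mask exactly, while smaller values trade accuracy for a shorter vector.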

Main Results

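| Method | Backbone | Dataset | Box AP | Mask AP | Model |
|--------|----------|----------|--------|---------|--------|
| SOLQ | R50 | test-dev | 47.8 | 39.7 | google |
| SOLQ | R101 | test-dev | 48.7 | 40.9 | google |
| SOLQ | Swin-L | test-dev | 55.4 | 45.9 | google |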

Installation

The codebase is built on top of Deformable DETR.

Requirements

Linux, CUDA>=9.2, GCC>=5.4
Python>=3.7

We recommend using Anaconda to create a conda environment:
```
conda create -n deformable_detr python=3.7 pip
```
Then, activate the environment:
```
conda activate deformable_detr
```
PyTorch>=1.5.1, torchvision>=0.6.1 (follow the official PyTorch installation instructions)

For example, if your CUDA version is 9.2, you can install PyTorch and torchvision as follows:
```
conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=9.2 -c pytorch
```
Other requirements
```
pip install -r requirements.txt
```
Build MultiScaleDeformableAttention
```
cd ./models/ops
sh ./make.sh
```
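Once everything is installed, a quick version check can catch mismatches early. The snippet below is a small sketch, not part of this repository: `parse_version`, `meets_minimum` and `check_environment` are hypothetical helpers, and the minimum versions are the ones listed under Requirements.

```python
import platform
import re


def parse_version(text):
    """Turn a string such as '1.5.1+cu92' into (1, 5, 1), ignoring suffixes."""
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so '2.0' compares equal to '2.0.0'
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Compare the running environment with the minimums from Requirements."""
    report = {"python>=3.7": meets_minimum(platform.python_version(), "3.7")}
    try:
        import torch
        import torchvision
    except ImportError:
        report["torch importable"] = False
        return report
    report["torch>=1.5.1"] = meets_minimum(torch.__version__, "1.5.1")
    report["torchvision>=0.6.1"] = meets_minimum(torchvision.__version__, "0.6.1")
    report["cuda available"] = torch.cuda.is_available()
    return report
```

Calling `check_environment()` inside the activated conda environment returns a dictionary of pass/fail flags; any `False` entry points at the requirement to revisit.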

Usage

Dataset preparation

Please download COCO and organize it as follows:
```
mkdir data && cd data
ln -s /path/to/coco coco
```
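After creating the symlink, it may be worth checking that the dataset folder contains what the training scripts are likely to look for. The sketch below assumes the standard COCO 2017 layout (`train2017`, `val2017` and the `instances_*` annotation files); this README does not spell out the layout, and `missing_coco_entries` is a hypothetical helper rather than part of the codebase.

```python
from pathlib import Path

# Assumed layout: the standard COCO 2017 release (not confirmed by this README).
EXPECTED = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_coco_entries(root="data/coco", expected=EXPECTED):
    """Return the expected entries that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_coco_entries()
    if missing:
        print("Missing under data/coco:", ", ".join(missing))
    else:
        print("COCO layout looks complete.")
```

Run it from the repository root, so that the relative path `data/coco` resolves through the symlink created above.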

Training and Evaluation

Training on single node

Train SOLQ on 8 GPUs as follows:
```
sh configs/r50_solq_train.sh
```

Evaluation

You can download the pretrained SOLQ model (the link is in the "Main Results" section), then run the following command to evaluate it on the COCO 2017 val dataset:
```
sh configs/r50_solq_eval.sh
```

Evaluation on COCO 2017 test-dev dataset

You can download the pretrained SOLQ model (the link is in the "Main Results" section), then run the following command to evaluate it on the COCO 2017 test-dev dataset (the results are submitted to the evaluation server):
```
sh configs/r50_solq_submit.sh
```

Visualization on COCO 2017 val dataset

You can visualize the predictions on images as follows:
```
EXP_DIR=/path/to/checkpoint
python visual.py \
       --meta_arch solq \
       --backbone resnet50 \
       --with_vector \
       --with_box_refine \
       --masks \
       --batch_size 2 \
       --vector_hidden_dim 1024 \
       --vector_loss_coef 3 \
       --output_dir ${EXP_DIR} \
       --hidden_dim 384 \
       --resume ${EXP_DIR}/solq_r50_final.pth \
       --eval
```

Citing SOLQ

If you find SOLQ useful in your research, please consider citing:

@article{dong2021solq,
  title={SOLQ: Segmenting Objects by Learning Queries},
  author={Dong, Bin and Zeng, Fangao and Wang, Tiancai and Zhang, Xiangyu and Wei, Yichen},
  journal={arXiv preprint arXiv:2106.02351},
  year={2021}
}

Owner

MEGVII Research
