Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks

Related tags

Deep LearningSSTNet
Overview

SSTNet

PWC PWC

overview Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks(ICCV2021) by Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia*. (*) Corresponding author. [arxiv]

Introduction

Instance segmentation in 3D scenes is fundamental in many applications of scene understanding. It is yet challenging due to the compound factors of data irregularity and uncertainty in the numbers of instances. State-of-the-art methods largely rely on a general pipeline that first learns point-wise features discriminative at semantic and instance levels, followed by a separate step of point grouping for proposing object instances. While promising, they have the shortcomings that (1) the second step is not supervised by the main objective of instance segmentation, and (2) their point-wise feature learning and grouping are less effective to deal with data irregularities, possibly resulting in fragmented segmentations. To address these issues, we propose in this work an end-to-end solution of Semantic Superpoint Tree Network (SSTNet) for proposing object instances from scene points. Key in SSTNet is an intermediate, semantic superpoint tree (SST), which is constructed based on the learned semantic features of superpoints, and which will be traversed and split at intermediate tree nodes for proposals of object instances. We also design in SSTNet a refinement module, termed CliqueNet, to prune superpoints that may be wrongly grouped into instance proposals.

Installation

Requirements

  • Python 3.8.5
  • Pytorch 1.7.1
  • torchvision 0.8.2
  • CUDA 11.1

then install the requirements:

pip install -r requirements.txt

SparseConv

For the SparseConv, please refer PointGroup's spconv to install.

Extension

This project is based on our Gorilla-Lab deep learning toolkit - gorilla-core and 3D toolkit gorilla-3d.

For gorilla-core, you can install it by running:

pip install gorilla-core==0.2.7.6

or building from source(recommend)

git clone https://github.com/Gorilla-Lab-SCUT/gorilla-core
cd gorilla-core
python setup.py install(develop)

For gorilla-3d, you should install it by building from source:

git clone https://github.com/Gorilla-Lab-SCUT/gorilla-3d
cd gorilla-3d
python setup.py develop

Tip: for high-version torch, the BuildExtension may fail by using ninja to build the compile system. If you meet this problem, you can change the BuildExtension in cmdclass={"build_ext": BuildExtension} as cmdclass={"build_ext": BuildExtension}.with_options(use_ninja=False)

Otherwise, this project also need other extension, we use the pointgroup_ops to realize voxelization and use the segmentator to generate superpoints for scannet scene. we use the htree to construct the Semantic Superpoint Tree and the hierarchical node-inheriting relations is realized based on the modified cluster.hierarchy.linkage function from scipy.

  • For pointgroup_ops, we modified the package from PointGroup to let its function calls get rid of the dependence on absolute paths. You can install it by running:
    conda install -c bioconda google-sparsehash 
    cd $PROJECT_ROOT$
    cd sstnet/lib/pointgroup_ops
    python setup.py develop
    Then, you can call the function like:
    import pointgroup_ops
    pointgroup_ops.voxelization
    >>> <function Voxelization.apply>
  • For htree, it can be seen as a supplement to the treelib python package, and I abstract the SST through both of them. You can install it by running:
    cd $PROJECT_ROOT$
    cd sstnet/lib/htree
    python setup.py install

    Tip: The interaction between this piece of code and treelib is a bit messy. I lack time to organize it, which may cause some difficulties for someone in understanding. I am sorry for this. At the same time, I also welcome people to improve it.

  • For cluster, it is originally a sub-module in scipy, the SST construction requires the cluster.hierarchy.linkage to be implemented. However, the origin implementation do not consider the sizes of clustering nodes (each superpoint contains different number of points). To this end, we modify this function and let it support the property mentioned above. So, for used, you can install it by running:
    cd $PROJECT_ROOT$
    cd sstnet/lib/cluster
    python setup.py install
  • For segmentator, please refer here to install. (We wrap the segmentator in ScanNet)

Data Preparation

Please refer to the README.md in data/scannetv2 to realize data preparation.

Training

CUDA_VISIBLE_DEVICES=0 python train.py --config config/default.yaml

You can start a tensorboard session by

tensorboard --logdir=./log --port=6666

Tip: For the directory of logging, please refer the implementation of function gorilla.collect_logger.

Inference and Evaluation

CUDA_VISIBLE_DEVICES=0 python test.py --config config/default.yaml --pretrain pretrain.pth --eval
  • --split is the evaluation split of dataset.
  • --save is the action to save instance segmentation results.
  • --eval is the action to evaluate the segmentation results.
  • --semantic is the action to evaluate semantic segmentation only (work on the --eval mode).
  • --log-file is to define the logging file to save evaluation result (default please to refer the gorilla.collect_logger).
  • --visual is the action to save visualization of instance segmentation. (It will be mentioned in the next partion.)

Results on ScanNet Benchmark

Rank 1st on the ScanNet benchmark benchmark

Pretrained

We provide a pretrained model trained on ScanNet(v2) dataset. [Google Drive] [Baidu Cloud] (提取码:f3az) Its performance on ScanNet(v2) validation set is 49.4/64.9/74.4 in terms of mAP/mAP50/mAP25.

Acknowledgement

This repo is built upon several repos, e.g., PointGroup, spconv and ScanNet.

Contact

If you have any questions or suggestions about this repo or paper, please feel free to contact me in issue or email ([email protected]).

TODO

  • Distributed training(not verification)
  • Batch inference
  • Multi-processing for getting superpoints

Citation

If you find this work useful in your research, please cite:

@misc{liang2021instance,
      title={Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks}, 
      author={Zhihao Liang and Zhihao Li and Songcen Xu and Mingkui Tan and Kui Jia},
      year={2021},
      eprint={2108.07478},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
Research lab focusing on CV, ML, and AI
Poisson Surface Reconstruction for LiDAR Odometry and Mapping

Poisson Surface Reconstruction for LiDAR Odometry and Mapping Surfels TSDF Our Approach Table: Qualitative comparison between the different mapping te

Photogrammetry & Robotics Bonn 305 Dec 21, 2022
最新版本yolov5+deepsort目标检测和追踪,支持5.0版本可训练自己数据集

使用YOLOv5+Deepsort实现车辆行人追踪和计数,代码封装成一个Detector类,更容易嵌入到自己的项目中。

422 Dec 30, 2022
gtfs2vec - Learning GTFS Embeddings for comparing PublicTransport Offer in Microregions

gtfs2vec This is a companion repository for a gtfs2vec - Learning GTFS Embeddings for comparing PublicTransport Offer in Microregions publication. Vis

Politechnika Wrocławska - repozytorium dla informatyków 5 Oct 10, 2022
Implementation of ICCV19 Paper "Learning Two-View Correspondences and Geometry Using Order-Aware Network"

OANet implementation Pytorch implementation of OANet for ICCV'19 paper "Learning Two-View Correspondences and Geometry Using Order-Aware Network", by

Jiahui Zhang 225 Dec 05, 2022
A tf.keras implementation of Facebook AI's MadGrad optimization algorithm

MADGRAD Optimization Algorithm For Tensorflow This package implements the MadGrad Algorithm proposed in Adaptivity without Compromise: A Momentumized,

20 Aug 18, 2022
HugsVision is a easy to use huggingface wrapper for state-of-the-art computer vision

HugsVision is an open-source and easy to use all-in-one huggingface wrapper for computer vision. The goal is to create a fast, flexible and user-frien

Labrak Yanis 166 Nov 27, 2022
True Few-Shot Learning with Language Models

This codebase supports using language models (LMs) for true few-shot learning: learning to perform a task using a limited number of examples from a single task distribution.

Ethan Perez 124 Jan 04, 2023
AutoPentest-DRL: Automated Penetration Testing Using Deep Reinforcement Learning

AutoPentest-DRL: Automated Penetration Testing Using Deep Reinforcement Learning AutoPentest-DRL is an automated penetration testing framework based o

Cyber Range Organization and Design Chair 217 Jan 01, 2023
QRec: A Python Framework for quick implementation of recommender systems (TensorFlow Based)

Introduction QRec is a Python framework for recommender systems (Supported by Python 3.7.4 and Tensorflow 1.14+) in which a number of influential and

Yu 1.4k Dec 30, 2022
学习 python3 以来写的一些垃圾玩具……

和东哥做兄弟 Author: chiupam 版权 未经本人同意,仓库内所有资源文件,禁止任何公众号、自媒体、开发者进行任何形式的转载、发布、搬运。 声明 这不是一个开源项目,只是把 GitHub 当作一个代码的存储空间,本项目不接受任何开源要求。 仅用于学习研究,禁止用于商业用途,不能保证其合法性

Chiupam 67 Mar 26, 2022
Implementation of the paper "Language-agnostic representation learning of source code from structure and context".

Code Transformer This is an official PyTorch implementation of the CodeTransformer model proposed in: D. Zügner, T. Kirschstein, M. Catasta, J. Leskov

Daniel Zügner 131 Dec 13, 2022
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

SimMIM By Zhenda Xie*, Zheng Zhang*, Yue Cao*, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai and Han Hu*. This repo is the official implementation of

Microsoft 674 Dec 26, 2022
Ludwig is a toolbox that allows to train and evaluate deep learning models without the need to write code.

Translated in 🇰🇷 Korean/ Ludwig is a toolbox that allows users to train and test deep learning models without the need to write code. It is built on

Ludwig 8.7k Jan 05, 2023
Monocular Depth Estimation Using Laplacian Pyramid-Based Depth Residuals

LapDepth-release This repository is a Pytorch implementation of the paper "Monocular Depth Estimation Using Laplacian Pyramid-Based Depth Residuals" M

Minsoo Song 205 Dec 30, 2022
Receptive Field Block Net for Accurate and Fast Object Detection, ECCV 2018

Receptive Field Block Net for Accurate and Fast Object Detection By Songtao Liu, Di Huang, Yunhong Wang Updatas (2021/07/23): YOLOX is here!, stronger

Liu Songtao 1.4k Dec 21, 2022
Source code for "Interactive All-Hex Meshing via Cuboid Decomposition [SIGGRAPH Asia 2021]".

Interactive All-Hex Meshing via Cuboid Decomposition Video demonstration This repository contains an interactive software to the PolyCube-based hex-me

Lingxiao Li 131 Dec 05, 2022
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [2021]

Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations This repo contains the Pytorch implementation of our paper: Revisit

Wouter Van Gansbeke 80 Nov 20, 2022
Code release for Local Light Field Fusion at SIGGRAPH 2019

Local Light Field Fusion Project | Video | Paper Tensorflow implementation for novel view synthesis from sparse input images. Local Light Field Fusion

1.1k Dec 27, 2022
Code for the CVPR2022 paper "Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity"

Introduction This is an official release of the paper "Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity" (arxiv link). Abstrac

Leo 21 Nov 23, 2022
Find-Lane-Line - Use openCV library and Python to detect the road-lane-line

Find-Lane-Line This project is to use openCV library and Python to detect the road-lane-line. Data Pipeline Step one : Color Selection Step two : Cann

Kenny Cheng 3 Aug 17, 2022