BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

Last update: Dec 12, 2022

As of Apr. 17th, 2021, BADet ranks 1st on the KITTI BEV detection leaderboard (Car, Moderate difficulty) and achieves on-par performance on the KITTI 3D detection leaderboard. The detector runs at 7.1 FPS.

Authors: Rui Qian, Xin Lai, Xirong Li

[arXiv] [elsevier]

Citation

If you find this code useful in your research, please consider citing our work:

@InProceedings{qian2022pr,
author = {Rui Qian and Xin Lai and Xirong Li},
title = {BADet: Boundary-Aware 3D Object Detection from Point Clouds},
booktitle = {Pattern Recognition (PR)},
month = {January},
year = {2022}
}
@misc{qian20213d,
title={3D Object Detection for Autonomous Driving: A Survey}, 
author={Rui Qian and Xin Lai and Xirong Li},
year={2021},
eprint={2106.10823},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

Updates

2021-03-17: The performance (using 40 recall positions) on the test set is as follows:

Car AP@0.70, 0.70, 0.70:
bbox AP:98.75, 95.61, 90.64
bev  AP:95.23, 91.32, 86.48 
3d   AP:89.28, 81.61, 76.58 
aos  AP:98.65, 95.34, 90.28
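
The numbers in this README are reported with either 11 or 40 recall positions. The sketch below illustrates how an interpolated AP of that kind can be computed from a precision/recall curve. It is not the evaluation code shipped with this repository: the function name `interpolated_ap` is hypothetical, and the sampling convention (recall 0, 0.1, ..., 1 for 11 positions; 1/40, 2/40, ..., 1 for 40 positions) is an assumption based on the usual KITTI convention.

```python
import numpy as np


def interpolated_ap(recalls, precisions, num_positions=40):
    """Interpolated average precision over evenly spaced recall positions.

    Assumed convention: 11 positions include recall 0, any other count
    starts at 1/num_positions and ends at 1.
    """
    recalls = np.asarray(recalls, dtype=np.float64)
    precisions = np.asarray(precisions, dtype=np.float64)
    if num_positions == 11:
        thresholds = np.linspace(0.0, 1.0, 11)
    else:
        thresholds = np.linspace(1.0 / num_positions, 1.0, num_positions)

    total = 0.0
    for t in thresholds:
        reachable = recalls >= t
        # Interpolated precision: best precision at any recall >= t,
        # or 0 if the detector never reaches this recall level.
        total += precisions[reachable].max() if reachable.any() else 0.0
    return total / len(thresholds)
```

Because the 11-position variant includes recall 0, it tends to give slightly higher values than the 40-position variant on the same curve, so the two should not be compared directly.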

Introduction

Existing state-of-the-art 3D object detectors mostly follow a two-stage paradigm. These methods typically comprise two steps: 1) a region proposal network proposes a handful of high-quality proposals in a bottom-up fashion; 2) the semantic features from the proposed regions are resized and pooled to summarize RoI-wise representations for further refinement. Note that the RoI-wise representations in step 2) are treated individually, as uncorrelated entries, when fed to the following detection heads. However, we observe that the proposals generated in step 1) are offset from the ground truth to some degree and emerge densely in local neighborhoods with an underlying probability. Challenges arise when a proposal largely loses its boundary information due to coordinate offset, while existing networks lack a corresponding information compensation mechanism. In this paper, we propose BADet for 3D object detection from point clouds. Specifically, instead of refining each proposal independently as previous works do, we represent each proposal as a node and construct a graph within a given cut-off threshold, associating proposals in the form of a local neighborhood graph so that the boundary correlations of an object are explicitly exploited. In addition, we devise a lightweight Region Feature Aggregation Module that fully exploits voxel-wise, pixel-wise, and point-wise features with expanding receptive fields for more informative RoI-wise representations. We validate BADet on both the widely used KITTI dataset and the highly challenging nuScenes dataset. As of Apr. 17th, 2021, BADet achieves on-par performance on the KITTI 3D detection leaderboard and ranks 1st on the Moderate difficulty of the Car category on the KITTI BEV detection leaderboard. The source code is available at https://github.com/rui-qian/BADet.
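
To make the local neighborhood graph idea concrete, here is a minimal NumPy sketch. It is only an illustration under stated assumptions, not the BADet implementation: the function names `build_proposal_graph` and `aggregate_neighbors` are hypothetical, proposals are reduced to their box centers, and a plain mean stands in for whatever learned message passing the actual network uses.

```python
import numpy as np


def build_proposal_graph(centers, radius):
    """Connect every pair of proposals whose centers lie within `radius`.

    Returns (src, dst) index arrays; each undirected edge appears in
    both directions and self-loops are excluded.
    """
    centers = np.asarray(centers, dtype=np.float64)
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adjacency = dist <= radius
    np.fill_diagonal(adjacency, False)
    src, dst = np.nonzero(adjacency)
    return src, dst


def aggregate_neighbors(features, src, dst):
    """Average each node's feature with the features of its neighbors."""
    features = np.asarray(features, dtype=np.float64)
    summed = features.copy()            # start from the node's own feature
    counts = np.ones(len(features))     # the node itself counts once
    np.add.at(summed, dst, features[src])
    np.add.at(counts, dst, 1.0)
    return summed / counts[:, None]
```

With centers `[[0, 0, 0], [1, 0, 0], [10, 0, 0]]` and a radius of 2, the first two proposals are linked and share information, while the distant third proposal keeps its own feature unchanged.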

Dependencies

python3.5+
pytorch (tested on 1.1.0)
opencv
shapely
mayavi
spconv (v1.0)
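
A quick way to see which of these packages are missing from the current environment is sketched below. The helper name `missing_modules` is hypothetical, and the import names (`torch` for pytorch, `cv2` for opencv) are the usual ones for these packages; the check does not verify versions.

```python
import importlib.util

# Import names assumed for the dependencies listed above.
REQUIRED_MODULES = ["torch", "cv2", "shapely", "mayavi", "spconv"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example usage:
#   print(missing_modules())
```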

Installation

Clone this repository.
Compile the C++/CUDA modules in mmdet/ops by running the following command in each subdirectory, e.g.:

$ cd mmdet/ops/points_op
$ python3 setup.py build_ext --inplace
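
Running the build by hand in every directory is tedious, so a small helper can loop over them. This is a sketch that assumes each op directory directly under mmdet/ops contains its own setup.py; the function names `find_op_dirs` and `build_all` are hypothetical and not part of the repository.

```python
import subprocess
import sys
from pathlib import Path


def find_op_dirs(ops_root):
    """Return the direct subdirectories of `ops_root` that contain a setup.py."""
    return sorted(p.parent for p in Path(ops_root).glob("*/setup.py"))


def build_all(ops_root, dry_run=False):
    """Run `setup.py build_ext --inplace` in every op directory.

    With dry_run=True nothing is executed; the planned commands are
    only returned so they can be inspected first.
    """
    planned = []
    for op_dir in find_op_dirs(ops_root):
        cmd = [sys.executable, "setup.py", "build_ext", "--inplace"]
        planned.append((op_dir, cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=op_dir, check=True)
    return planned


# Example usage, from the repository root:
#   build_all("mmdet/ops")
```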

Set the following environment variables; you may add them to ~/.bashrc:

export NUMBAPRO_CUDA_DRIVER=/usr/lib/x86_64-linux-gnu/libcuda.so
export NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so
export NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice
# Adjust this path to your own spconv installation.
export LD_LIBRARY_PATH=/home/qianrui/anaconda3/lib/python3.7/site-packages/spconv:$LD_LIBRARY_PATH

Data Preparation

Download the KITTI 3D object detection dataset from here. The data to download includes:
- Velodyne point clouds (29 GB): input data to VoxelNet
- Training labels of object data set (5 MB): input label to VoxelNet
- Camera calibration matrices of object data set (16 MB): for visualization of predictions
- Left color images of object data set (12 GB): for visualization of predictions
To create the cropped point clouds and the sample pool for data augmentation, please refer to SECOND.
Split the training set into a training set and a validation set according to the protocol here.
Run the following command to prepare the data:

$ python3 tools/create_data.py

~/qianrui/kitti$ tree -L 1
data_root = '/home/qr/qianrui/kitti/'
├── gt_database
├── ImageSets
├── kitti_dbinfos_train.pkl
├── kitti_dbinfos_trainval.pkl
├── kitti_infos_test.pkl
├── kitti_infos_train.pkl
├── kitti_infos_trainval.pkl
├── kitti_infos_val.pkl
├── train.txt
├── trainval.txt
├── val.txt
├── test.txt
├── training   <-- training data
│   ├── image_2
│   ├── label_2
│   ├── velodyne
│   └── velodyne_reduced
└── testing    <-- testing data
    ├── image_2
    ├── label_2
    ├── velodyne
    └── velodyne_reduced
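
Before training, it can help to confirm that the data root matches the layout above. The following sketch lists whichever expected entries are absent; the helper name `missing_entries` is hypothetical, and the expected paths are simply copied from the tree listing, so adjust them if your layout differs.

```python
from pathlib import Path

# Entries copied from the directory listing above.
EXPECTED_ENTRIES = [
    "gt_database",
    "ImageSets",
    "kitti_dbinfos_train.pkl",
    "kitti_dbinfos_trainval.pkl",
    "kitti_infos_test.pkl",
    "kitti_infos_train.pkl",
    "kitti_infos_trainval.pkl",
    "kitti_infos_val.pkl",
    "training/image_2",
    "training/label_2",
    "training/velodyne",
    "training/velodyne_reduced",
    "testing/image_2",
    "testing/velodyne",
    "testing/velodyne_reduced",
]


def missing_entries(data_root, expected=EXPECTED_ENTRIES):
    """Return the expected files or directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


# Example usage:
#   print(missing_entries("/home/qr/qianrui/kitti/"))
```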

Pretrained Model

You can download the pretrained model [Model][Archive], which is trained on the train split (3712 samples) and evaluated on the val split (3769 samples) and the test split (7518 samples). The performance (using 11 recall positions) on the validation set is as follows:

[40, 1600, 1408]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 3769/3769, 7.1 task/s, elapsed: 533s, ETA:     0s
Car AP@0.70, 0.70, 0.70:
bbox AP:98.27, 90.22, 89.66
bev  AP:90.59, 88.85, 88.09
3d   AP:90.06, 85.75, 78.98
aos  AP:98.18, 89.98, 89.25
Car AP@0.70, 0.50, 0.50:
bbox AP:98.27, 90.22, 89.66
bev  AP:98.31, 90.21, 89.73
3d   AP:98.20, 90.11, 89.61
aos  AP:98.18, 89.98, 89.25
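
If you want to compare runs programmatically, the evaluation printout can be parsed into a dictionary. The sketch below assumes header lines of the form `Car AP@0.70, 0.70, 0.70:` followed by `bbox`, `bev`, `3d`, and `aos` rows, and it assumes the three values in each row follow the usual KITTI order of Easy, Moderate, Hard. The function name `parse_eval_output` is hypothetical.

```python
import re

HEADER = re.compile(r"^\w+ AP@[\d.]+(?:,\s*[\d.]+)*:$")
METRIC = re.compile(r"^(bbox|bev|3d|aos)\s+AP:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)$")


def parse_eval_output(text):
    """Map each header line to {metric: {"easy", "moderate", "hard"}}."""
    results = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if HEADER.match(line):
            current = line.rstrip(":")
            results[current] = {}
            continue
        match = METRIC.match(line)
        if match and current is not None:
            easy, moderate, hard = (float(v) for v in match.groups()[1:])
            results[current][match.group(1)] = {
                "easy": easy,
                "moderate": moderate,
                "hard": hard,
            }
    return results
```

Lines that match neither pattern, such as the progress bar, are ignored.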

Quick demo

Run the following commands to evaluate the pretrained model:

cd mmdet/tools
# Edit ../configs/car_cfg.py first: set score_thr=0.4 for the val split or score_thr=0.3 for the test split.
python3 test.py ../configs/car_cfg.py ../saved_model_vehicle/epoch_50.pth

Model	Archive	Parameters	Moderate(Car)	Pretrained Model	Predicts
BADet(val)	[Link]	44.2 MB	86.21%	[icloud drive]	[Results]
BADet(test)	[Link]	44.2 MB	81.61%	[icloud drive]	[Results]

Training

To train BADet on a single GPU, run the following commands:

cd mmdet/tools
python3 train.py ../configs/car_cfg.py

Inference

To evaluate the model, run the following command:

cd mmdet/tools
python3 test.py ../configs/car_cfg.py ../saved_model_vehicle/latest.pth

Acknowledgement

The code is developed based on mmdetection. Some parts of the code are borrowed from SA-SSD, SECOND, and PointRCNN.

Contact

If you have questions, you can contact [email protected].
