Official PyTorch Code of GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection (CVPR 2021)

Last update: Jan 02, 2023

Overview

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection, CVPR 2021

Abhinav Kumar, Garrick Brazil, Xiaoming Liu

[project], [supp], [slides], [1min_talk], demo, arxiv

This code is based on Kinematic-3D, so the setup and organization are very similar. A few components, such as classical NMS, are based on Caffe.

References

Please cite the following paper if you find this repository useful:

@inproceedings{kumar2021groomed,
  title={{GrooMeD-NMS}: Grouped Mathematically Differentiable NMS for Monocular {$3$D} Object Detection},
  author={Kumar, Abhinav and Brazil, Garrick and Liu, Xiaoming},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Setup

Requirements
1. Python 3.6
2. PyTorch 0.4.1
3. Torchvision 0.2.1
4. CUDA 8.0
5. Ubuntu 18.04/Debian 8.9

The code was tested on an NVIDIA 1080 Ti GPU; other platforms have not been tested. Unless otherwise stated, the scripts and instructions below assume that the working directory is the project root.
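As a quick sanity check before building anything, the installed versions can be compared against the list above. The following is a minimal sketch, not part of the repository: the helper names `mismatches` and `detect` are hypothetical, and the check accepts any patch release of a required version (for example, `3.6.13` for `3.6`).

```python
import platform

# Versions listed in the Requirements above.
REQUIRED = {"python": "3.6", "torch": "0.4.1", "torchvision": "0.2.1", "cuda": "8.0"}


def mismatches(found, required=REQUIRED):
    """Return (name, found, required) for every component that does not match."""
    problems = []
    for name, want in required.items():
        have = found.get(name)
        # Accept an exact match or a longer version with the same prefix.
        if have is None or not (have == want or have.startswith(want + ".")):
            problems.append((name, have, want))
    return problems


def detect():
    """Collect installed versions; packages that are not installed are left out."""
    found = {"python": platform.python_version()}
    try:
        import torch
        found["torch"] = torch.__version__
        found["cuda"] = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        pass
    try:
        import torchvision
        found["torchvision"] = torchvision.__version__
    except ImportError:
        pass
    return found


if __name__ == "__main__":
    for name, have, want in mismatches(detect()):
        print(f"{name}: found {have}, expected {want}")
```

An empty output means every listed component matches; the operating system is not checked.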

Clone the repo first:
```
git clone https://github.com/abhi1kumar/groomed_nms.git
```

CUDA & Python

Install some basic packages:

```
sudo apt-get install libopenblas-dev libboost-dev libboost-all-dev git
sudo apt install gfortran

# CUDA 8.0 needs an older version of gcc and g++
sudo apt install gcc-5 g++-5
sudo ln -f -s /usr/bin/gcc-5 /usr/local/cuda-8.0/bin/gcc
sudo ln -s /usr/bin/g++-5 /usr/local/cuda-8.0/bin/g++
```

Next, install conda and then install the required packages:

```
wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
bash Anaconda3-2020.02-Linux-x86_64.sh
source ~/.bashrc
conda list
conda create --name py36 --file dependencies/conda.txt
conda activate py36
```

KITTI Data

Download the following files of the full KITTI 3D Object Detection dataset:

left color images of object data set (12 GB)
camera calibration matrices of object data set (16 MB)
training labels of object data set (5 MB)

Then place a soft-link (or the actual data) in data/kitti:

```
ln -s /path/to/kitti data/kitti
```

The directory structure should look like this:

```
./groomed_nms
|--- cuda_env
|--- data
|      |---kitti
|            |---training
|            |        |---calib
|            |        |---image_2
|            |        |---label_2
|            |
|            |---testing
|                     |---calib
|                     |---image_2
|
|--- dependencies
|--- lib
|--- models
|--- scripts
```
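A short check like the one below can confirm that the KITTI folders are in place before the split scripts are run. It is only a sketch: the helper name `missing_dirs` is hypothetical, and it checks only the sub-directories of `data/kitti` shown in the tree above.

```python
from pathlib import Path

# Sub-directories of data/kitti shown in the tree above.
EXPECTED = [
    "training/calib",
    "training/image_2",
    "training/label_2",
    "testing/calib",
    "testing/image_2",
]


def missing_dirs(kitti_root):
    """Return the expected sub-directories that do not exist under kitti_root."""
    root = Path(kitti_root)
    # Path.is_dir() follows symlinks, so a soft-linked data/kitti works too.
    return [rel for rel in EXPECTED if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("data/kitti")
    print("KITTI layout looks complete" if not missing else f"Missing: {missing}")
```

Run it from the project root so that the relative path `data/kitti` resolves correctly.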

Then, use the following scripts to extract the data splits, which use soft-links to the above directory for efficient storage:

```
python data/kitti_split1/setup_split.py
python data/kitti_split2/setup_split.py
```
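To illustrate why soft-links keep the splits cheap on disk, here is a generic sketch of the idea. It is not the repository's `setup_split.py`: the function `link_split`, its arguments, and the file naming are assumptions made for this example. Each split folder holds only links that point back to the original KITTI files, so no image is copied.

```python
import os
from pathlib import Path


def link_split(src_dir, dst_dir, ids, ext):
    """Soft-link src_dir/<id><ext> into dst_dir for every id.

    Returns the number of links created. Missing source files and links
    left by an earlier run are skipped.
    """
    src, dst = Path(src_dir).resolve(), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    created = 0
    for sample_id in ids:
        target = src / f"{sample_id}{ext}"
        link = dst / target.name
        if not target.exists() or link.is_symlink() or link.exists():
            continue
        os.symlink(target, link)  # the link stores only the path, not the data
        created += 1
    return created
```

Because existing links are skipped, calling the function a second time with the same ids creates nothing new.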

Next, build the KITTI devkit eval:

```
sh data/kitti_split1/devkit/cpp/build.sh
```

Classical NMS

Lastly, build the classical NMS modules:
```
cd lib/nms
make
cd ../..
```
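For readers unfamiliar with the term, classical NMS is the standard greedy procedure: sort boxes by score, keep the best one, and discard every remaining box that overlaps a kept box too much. The pure-Python sketch below illustrates that textbook procedure for 2D boxes; it is not the compiled module built in `lib/nms`, and the function names and the default threshold are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))  # the second box overlaps the first and is dropped
```

The hard keep-or-discard decision in the loop is the step that is not differentiable, which is the gap between training and inference that the paper's title refers to.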

Training

Training is carried out in two stages: a warmup stage and a full training stage. Review the configurations in scripts/config for details.

```
chmod +x scripts_training.sh
./scripts_training.sh
```

If training stops unexpectedly, you can resume from a saved snapshot with the restore flag. For example, to resume training from iteration 10k, use the following command:

```
source dependencies/cuda_8.0_env
CUDA_VISIBLE_DEVICES=0 python -u scripts/train_rpn_3d.py --config=groumd_nms --restore=10000
```

Testing

Logs, models, and predictions for the main experiments on the KITTI Val 1, Val 2, and Test data splits are available to download here.

Make an output folder in the project directory:

```
mkdir output
```

Place the downloaded models in the output folder as follows:

```
./groomed_nms
|--- output
|      |---groumd_nms
|      |
|      |---groumd_nms_split2
|      |
|      |---groumd_nms_full_train_2
|
| ...
```

To test, run the evaluation script as below:

```
chmod +x scripts_evaluation.sh
./scripts_evaluation.sh
```

Contact

For questions, feel free to post here or send an email to this address: [email protected]

Releases

v0.1 (Mar 30, 2021): First release of GrooMeD-NMS
