Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Last update: Dec 26, 2022

Overview

Bidirectional Projection Network for Cross Dimension Scene Understanding

CVPR 2021 (Oral)

Existing segmentation methods are mostly unidirectional, i.e., they use 3D information for 2D segmentation or vice versa. However, 2D and 3D information can complement each other in both directions during segmentation. Exploiting this is the goal of the bidirectional projection network.

Environment

Main

# Torch
$ pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
# MinkowskiEngine 0.4.1
$ conda install numpy openblas
$ git clone https://github.com/StanfordVL/MinkowskiEngine.git
$ cd MinkowskiEngine
$ git checkout f1a419cc5792562a06df9e1da686b7ce8f3bb5ad
$ python setup.py install
# Others
$ pip install imageio==2.8.0 opencv-python==4.2.0.32 pillow==7.0.0 pyyaml==5.3 scipy==1.4.1 sharedarray==3.2.0 tensorboardx==2.0 tqdm==4.42.1

Others

Please refer to env.yml for details.
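
After installing, it can help to confirm that the environment matches the pins in the pip commands above. The following is a minimal sketch, not part of this repo: the helper names `installed_version` and `find_mismatches` are hypothetical, it assumes Python 3.8+ (for `importlib.metadata`), and it assumes the distribution names are registered exactly as spelled in the pip commands.

```python
from importlib import metadata

# Pins copied from the pip commands above; local build tags such as
# "+cu100" are dropped before comparing.
PINNED = {
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "imageio": "2.8.0",
    "opencv-python": "4.2.0.32",
    "pillow": "7.0.0",
    "pyyaml": "5.3",
    "scipy": "1.4.1",
    "sharedarray": "3.2.0",
    "tensorboardx": "2.0",
    "tqdm": "4.42.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a (pinned, installed) tuple."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Ignore a local build tag like "+cu100" when comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `find_mismatches(PINNED)` returns an empty dict when every pinned package is installed at the expected version. MinkowskiEngine is built from a specific commit rather than a release, so it is left out of this check.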

Prepare data

Download the ScanNet dataset from the official website.
2D: The script comes from the 3DMV repo and is based on Python 2; all other code in this repo is based on Python 3.
python prepare_2d_data.py --scannet_path data/scannetv2 --output_path data/scannetv2_images --export_label_images
3D: dataset/preprocess_3d_scannet.py

Config

BPNet_5cm: config/scannet/bpnet_5cm.yaml

Training

Download the 2D ResNets pretrained on ImageNet from the PyTorch website and put them into the initmodel folder.

model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
}
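
The checkpoints listed above can also be fetched with a short script. This is a sketch under assumptions: the helper names `target_path` and `download_all` are hypothetical, and it assumes the training code accepts the files under their original names inside `initmodel`, which this README does not state.

```python
import os
import urllib.request

# URLs copied from the list above.
model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
}


def target_path(url, folder="initmodel"):
    """Return the local path for a checkpoint, keeping the remote file name."""
    return os.path.join(folder, url.rsplit("/", 1)[-1])


def download_all(urls, folder="initmodel"):
    """Download each checkpoint that is not already present in `folder`."""
    os.makedirs(folder, exist_ok=True)
    for name, url in urls.items():
        path = target_path(url, folder)
        if os.path.exists(path):
            print(f"{name}: already at {path}")
            continue
        print(f"{name}: downloading to {path}")
        urllib.request.urlretrieve(url, path)
```

Run `download_all(model_urls)` from the repo root to fill the folder. To fetch a single backbone, pass a one-entry dict such as `{'resnet34': model_urls['resnet34']}`.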

Start training: sh tool/train.sh EXP_NAME /PATH/TO/CONFIG NUMBER_OF_THREADS
Resume: sh tool/resume.sh EXP_NAME /PATH/TO/CONFIG(copied one) NUMBER_OF_THREADS

NUMBER_OF_THREADS is the number of threads to use per process (one process per GPU); optimally, it should be Total_threads / gpu_number_used.
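
As a worked example of the rule above, the value can be computed as follows. This is a minimal sketch; the function name `threads_per_process` is hypothetical, and rounding down with a floor of one thread is an assumption about how to handle uneven splits.

```python
import os


def threads_per_process(total_threads, gpus_used):
    """Split the available threads evenly across one process per GPU."""
    if gpus_used < 1:
        raise ValueError("gpus_used must be at least 1")
    # Integer division, never returning fewer than one thread.
    return max(1, total_threads // gpus_used)
```

For example, `threads_per_process(os.cpu_count() or 1, 4)` gives the value to pass as NUMBER_OF_THREADS when training on 4 GPUs; a machine with 32 threads yields 8.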

Testing

Test using your trained model or our pre-trained model (voxel_size: 5cm): sh tool/test.sh EXP_NAME /PATH/TO/CONFIG(copied one) NUMBER_OF_THREADS

Copyright and License

You are granted the LICENSE for both academic and commercial use.

Acknowledgment

Our code is based on MinkowskiEngine. We also referred to SparseConvNet and semseg.

Citation

@inproceedings{hu-2021-bidirectional,
    author      = {Wenbo Hu and Hengshuang Zhao and Li Jiang and Jiaya Jia and Tien-Tsin Wong},
    title       = {Bidirectional Projection Network for Cross Dimension Scene Understanding},
    booktitle   = {CVPR},
    year        = {2021}
}

Owner

Hu Wenbo
