RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching

Last update: Jan 09, 2023

This repository contains the source code for our paper:

RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching
Lahav Lipson, Zachary Teed and Jia Deng

@article{lipson2021raft,
  title={{RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching}},
  author={Lipson, Lahav and Teed, Zachary and Deng, Jia},
  journal={arXiv preprint arXiv:2109.07547},
  year={2021}
}

Requirements

The code has been tested with PyTorch 1.7 and CUDA 10.2.

conda env create -f environment.yaml
conda activate raftstereo

Required Data

To evaluate or train RAFT-Stereo, you will need to download the required datasets.

Sceneflow (includes FlyingThings3D, Driving & Monkaa)
Middlebury
ETH3D
KITTI

To download the ETH3D and Middlebury test datasets for the demos, run

chmod ug+x download_datasets.sh && ./download_datasets.sh

By default, stereo_datasets.py searches for the datasets in the locations shown below. If the datasets were downloaded elsewhere, you can create symbolic links inside the datasets folder that point to the actual download locations.

├── datasets
    ├── FlyingThings3D
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── Monkaa
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── Driving
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── testing
        ├── training
        ├── devkit
    ├── Middlebury
        ├── MiddEval3
    ├── ETH3D
        ├── lakeside_1l
        ├── ...
        ├── tunnel_3s
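
As an illustration of the symbolic-link approach, here is a minimal sketch using only the Python standard library. The helper name link_dataset and the example path are hypothetical and not part of this repository. It assumes the link should carry the same name as the downloaded folder unless another name is given.

```python
import os
from pathlib import Path


def link_dataset(downloaded_dir, datasets_root="datasets", name=None):
    """Create datasets_root/<name> as a symlink pointing at downloaded_dir.

    Hypothetical helper: an existing entry with the same name is left untouched.
    """
    source = Path(downloaded_dir).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")

    root = Path(datasets_root)
    root.mkdir(parents=True, exist_ok=True)

    link = root / (name or source.name)
    if link.is_symlink() or link.exists():
        return link  # do not overwrite something that is already there

    os.symlink(source, link, target_is_directory=True)
    return link
```

For example, link_dataset("/data/stereo/Middlebury") would create datasets/Middlebury pointing at that (hypothetical) download location.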

Demos

Pretrained models can be downloaded by running

chmod ug+x download_models.sh && ./download_models.sh

or downloaded from Google Drive.

You can demo a trained model on pairs of images. To predict disparity for Middlebury, run

python demo.py --restore_ckpt models/raftstereo-sceneflow.pth

Or for ETH3D:

python demo.py --restore_ckpt models/raftstereo-eth3d.pth -l=datasets/ETH3D/*/im0.png -r=datasets/ETH3D/*/im1.png
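
The -l and -r arguments above are glob patterns, so it can help to check that both patterns match the same number of files before running the demo. The sketch below is a standalone check, not code from demo.py. The function name pair_stereo_images is hypothetical, and it assumes left and right images are paired by sorted order.

```python
import glob


def pair_stereo_images(left_pattern, right_pattern):
    """Expand two glob patterns and pair the matches by sorted order.

    Assumption: the i-th left match belongs with the i-th right match.
    """
    left = sorted(glob.glob(left_pattern, recursive=True))
    right = sorted(glob.glob(right_pattern, recursive=True))
    if len(left) != len(right):
        raise ValueError(
            f"{len(left)} left images but {len(right)} right images matched"
        )
    return list(zip(left, right))


if __name__ == "__main__":
    pairs = pair_stereo_images(
        "datasets/ETH3D/*/im0.png", "datasets/ETH3D/*/im1.png"
    )
    print(f"Found {len(pairs)} stereo pairs")
```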

Using our fastest model:

python demo.py --restore_ckpt models/raftstereo-realtime.pth  --shared_backbone --n_downsample 3 --n_gru_layers 2 --slow_fast_gru

To save the disparity values as .npy files, run any of the demos with the --save_numpy flag.

Converting Disparity to Depth

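If the camera focal length and camera baseline are known, disparity predictions can be converted to depth values using

depth = (focal length × baseline) / disparity

Note that the focal length must be in pixels, not millimeters. The resulting depth is expressed in the same units as the baseline.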
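
The conversion can be sketched in Python as follows, assuming the standard rectified-stereo relation depth = focal length × baseline / disparity. The function name disparity_to_depth and the numbers in the usage example are illustrative assumptions, not values from this repository. The sketch also assumes the disparity is given as a positive pixel offset.

```python
def disparity_to_depth(disparity, focal_length_px, baseline):
    """Convert one disparity value (pixels) to depth.

    focal_length_px is in pixels; the result has the units of baseline.
    Assumption: disparity is a positive pixel offset.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline / disparity


if __name__ == "__main__":
    # Hypothetical camera: 1000 px focal length, 0.5 m baseline.
    print(disparity_to_depth(100.0, 1000.0, 0.5))  # 5.0 (metres)
```

The same expression applies element-wise to a disparity array saved with --save_numpy, provided invalid (zero) disparities are masked out first.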

Evaluation

To evaluate a trained model on a validation set (e.g. Middlebury), run

python evaluate_stereo.py --restore_ckpt models/raftstereo-middlebury.pth --dataset middlebury_H

Training

Our model is trained on two RTX-6000 GPUs using the following command. Training logs are written to runs/ and can be visualized with TensorBoard.

python train_stereo.py --batch_size 8 --train_iters 22 --valid_iters 32 --spatial_scale -0.2 0.4 --saturation_range 0 1.4 --n_downsample 2 --num_steps 200000 --mixed_precision

To train using significantly less memory, change --n_downsample 2 to --n_downsample 3. This will slightly reduce accuracy.
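
To give a rough sense of why the downsampling factor matters, the sketch below counts the entries of a single-level correlation volume. It rests on two assumptions that are not stated in this README: that features are computed at 1/2^n_downsample of the input resolution, and that the volume stores one value per pixel and candidate horizontal match (height × width × width at feature resolution). The function name and image size are hypothetical, and real memory use also depends on other parts of the model.

```python
def corr_volume_elements(height, width, n_downsample):
    """Count entries of an (H/s) x (W/s) x (W/s) volume, with s = 2**n_downsample."""
    scale = 2 ** n_downsample
    h, w = height // scale, width // scale
    return h * w * w


if __name__ == "__main__":
    # Hypothetical 512 x 1024 input image.
    at_2 = corr_volume_elements(512, 1024, 2)
    at_3 = corr_volume_elements(512, 1024, 3)
    print(at_2, at_3, at_2 // at_3)  # 8388608 1048576 8
```

Under these assumptions, each extra level of downsampling shrinks the volume by a factor of eight.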

(Optional) Faster Implementation

We provide a faster CUDA implementation of the correlation volume which works with mixed precision feature maps.

cd sampler && python setup.py install && cd ..

Running demo.py, train_stereo.py or evaluate_stereo.py with --corr_implementation reg_cuda together with --mixed_precision will speed up the model without affecting accuracy.

To significantly decrease memory consumption on high resolution images, use --corr_implementation alt. This implementation is slower than the default, however.

Owner

Princeton Vision & Learning Lab
