CoReNet

CoReNet is a technique for joint multi-object 3D reconstruction from a single RGB image. It produces coherent reconstructions, where all objects live in a single consistent 3D coordinate frame relative to the camera, and they do not intersect in 3D. You can find more information in the following paper: CoReNet: Coherent 3D scene reconstruction from a single RGB image.

This repository contains source code, dataset pointers, and instructions for reproducing the results in the paper. If you find our code, data, or the paper useful, please consider citing:

@InProceedings{popov20eccv,
  title="CoReNet: Coherent 3D Scene Reconstruction from a Single RGB Image",
  author="Popov, Stefan and Bauszat, Pablo and Ferrari, Vittorio", 
  booktitle="Computer Vision -- ECCV 2020",
  year="2020",
  doi="10.1007/978-3-030-58536-5_22"
}

Installation

The code in this repository has been verified to work on Ubuntu 18.04 with the following dependencies:

# General APT packages
sudo apt install \
  python3-pip python3-virtualenv python python3.8-dev g++-8 \
  ninja-build git libboost-container-dev unzip

# NVIDIA related packages
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 /"
sudo apt install \
    nvidia-driver-455 nvidia-utils-455 `#driver, CUDA+GL libraries, utils` \
    cuda-runtime-10-1 cuda-toolkit-10-2 libcudnn7 `# Cuda and CUDNN`

To install CoReNet, you need to clone the code from GitHub and create a python virtual environment.

# Clone CoReNet
mkdir -p ~/prj/corenet
cd ~/prj/corenet
git clone https://github.com/google-research/corenet.git .

# Setup a python virtual environment
python3.8 -m virtualenv --python=/usr/bin/python3.8 venv_38
. venv_38/bin/activate
pip install -r requirements.txt

All instructions below assume that CoReNet lives in ~/prj/corenet, that this is the current working directory, and that the virtual environment is activated. You can also run CoReNet using the supplied docker file: ~/prj/corenet/Dockerfile.

Datasets

The CoReNet paper introduced several datasets with synthetic scenes. To reproduce the experiments in the paper you need to download them, using:

cd ~/prj/corenet
mkdir -p ~/prj/corenet/data/raw
for n in single pairs triplets; do  
  for s in train val test; do
    wget "https://storage.googleapis.com/gresearch/corenet/${n}.${s}.tar" \
      -O "data/raw/${n}.${s}.tar" 
    tar -xvf "data/raw/${n}.${s}.tar" -C data/ 
  done 
done
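
Before unpacking further, it can help to confirm that all nine archives arrived. The following is a minimal sketch, not part of the repository: the helper names `expected_archives` and `missing_archives` are hypothetical, and the check assumes the `data/raw` layout used by the loop above. It also flags empty files, since `wget -O` can leave one behind when a download fails.

```python
from pathlib import Path

# Dataset and split names taken from the download loop above.
DATASETS = ("single", "pairs", "triplets")
SPLITS = ("train", "val", "test")


def expected_archives():
    """Return the nine archive names fetched by the download loop."""
    return [f"{n}.{s}.tar" for n in DATASETS for s in SPLITS]


def missing_archives(raw_dir):
    """List expected archives that are absent or empty in raw_dir."""
    raw = Path(raw_dir)
    missing = []
    for name in expected_archives():
        path = raw / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Run from ~/prj/corenet, matching the working directory assumed above.
    for name in missing_archives("data/raw"):
        print(f"missing or empty: {name}")
```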

For each scene, these datasets provide the object placement, a good viewpoint, and two images rendered from it with varying degrees of realism. To obtain the actual object geometry, you need to download ShapeNetCore.v2.zip from ShapeNet's original site, unpack it, and convert the 3D meshes to CoReNet's binary format:

echo "Please download ShapeNetCore.v2.zip from ShapeNet's original site and "
echo "place it in ~/prj/corenet/data/raw/ before running the commands below"

cd ~/prj/corenet
unzip data/raw/ShapeNetCore.v2.zip -d data/raw/
PYTHONPATH=src python -m preprocess_shapenet \
  --shapenet_root=data/raw/ShapeNetCore.v2 \
  --output_root=data/shapenet_meshes

Models from the paper

To help reproduce the results from the CoReNet paper, we offer 5 pre-trained models from it (h5, h7, m7, m9, and y1; details below and in the paper). You can download and unpack these using:

cd ~/prj/corenet
wget https://storage.googleapis.com/gresearch/corenet/paper_tf_models.tgz \
  -O data/raw/paper_tf_models.tgz
tar xzvf data/raw/paper_tf_models.tgz -C data/

You can evaluate the downloaded models against their respective test sets using:

MODEL=h7  # Set to one of: h5, h7, m7, m9, y1

cd ~/prj/corenet
ulimit -n 4096
OMP_NUM_THREADS=2 CUDA_HOME=/usr/local/cuda-10.2 PYTHONPATH=src \
TF_CPP_MIN_LOG_LEVEL=1 PATH="${PATH}:${CUDA_HOME}/bin" \
FILL_VOXELS_CUDA_FLAGS=-ccbin=/usr/bin/gcc-8 \
python -m dist_launch --nproc_per_node=1 \
tf_model_eval --config_path=configs/paper_tf_models/${MODEL}.json5

To run on multiple GPUs in parallel, set --nproc_per_node to the desired number of GPUs. You can use CUDA_VISIBLE_DEVICES to control exactly which GPUs are used. CUDA_HOME, PATH, and FILL_VOXELS_CUDA_FLAGS control the just-in-time compiler for the voxelization operation.
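
If you want to script evaluation over several models or GPU sets, the environment and arguments above can be assembled programmatically. This is only a sketch under stated assumptions: `build_eval_command` is a hypothetical helper, not part of CoReNet, and it simply mirrors the shell command shown above. It does not apply `ulimit -n 4096`, which still has to be set in the launching shell.

```python
import os


def build_eval_command(model, gpu_ids, cuda_home="/usr/local/cuda-10.2"):
    """Return (env, argv) mirroring the evaluation command above.

    Hypothetical helper: one process is launched per listed GPU.
    """
    if not gpu_ids:
        raise ValueError("need at least one GPU id")
    env = dict(os.environ)
    env.update({
        "OMP_NUM_THREADS": "2",
        "CUDA_HOME": cuda_home,
        "PYTHONPATH": "src",
        "TF_CPP_MIN_LOG_LEVEL": "1",
        # Append the CUDA compiler directory, as in the shell command.
        "PATH": env.get("PATH", "") + ":" + cuda_home + "/bin",
        "FILL_VOXELS_CUDA_FLAGS": "-ccbin=/usr/bin/gcc-8",
        # Restrict the run to exactly the requested GPUs.
        "CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids),
    })
    argv = [
        "python", "-m", "dist_launch",
        f"--nproc_per_node={len(gpu_ids)}",
        "tf_model_eval",
        f"--config_path=configs/paper_tf_models/{model}.json5",
    ]
    return env, argv
```

The result can be passed to `subprocess.run(argv, env=env, cwd=...)` with the CoReNet checkout as the working directory and the virtual environment activated.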

Upon completion, quantitative results will be stored in ~/prj/corenet/output/paper_tf_models/${MODEL}/voxel_metrics.csv. Qualitative results will be available in ~/prj/corenet/output/paper_tf_models/${MODEL}/ in the form of PNG files.

This table summarizes the model attributes and their performance. More details can be found in the paper.

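| model | dataset  | realism | native resolution | mean IoU |
|-------|----------|---------|-------------------|----------|
| h5    | single   | low     | 128 x 128 x 128   | 57.9%    |
| h7    | single   | high    | 128 x 128 x 128   | 59.1%    |
| y1    | single   | low     | 32 x 32 x 32      | 53.3%    |
| m7    | pairs    | high    | 128 x 128 x 128   | 43.1%    |
| m9    | triplets | high    | 128 x 128 x 128   | 43.9%    |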

Note that all models are evaluated on a grid resolution of 128 x 128 x 128, independent of their native resolution (see section 3.5 in the paper). The performance computed with this code matches the one reported in the paper for h5, h7, m7, and m9. For y1, the performance here is slightly higher (+0.2% IoU), as we no longer have the exact checkpoint used in the paper.
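
For readers unfamiliar with the metric, the sketch below shows intersection-over-union on occupancy grids in plain Python. It is an illustration only, not the repository's evaluation code: grids are assumed to be flattened into equally long 0/1 sequences, the function names are hypothetical, and returning 1.0 for two empty grids is a convention chosen here. How the paper averages IoU over classes and scenes is described in the paper itself.

```python
def voxel_iou(pred, truth):
    """IoU of two equally sized occupancy grids given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("grids must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumption: two empty grids count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over an iterable of (pred, truth) grid pairs."""
    scores = [voxel_iou(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # One shared voxel out of three occupied in either grid.
    print(voxel_iou([1, 1, 0, 0], [1, 0, 1, 0]))
```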

You can also run these models on individual images interactively, using the corenet_demo.ipynb notebook. For this, you need to also pip install jupyter-notebook in your virtual environment.

Training and evaluating a new model

We offer PyTorch code for training and evaluating models. To train a model, you need to (once) import the starting ResNet50 checkpoint:

cd ~/prj/corenet
PYTHONPATH=src python -m import_resnet50_checkpoint

Then run:

MODEL=h7  # Set to one of: h5, h7, m7, m9 

cd ~/prj/corenet
ulimit -n 4096
OMP_NUM_THREADS=2 CUDA_HOME=/usr/local/cuda-10.2 PYTHONPATH=src \
TF_CPP_MIN_LOG_LEVEL=1 PATH="${PATH}:${CUDA_HOME}/bin" \
FILL_VOXELS_CUDA_FLAGS=-ccbin=/usr/bin/gcc-8 \
python -m dist_launch --nproc_per_node=1 \
train --config_path=configs/models/${MODEL}.json5

Again, use --nproc_per_node and CUDA_VISIBLE_DEVICES to control parallel execution on multiple GPUs. CUDA_HOME, PATH, and FILL_VOXELS_CUDA_FLAGS control just-in-time compilation.

You can also evaluate individual checkpoints, for example:

cd ~/prj/corenet
ulimit -n 4096
OMP_NUM_THREADS=2 CUDA_HOME=/usr/local/cuda-10.2 PYTHONPATH=src \
TF_CPP_MIN_LOG_LEVEL=1 PATH="${PATH}:${CUDA_HOME}/bin" \
FILL_VOXELS_CUDA_FLAGS=-ccbin=/usr/bin/gcc-8 \
python -m dist_launch --nproc_per_node=1 eval \
  --cpt_path=output/models/h7/cpt/persistent/state_000000000.cpt \
  --output_path=output/eval_cpt_example \
  --eval_names_regex="short.*" \
  -jq '(.. | .config? | select(.num_qualitative_results != null) | .num_qualitative_results) |= 4'

The -jq option limits the number of qualitative results to 4 (see also the Further details section).

We currently offer checkpoints trained with this code for models h5, h7, m7, and m9, in this .tgz. These checkpoints achieve slightly better performance than the paper (see table below). This is likely due to a different distributed training strategy (synchronous here vs. asynchronous in the paper) and a different ML framework (PyTorch vs. TensorFlow in the paper).

|          | h5    | h7    | m7    | m9    |
|----------|-------|-------|-------|-------|
| mean IoU | 60.2% | 61.6% | 45.0% | 46.9% |

Further details

Configuration files

The evaluation and training scripts are configured using JSON5 files that map to the TfModelEvalPipeline and TrainPipeline dataclasses in src/corenet/configuration.py. You can find descriptions of the different configuration options in the code comments, starting from these two classes.

You can also modify the configuration on the fly through jq queries, as well as through defines that change entries in the string_templates section. For example, the following options change the number of workers and the prefetch factor of the data loaders, as well as the location of the data and output directories:

... \
-jq "'(.. | .data_loader? | select(. != null) | .num_data_workers) |= 12'" \
    "'(.. | .data_loader? | select(. != null) | .prefetch_factor) |= 4'" \
-D 'data_dir=gs://some_gcs_bucket/data' \
   'output_dir=gs://some_gcs_bucket/output/models'
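
To make the effect of such a jq query concrete, here is a rough Python equivalent of `(.. | .data_loader? | select(. != null) | .num_data_workers) |= 12`. It is a sketch for intuition only: `set_in_all` is a hypothetical helper, the nested dictionary is a made-up stand-in for a parsed configuration, and CoReNet itself applies the real jq query.

```python
def set_in_all(config, parent_key, field, value):
    """Set field=value inside every object stored under parent_key.

    Walks nested dicts and lists, like jq's recursive descent (..),
    and skips entries where parent_key is missing or null.
    """
    if isinstance(config, dict):
        child = config.get(parent_key)
        if isinstance(child, dict):
            child[field] = value
        for item in config.values():
            set_in_all(item, parent_key, field, value)
    elif isinstance(config, list):
        for item in config:
            set_in_all(item, parent_key, field, value)
    return config


if __name__ == "__main__":
    # Hypothetical configuration shape, used only for illustration.
    config = {
        "train": {"data_loader": {"num_data_workers": 6, "prefetch_factor": 2}},
        "eval": [{"data_loader": {"num_data_workers": 6}}, {"data_loader": None}],
    }
    set_in_all(config, "data_loader", "num_data_workers", 12)
    print(config)
```

Every non-null `data_loader` object is updated wherever it appears in the tree, which is why a single query can reconfigure the training loader and all evaluation loaders at once.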

Dataset statistics

The table below summarizes the number of scenes in each dataset:

|       | single | pairs  | triplets |
|-------|--------|--------|----------|
| train | 883084 | 319981 | 80000    |
| val   | 127286 | 45600  | 11400    |
| test  | 246498 | 91194  | 22798    |
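
The per-dataset totals and split shares follow directly from these counts. The short sketch below only redoes that arithmetic; the numbers are copied from the table, and `split_summary` is a hypothetical helper name.

```python
# Scene counts copied from the dataset statistics table.
SCENES = {
    "single": {"train": 883084, "val": 127286, "test": 246498},
    "pairs": {"train": 319981, "val": 45600, "test": 91194},
    "triplets": {"train": 80000, "val": 11400, "test": 22798},
}


def split_summary(counts):
    """Return the total scene count and each split's share of it."""
    total = sum(counts.values())
    return total, {split: n / total for split, n in counts.items()}


if __name__ == "__main__":
    for name, counts in SCENES.items():
        total, shares = split_summary(counts)
        print(name, total, {s: f"{v:.1%}" for s, v in shares.items()})
```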

Licenses

The code and the checkpoints are released under the Apache 2.0 License. The datasets, the documentation, and the configuration files are licensed under the Creative Commons Attribution 4.0 International License.
