Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images (ICCV 2021)

Last update: Dec 05, 2022

Table of Contents

Introduction
Getting Started
- Datasets
- Installation
Experiments

Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images

Recovering the 3D structure of an object from a single image is a challenging task due to its ill-posed nature. One approach is to utilize the plentiful photos of the same object category to learn a strong 3D shape prior for the object. We propose a general framework without symmetry constraint, called LeMul, that effectively Learns from Multi-image datasets for more flexible and reliable unsupervised training of 3D reconstruction networks. It employs loose shape and texture consistency losses based on component swapping across views.
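The idea of a swap-based consistency loss can be illustrated with a toy example. The sketch below is not the paper's implementation: the `Prediction` class, the multiplicative `render` function, and the L1 error are all hypothetical stand-ins, and the toy ignores pose differences between views. It only shows the general pattern: if two views of the same object yield compatible shape and texture estimates, swapping one component between them should still reconstruct the image.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    """Hypothetical per-view network output (toy stand-ins, not the real model)."""
    shape: List[float]    # stand-in for a depth map
    texture: List[float]  # stand-in for an albedo map
    light: float          # stand-in for view-specific lighting


def render(shape: List[float], texture: List[float], light: float) -> List[float]:
    # Toy renderer (assumption): each pixel is light * shape * texture.
    return [light * s * t for s, t in zip(shape, texture)]


def l1(a: List[float], b: List[float]) -> float:
    # Mean absolute error between two toy images.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def swap_losses(image_a: List[float], pred_a: Prediction,
                pred_b: Prediction) -> Tuple[float, float]:
    """Reconstruct view A with one component borrowed from view B."""
    # Shape consistency: B's shape with A's texture and lighting.
    shape_swapped = render(pred_b.shape, pred_a.texture, pred_a.light)
    # Texture consistency: A's shape with B's texture and A's lighting.
    texture_swapped = render(pred_a.shape, pred_b.texture, pred_a.light)
    return l1(image_a, shape_swapped), l1(image_a, texture_swapped)


pred_a = Prediction(shape=[1.0, 2.0], texture=[0.5, 0.5], light=2.0)
image_a = render(pred_a.shape, pred_a.texture, pred_a.light)

# A second view with the same shape and texture but different lighting.
consistent_b = Prediction(shape=[1.0, 2.0], texture=[0.5, 0.5], light=1.0)
# A second view whose shape estimate disagrees with the first.
inconsistent_b = Prediction(shape=[1.0, 3.0], texture=[0.5, 0.5], light=1.0)

print(swap_losses(image_a, pred_a, consistent_b))
print(swap_losses(image_a, pred_a, inconsistent_b))
```

In this toy setting, the consistent pair gives zero for both losses, while the pair with a mismatched shape is penalized only by the shape term.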

Details of the model architecture and experimental results can be found in our paper:

@inproceedings{ho2021lemul,
      title={Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images},
      author={Long-Nhat Ho and Anh Tran and Quynh Phung and Minh Hoai},
      booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
      year={2021}
}

Please cite our paper whenever our model implementation is used to help produce published results or is incorporated into other software.

Getting Started

Datasets

- CelebA face dataset. Please download the original images (img_celeba.7z) from the CelebA website and run celeba_crop.py in data/ to crop the images.
- Synthetic face dataset generated using the Basel Face Model. This can be downloaded using the script download_synface.sh provided in data/.
- Cat face dataset composed of the Cat Head Dataset and the Oxford-IIIT Pet Dataset (license). This can be downloaded using the script download_cat.sh provided in data/.
- CASIA WebFace dataset. You can download the original dataset from backup links such as the Google Drive link on this page. Decompress it and run casia_data_split.py in data/ to reorganize the images.

Please remember to cite the corresponding papers if you use these datasets.

Installation:

# clone the repo
git clone https://github.com/VinAIResearch/LeMul.git
cd LeMul

# install dependencies
conda env create -f environment.yml

Experiments

Training and Testing

Check the configuration files in experiments/ and run the experiments, e.g.:

# Training
python run.py --config experiments/train_multi_CASIA.yml --gpu 0 --num_workers 4

# Testing
python run.py --config experiments/test_multi_CASIA.yml --gpu 0 --num_workers 4

Texture fine-tuning

With collection-style datasets such as CASIA, you can fine-tune the texture estimation network after training. Check the configuration file experiments/finetune_CASIA.yml as an example. You can run it with the command:

python run.py --config experiments/finetune_CASIA.yml --gpu 0 --num_workers 4
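The training, testing, and fine-tuning commands above share the same flags and differ only in the config file. A minimal sketch of a helper that assembles them is shown below. The function names `build_run_command` and `plan` are hypothetical and not part of the repository; only the `run.py` flags and config paths are taken from the commands above. The script prints the commands as a dry run and does not execute anything.

```python
import shlex
from typing import List, Sequence

# Config files named in this README, in the order they are listed.
STAGES = [
    "experiments/train_multi_CASIA.yml",
    "experiments/test_multi_CASIA.yml",
    "experiments/finetune_CASIA.yml",
]


def build_run_command(config: str, gpu: int = 0, num_workers: int = 4) -> List[str]:
    """Return the argument list for one `run.py` invocation (hypothetical helper)."""
    return [
        "python", "run.py",
        "--config", config,
        "--gpu", str(gpu),
        "--num_workers", str(num_workers),
    ]


def plan(configs: Sequence[str] = STAGES, gpu: int = 0, num_workers: int = 4) -> List[str]:
    """Return shell-quoted command lines, one per config (dry run only)."""
    return [shlex.join(build_run_command(c, gpu, num_workers)) for c in configs]


for line in plan():
    print(line)
```

To actually launch a stage, the argument list from `build_run_command` could be passed to `subprocess.run(cmd, check=True)` from the repository root.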

Pretrained Models

Pretrained models can be found here: Google Drive. Please download them and place them in the ./pretrained folder.
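Before running the demo, it can help to confirm that the downloaded files landed in the expected folder. The sketch below assumes checkpoints use the `.pth` extension, as in `casia_checkpoint028.pth` from the demo command; `list_checkpoints` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path
from typing import List


def list_checkpoints(folder: str = "pretrained") -> List[str]:
    """Return the sorted names of .pth files in `folder` (empty if it is missing)."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob("*.pth"))


found = list_checkpoints()
if found:
    print("Available checkpoints:", ", ".join(found))
else:
    print("No .pth files found in ./pretrained")
```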

Demo

After downloading the pretrained models and preparing an input image folder, you can run the demo, e.g.:

python demo/demo.py --input demo/human_face_cropped --result demo/human_face_results --checkpoint pretrained/casia_checkpoint028.pth

Options:

--config path-to-training-config-file.yml: input the config file used in training (recommended)
--detect_human_face: enable automatic human face detection and cropping using MTCNN. You need to install facenet-pytorch before using this option. It only works on human face images.
--gpu: enable GPU
--render_video: render 3D animations using neural_renderer (GPU is required)

To replicate the results reported in the paper with the model pretrained on the CASIA dataset, use the --detect_human_face option for images in the demo/images/human_face folder and omit it for images in demo/images/human_face_cropped.
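The demo options can be combined programmatically. The following sketch builds the `demo/demo.py` command line from the flags listed above. The helper names are hypothetical, and the rule that folders ending in `_cropped` skip face detection is an assumption based only on the two folder names mentioned in this README.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def needs_face_detection(input_dir: str) -> bool:
    # Assumed convention: folders named "*_cropped" already hold cropped faces.
    return not PurePosixPath(input_dir).name.endswith("_cropped")


def build_demo_command(input_dir: str, result_dir: str, checkpoint: str,
                       config: Optional[str] = None,
                       detect_human_face: bool = False,
                       gpu: bool = False,
                       render_video: bool = False) -> List[str]:
    """Assemble the demo command from the options documented above."""
    if render_video and not gpu:
        # The README states that rendering videos requires a GPU.
        raise ValueError("--render_video requires --gpu")
    cmd = ["python", "demo/demo.py",
           "--input", input_dir,
           "--result", result_dir,
           "--checkpoint", checkpoint]
    if config is not None:
        cmd += ["--config", config]
    if detect_human_face:
        cmd.append("--detect_human_face")
    if gpu:
        cmd.append("--gpu")
    if render_video:
        cmd.append("--render_video")
    return cmd


folder = "demo/images/human_face"
print(" ".join(build_demo_command(
    folder, "demo/human_face_results", "pretrained/casia_checkpoint028.pth",
    detect_human_face=needs_face_detection(folder))))
```

Raising an error for `--render_video` without `--gpu` surfaces the GPU requirement before the demo starts rather than partway through rendering.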

Owner

VinAI Research