Datasets for new state-of-the-art challenge in disentanglement learning

Overview

High resolution disentanglement datasets

This repository contains the Falcor3D and Isaac3D datasets. Compared with existing disentanglement datasets, they pose a harder challenge for controllable generation, with higher image resolution, greater photorealism, and a richer set of style factors.

Falcor3D

The Falcor3D dataset consists of 233,280 images of a 3D living-room scene, each with a resolution of 1024x1024. The meta code corresponds to all possible combinations of 7 factors of variation (5 × 6^6 = 233,280 combinations):

  • lighting_intensity (5)
  • lighting_x-dir (6)
  • lighting_y-dir (6)
  • lighting_z-dir (6)
  • camera_x-pos (6)
  • camera_y-pos (6)
  • camera_z-pos (6)

The number m in parentheses after each factor is the number of possible values that factor takes. These values are uniformly sampled in the normalized range of variation [0, 1].
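As a rough illustration of this note, the sketch below lists the normalized values a factor with m levels would take if the levels are evenly spaced over [0, 1] with both endpoints included. That spacing is an assumption (the text only says "uniformly sampled"), and the helper name normalized_levels is hypothetical, not part of the repository.

```python
def normalized_levels(m: int) -> list[float]:
    """Return m evenly spaced values in [0, 1].

    Assumption: both endpoints 0 and 1 are included, so level k maps to
    k / (m - 1). The dataset may use a different spacing.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    if m == 1:
        return [0.0]
    return [k / (m - 1) for k in range(m)]


# lighting_intensity has 5 levels in Falcor3D.
print(normalized_levels(5))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```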

Each image is named padded_index.png, where each factor name below stands for the integer index of its value (from 0 to m - 1):

index = lighting_intensity * 46656 + lighting_x-dir * 7776 + lighting_y-dir * 1296 + 
lighting_z-dir * 216 + camera_x-pos * 36 + camera_y-pos * 6 + camera_z-pos

padded_index = index padded with zeros such that it has 6 digits.
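The filename formula is a mixed-radix encoding: each multiplier is the product of the value counts of all factors that come after it. The sketch below, which is an illustration rather than code from the repository, derives the multipliers from the factor list and converts between integer factor levels and file names. The function names are hypothetical, and the levels are assumed to be integers in 0 to m - 1.

```python
# Factor names and value counts copied from the Falcor3D list, in order.
FALCOR3D_FACTORS = [
    ("lighting_intensity", 5),
    ("lighting_x-dir", 6),
    ("lighting_y-dir", 6),
    ("lighting_z-dir", 6),
    ("camera_x-pos", 6),
    ("camera_y-pos", 6),
    ("camera_z-pos", 6),
]


def strides(factors):
    """Multiplier of each factor: product of the counts of all later factors."""
    result, step = [], 1
    for _, count in reversed(factors):
        result.append(step)
        step *= count
    return result[::-1]


def levels_to_filename(levels, factors=FALCOR3D_FACTORS):
    """Map integer levels (one per factor, each in 0..m-1) to a file name."""
    if len(levels) != len(factors):
        raise ValueError(f"expected {len(factors)} levels, got {len(levels)}")
    for (name, count), level in zip(factors, levels):
        if not 0 <= level < count:
            raise ValueError(f"{name} must be in [0, {count - 1}], got {level}")
    index = sum(level * s for level, s in zip(levels, strides(factors)))
    return f"{index:06d}.png"  # zero-padded to 6 digits


def filename_to_levels(filename, factors=FALCOR3D_FACTORS):
    """Invert levels_to_filename: recover the integer level of every factor."""
    index = int(filename.split(".")[0])
    levels = {}
    for (name, _), s in zip(factors, strides(factors)):
        levels[name], index = divmod(index, s)
    return levels


print(strides(FALCOR3D_FACTORS))  # [46656, 7776, 1296, 216, 36, 6, 1]
print(levels_to_filename([0, 0, 0, 0, 0, 1, 2]))  # 000008.png
print(levels_to_filename([4, 5, 5, 5, 5, 5, 5]))  # 233279.png
```

The derived multipliers match the ones in the formula, and the largest index is 233,279, which agrees with the 233,280 images in the dataset.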

To view Falcor3D images in which each factor of variation is varied individually, run

python dataset_demo.py --dataset Falcor3D

The results are saved in the examples/falcor3d_samples folder.

You can also browse the Falcor3D images in falcor3d_samples_demo, which includes all the ground-truth latent traversals.

Isaac3D

The Isaac3D dataset consists of 737,280 images of a 3D kitchen scene, each with a resolution of 512x512. The meta code corresponds to all possible combinations of 9 factors of variation (3 × 4 × 4 × 8 × 5 × 4 × 6 × 4 × 4 = 737,280 combinations):

  • object_shape (3)
  • object_scale (4)
  • camera_height (4)
  • robot_x-movement (8)
  • robot_y-movement (5)
  • lighting_intensity (4)
  • lighting_y-dir (6)
  • object_color (4)
  • wall_color (4)

As with Falcor3D, the number m in parentheses after each factor is the number of possible values that factor takes. These values are uniformly sampled in the normalized range of variation [0, 1].

Each image is named padded_index.png, where each factor name below stands for the integer index of its value (from 0 to m - 1):

index = object_shape * 245760 + object_scale * 30720 + camera_height * 6144 +
robot_x-movement * 1536 + robot_y-movement * 384 + lighting_intensity * 96 +
lighting_y-dir * 16 + object_color * 4 + wall_color

padded_index = index padded with zeros such that it has 6 digits.
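The factor counts and the multipliers in the formula can be cross-checked with a few lines of Python. The sketch below is an illustration, not code from the repository. It multiplies the counts from the Isaac3D list and derives the multipliers a mixed-radix index would have if the factors were ordered exactly as in that list, which is an assumption. Under that assumption, three derived multipliers differ from the stated ones (for instance, 384 × 5 = 1920 rather than 1536), so it is worth confirming which factor each multiplier belongs to against the downloaded files before decoding file names.

```python
from math import prod

# Factor names and value counts copied from the Isaac3D list, in order.
ISAAC3D_FACTORS = [
    ("object_shape", 3), ("object_scale", 4), ("camera_height", 4),
    ("robot_x-movement", 8), ("robot_y-movement", 5),
    ("lighting_intensity", 4), ("lighting_y-dir", 6),
    ("object_color", 4), ("wall_color", 4),
]

# Multipliers copied verbatim from the filename formula, in the same order.
STATED_MULTIPLIERS = [245760, 30720, 6144, 1536, 384, 96, 16, 4, 1]


def mixed_radix_strides(counts):
    """Multiplier of each position: product of the counts of all later positions."""
    result, step = [], 1
    for count in reversed(counts):
        result.append(step)
        step *= count
    return result[::-1]


counts = [count for _, count in ISAAC3D_FACTORS]
total_images = prod(counts)
derived = mixed_radix_strides(counts)

# Factors whose derived multiplier disagrees with the stated one.
mismatches = [
    name
    for (name, _), d, s in zip(ISAAC3D_FACTORS, derived, STATED_MULTIPLIERS)
    if d != s
]

print(total_images)  # 737280
print(derived)       # [245760, 61440, 15360, 1920, 384, 96, 16, 4, 1]
print(mismatches)    # ['object_scale', 'camera_height', 'robot_x-movement']
```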

To view Isaac3D images in which each factor of variation is varied individually, run

python dataset_demo.py --dataset Isaac3D

The results are saved in the examples/isaac3d_samples folder.

You can also browse the Isaac3D images in isaac3d_samples_demo, which includes all the ground-truth latent traversals.

Links to datasets

The two datasets can be downloaded from Google Drive:

  • Falcor3D (98 GB): link
  • Isaac3D (190 GB): link

We also provide downsampled versions (resolution 128x128) of both datasets:

  • Falcor3D_128x128 (3.7 GB): link
  • Isaac3D_128x128 (13 GB): link

License

This work is licensed under a Creative Commons Attribution 4.0 International License by NVIDIA Corporation (https://creativecommons.org/licenses/by/4.0/).

Owner
NVIDIA Research Projects