Split Variational AutoEncoder

Last update: Sep 02, 2022

Split-VAE

Introduction

This repository contains an implementation of a Split Variational AutoEncoder (SVAE). In an SVAE the output y is computed as a weighted sum

y = sigma * y1 + (1 - sigma) * y2

where y1 and y2 are two distinct generated images, and sigma is a learned compositional map.
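
As a minimal sketch of the composition step only (not the repository's code), the weighted sum can be written with NumPy. The function name `compose`, the array shapes, and the assumption that sigma is a single-channel per-pixel map with values in [0, 1] are illustrative choices, not details taken from the repository.

```python
import numpy as np

def compose(y1, y2, sigma):
    """Blend two generated images: sigma * y1 + (1 - sigma) * y2.

    Assumption: sigma is a map with values in [0, 1] that broadcasts
    against y1 and y2 (for example, one channel per pixel).
    """
    y1, y2, sigma = (np.asarray(a, dtype=np.float64) for a in (y1, y2, sigma))
    if np.any(sigma < 0.0) or np.any(sigma > 1.0):
        raise ValueError("sigma must lie in [0, 1]")
    return sigma * y1 + (1.0 - sigma) * y2

# Hypothetical 4x4 RGB images standing in for decoder outputs.
rng = np.random.default_rng(0)
y1 = rng.random((4, 4, 3))
y2 = rng.random((4, 4, 3))
sigma = rng.random((4, 4, 1))  # broadcast over the color channels

y = compose(y1, y2, sigma)
```

Where sigma is 1 the output is y1, where it is 0 the output is y2, and every output pixel lies between the corresponding pixels of y1 and y2.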

A Split VAE is trained as a normal VAE: no additional loss term is added on the split images y1 and y2.
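
To make this concrete, here is a hedged sketch of a standard VAE objective that only sees the composed output y. The squared-error reconstruction term, the `kl_weight` parameter, and the name `vae_loss` are assumptions made for illustration; the repository may use a different reconstruction loss or weighting.

```python
import numpy as np

def vae_loss(x, y, z_mean, z_log_var, kl_weight=1.0):
    """Standard VAE loss computed on the composed output y only.

    x, y:              arrays of shape (batch, ...) with inputs and outputs
    z_mean, z_log_var: arrays of shape (batch, latent_dim) describing the
                       diagonal Gaussian posterior
    Note that y1 and y2 do not appear anywhere: they are not penalized.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    z_mean = np.asarray(z_mean, dtype=np.float64)
    z_log_var = np.asarray(z_log_var, dtype=np.float64)
    batch = x.shape[0]

    # Reconstruction term (assumed here to be a summed squared error).
    rec = np.sum((x - y) ** 2) / batch

    # KL divergence between N(z_mean, exp(z_log_var)) and N(0, I).
    kl = -0.5 * np.sum(1.0 + z_log_var - z_mean ** 2 - np.exp(z_log_var)) / batch

    return rec + kl_weight * kl

# Hypothetical batch of two 4x4 RGB images with a 3-dimensional latent space.
x = np.full((2, 4, 4, 3), 0.5)
loss_perfect = vae_loss(x, x, np.zeros((2, 3)), np.zeros((2, 3)))
loss_shifted = vae_loss(x, x, np.ones((2, 3)), np.zeros((2, 3)))
```

Since only y enters the loss, the split into y1, y2 and sigma is constrained solely through their combination.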

Splitting is meant to give the network a more flexible way to learn useful and independent features. As a result, the variable collapse phenomenon is greatly reduced, and the ability to exploit a larger number of latent variables improves the quality and diversity of generated samples.

Types of Splitting

The decomposition is nondeterministic, but it follows two main schemes, which we may roughly categorize as either syntactic or semantic.

Syntactic decomposition

In this case, the compositional map tends to exploit the strong correlation between adjacent pixels, splitting the image into two complementary high-frequency sub-images.

Below are some examples of syntactic splitting. In all the following pictures, the first row is the compositional map, then in order y1, y2 and y.

Semantic decomposition

In this case, the map typically focuses on the contours of objects, splitting the image into interesting variations of its content, with more marked and distinctive features.

Here are some examples of semantic splitting:

In the case of semantic splitting, the Fréchet Inception Distance (FID) of y1 and y2 is frequently lower (hence better) than that of y, which clearly suffers from being an average of the two.

In a sense, an SVAE forces the Variational Autoencoder to make choices, in contrast with its intrinsic tendency to average between alternatives in order to minimize the reconstruction loss towards a specific sample.
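
For readers unfamiliar with the FID mentioned above, the following sketch computes the Fréchet distance between two sets of feature vectors, each summarized as a Gaussian. It is an illustration under stated assumptions: a real FID uses features extracted by an Inception network, which is omitted here, and the function name `frechet_distance` and the random features are hypothetical.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    d^2 = ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2))

    feats_a, feats_b: arrays of shape (num_samples, feature_dim).
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((C_a C_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of C_a C_b, which are real and non-negative in theory.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

# Hypothetical 4-dimensional features standing in for Inception activations.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 4))
shift = np.array([1.0, 0.0, 0.0, 0.0])

same = frechet_distance(feats, feats)           # identical sets
moved = frechet_distance(feats, feats + shift)  # same spread, shifted mean
```

Identical feature sets give a distance of approximately zero, while shifting one set by a constant vector leaves the covariances unchanged and adds the squared length of the shift. Lower values therefore indicate generated samples whose feature statistics are closer to those of the reference set.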

More examples of GENERATED images

Examples of MNIST-like generated digits (FID=7.47)

Here are some additional examples of semantic compositional maps generated for CelebA, which look quite similar to drawings. The quality and precision of the contours are both unexpected and remarkable.

And some generated faces (FID=35.1). Observe in particular the wide differentiation in pose, illumination, colors, age and expressions.

Owner

Andrea Asperti
