Minimal PyTorch implementation of Generative Latent Optimization from the paper "Optimizing the Latent Space of Generative Networks"

Overview

Minimal PyTorch implementation of Generative Latent Optimization

This is a reimplementation of the paper

Piotr Bojanowski, Armand Joulin, David Lopez-Paz, Arthur Szlam:
Optimizing the Latent Space of Generative Networks

I'm not one of the authors. I just reimplemented parts of the paper in PyTorch for learning about PyTorch and generative models. Also, I liked the idea in the paper and was surprised that the approach actually works.

Implementation of the Laplacian pyramid L1 loss is inspired by https://github.com/mtyka/laploss. DCGAN network architecture follows https://github.com/pytorch/examples/tree/master/dcgan.

Running the code

First, install the required packages. For example, in Anaconda, you can simple do

conda install pytorch torchvision -c pytorch
conda install scikit-learn tqdm plac python-lmdb pillow

Download the LSUN dataset (only the bedroom training images are used here) into $LSUN_DIR. Then, simply run:

python glo.py $LSUN_DIR

You can learn more about the settings by running python glo.py --help.

Results

Unless mentioned otherwise, results are shown from a run over only a subset of the data (100000 samples - can be specified via the -n argument). Optimization was performed for only 25 epochs. The images below show reconstructions from the optimized latent space.

Results with 100-dimensional representation space look quite good, similar to the results shown in Fig. 1 in the paper.

python glo.py $LSUN_DIR -o d100 -gpu -d 100 -n 100000

Training for more epochs and from the whole dataset will make the images even sharper. Here are results (with 100D latent space) from a longer run of 50 epochs on the full dataset.

python glo.py $LSUN_DIR -o d100_full -gpu -d 100 -e 50

I'm not sure how many pyramid levels the authors used for the Laplacian pyramid L1 loss (here, we use 3 levels, but more might be better ... or not). But these results seem close enough.


Results with 512-dimensional representation space:

python glo.py $LSUN_DIR -o d512 -gpu -d 512 -n 100000

One of the main contributions of the paper is the use of the Laplacian pyramid L1 loss. Lets see how it compares to reconstructions using a simple L2 loss, again from 100-d representation space:

python glo.py $LSUN_DIR -o d100_l2 -gpu -d 512 -n 100000 -l l2


Comparison to L2 reconstruction loss, 512-d representation space:

python glo.py $LSUN_DIR -o d512_l2 -gpu -d 512 -n 100000 -l l2

I observed that initialization of the latent vectors with PCA is very crucial. Below are results from (normally distributed) random latent vectors. After 25 epochs, loss is only 0.31 (when initializing from PCA, loss after only 1 epoch is already 0.23). Reconstructions look really blurry.

python glo.py $LSUN_DIR -o d100_rand -gpu -d 100 -n 100000 -i random -e 500

It gets better after 500 epochs, but still very slow convergence and the results are not as clear as with PCA initialization.

Owner
Thomas Neumann
Thomas Neumann
Text Extraction Formulation + Feedback Loop for state-of-the-art WSD (EMNLP 2021)

ConSeC is a novel approach to Word Sense Disambiguation (WSD), accepted at EMNLP 2021. It frames WSD as a text extraction task and features a feedback loop strategy that allows the disambiguation of

Sapienza NLP group 36 Dec 13, 2022
Detection of drones using their thermal signatures from thermal camera through YOLO-V3 based CNN with modifications to encapsulate drone motion

Drone Detection using Thermal Signature This repository highlights the work for night-time drone detection using a using an Optris PI Lightweight ther

Chong Yu Quan 6 Dec 31, 2022
The first dataset of composite images with rationality score indicating whether the object placement in a composite image is reasonable.

Object-Placement-Assessment-Dataset-OPA Object-Placement-Assessment (OPA) is to verify whether a composite image is plausible in terms of the object p

BCMI 53 Nov 15, 2022
This is a yolo3 implemented via tensorflow 2.7

YoloV3 - an object detection algorithm implemented via TF 2.x source code In this article I assume you've already familiar with basic computer vision

2 Jan 17, 2022
A fast MoE impl for PyTorch

An easy-to-use and efficient system to support the Mixture of Experts (MoE) model for PyTorch.

Rick Ho 873 Jan 09, 2023
FTIR-Deep Learning - FTIR Deep Learning With Python

CANDIY-spectrum Human analyis of chemical spectra such as Mass Spectra (MS), Inf

Wei Mei 1 Jan 03, 2022
Code for the paper Open Sesame: Getting Inside BERT's Linguistic Knowledge.

Open Sesame This repository contains the code for the paper Open Sesame: Getting Inside BERT's Linguistic Knowledge. Credits We built the project on t

9 Jul 24, 2022
SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images

SymmetryNet SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images ACM Transactions on Gra

26 Dec 05, 2022
Graph Convolutional Neural Networks with Data-driven Graph Filter (GCNN-DDGF)

Graph Convolutional Gated Recurrent Neural Network (GCGRNN) Improved from Graph Convolutional Neural Networks with Data-driven Graph Filter (GCNN-DDGF

Lei Lin 21 Dec 18, 2022
Code for Boundary-Aware Segmentation Network for Mobile and Web Applications

BASNet Boundary-Aware Segmentation Network for Mobile and Web Applications This repository contain implementation of BASNet in tensorflow/keras. comme

Hamid Ali 8 Nov 24, 2022
Large dataset storage format for Pytorch

H5Record Large dataset ( 100G, = 1T) storage format for Pytorch (wip) Support python 3 pip install h5record Why? Writing large dataset is still a

theblackcat102 43 Oct 22, 2022
A computational block to solve entity alignment over textual attributes in a knowledge graph creation pipeline.

How to apply? Create your config.ini file following the example provided in config.ini Choose one of the options below to run: Run with Python3 pip in

Scientific Data Management Group 3 Jun 23, 2022
Pipeline code for Sequential-GAM(Genome Architecture Mapping).

Sequential-GAM Pipeline code for Sequential-GAM(Genome Architecture Mapping). mapping whole_preprocess.sh include the whole processing of mapping. usa

3 Nov 03, 2022
FishNet: One Stage to Detect, Segmentation and Pose Estimation

FishNet FishNet: One Stage to Detect, Segmentation and Pose Estimation Introduction In this project, we combine target detection, instance segmentatio

1 Oct 05, 2022
Pytorch implementation of Value Iteration Networks (NIPS 2016 best paper)

VIN: Value Iteration Networks A quick thank you A few others have released amazing related work which helped inspire and improve my own implementation

Kent Sommer 297 Dec 26, 2022
AFLNet: A Greybox Fuzzer for Network Protocols

AFLNet: A Greybox Fuzzer for Network Protocols AFLNet is a greybox fuzzer for protocol implementations. Unlike existing protocol fuzzers, it takes a m

626 Jan 06, 2023
Text Summarization - WCN — Weighted Contextual N-gram method for evaluation of Text Summarization

Text Summarization WCN — Weighted Contextual N-gram method for evaluation of Text Summarization In this project, I fine tune T5 model on Extreme Summa

Aditya Shah 1 Jan 03, 2022
Python implementation of cover trees, near-drop-in replacement for scipy.spatial.kdtree

This is a Python implementation of cover trees, a data structure for finding nearest neighbors in a general metric space (e.g., a 3D box with periodic

Patrick Varilly 28 Nov 25, 2022
A developer interface for creating Chat AIs for the Chai app.

ChaiPy A developer interface for creating Chat AIs for the Chai app. Usage Local development A quick start guide is available here, with a minimal exa

Chai 28 Dec 28, 2022
Lowest memory consumption and second shortest runtime in NTIRE 2022 challenge on Efficient Super-Resolution

FMEN Lowest memory consumption and second shortest runtime in NTIRE 2022 on Efficient Super-Resolution. Our paper: Fast and Memory-Efficient Network T

33 Dec 01, 2022