Unofficial implementation of Image Super-Resolution via Iterative Refinement in PyTorch

Last update: Jan 02, 2023

Overview

Image Super-Resolution via Iterative Refinement

Brief

This is an unofficial implementation of Image Super-Resolution via Iterative Refinement (SR3) in PyTorch.

Some implementation details follow our reading of the paper and may differ from the actual SR3 structure, because the paper leaves some details out.

We use ResNet blocks and channel concatenation, as in vanilla DDPM.
We use the attention mechanism on low-resolution features (16×16), as in vanilla DDPM.
We encode $\gamma$ following the FiLM structure used in WaveGrad, and embed it without an affine transformation.
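
As an illustration of the last point, the sketch below encodes a scalar noise level with sinusoidal features, in the style of the positional encoding used by WaveGrad, and then applies it as a shift only, with no multiplicative scale. It is written in plain Python for readability. The function names, the embedding dimension, and the 1e4 frequency base are assumptions made for this example, and the shift-only step is one reading of "without an affine transformation", not a copy of this repository's code.

```python
import math

def encode_noise_level(noise_level, dim=8, base=1e4):
    """Sinusoidal encoding of a scalar noise level (hypothetical helper)."""
    count = dim // 2
    # Frequencies start at 1 and decay geometrically as the index grows.
    freqs = [math.exp(-math.log(base) * i / count) for i in range(count)]
    angles = [noise_level * f for f in freqs]
    # Concatenate the sine and cosine halves, giving `dim` features in total.
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]

def add_noise_embedding(features, embedding):
    """Shift-only conditioning: no multiplicative (scale) term is applied."""
    return [f + e for f, e in zip(features, embedding)]

emb = encode_noise_level(0.5, dim=8)
out = add_noise_embedding([0.0] * 8, emb)
```

In a real network the encoding would typically pass through a learned layer before being added to a feature map; that part is omitted here.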

Status

Conditional generation(super resolution)

16×16 -> 128×128 on FFHQ-CelebaHQ
64×64 -> 512×512 on FFHQ-CelebaHQ

Unconditional generation

128×128 face generation on FFHQ
1024×1024 face generation by a cascade of 3 models

Training Step

log / logger
metrics evaluation
multi-gpu support
resume training / pretrained model

Results

The maximum reverse-step budget is currently set to 2000.
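
To make the step budget concrete, the sketch below builds a noise schedule with 2000 steps and computes the cumulative product of (1 - beta_t), which the SR3 paper writes as $\gamma_t$. The linear schedule and its endpoints are assumptions chosen for illustration; the values actually used are set in the JSON config files.

```python
def linear_betas(n_steps=2000, start=1e-6, end=1e-2):
    """Linearly spaced variances; the endpoints are illustrative assumptions."""
    step = (end - start) / (n_steps - 1)
    return [start + i * step for i in range(n_steps)]

def cumulative_gammas(betas):
    """gamma_t = product of (1 - beta_i) for all i up to and including t."""
    gammas, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        gammas.append(running)
    return gammas

betas = linear_betas()
gammas = cumulative_gammas(betas)
print(len(gammas), gammas[0], gammas[-1])
```

The sequence decreases monotonically, so later steps correspond to noisier inputs.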

Tasks/Metrics	SSIM(+)	PSNR(+)	FID(-)	IS(+)
16×16 -> 128×128	0.675	23.26	-	-
64×64 -> 512×512			-	-
128×128	-	-
1024×1024	-	-

16×16 -> 128×128 on FFHQ-CelebaHQ [More Results]

128×128 face generation on FFHQ [More Results]

Usage

Pretrained Model

This work is based on "Denoising Diffusion Probabilistic Models". We build both the DDPM and SR3 network structures, which use timesteps and gamma, respectively, as the model's embedding input. In our experiments, the SR3 model achieves better visual results with the same number of reverse steps and the same learning rate. Select the JSON file with the matching suffix to train each model.

Tasks	Google Drive
16×16 -> 128×128 on FFHQ-CelebaHQ	SR3
128×128 face generation on FFHQ	SR3

# Download the pretrained model and set "resume_state" in [sr|sample]_[ddpm|sr3]_[resolution option].json:
"resume_state": [your pretrained model path]

Due to time constraints, we have not trained the models to convergence, so there is still a lot of room for optimization.
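
The "resume_state" edit described above can also be scripted. The sketch below sets every "resume_state" key in an already-parsed config dictionary. The helper name, the example nesting under "path", and the checkpoint path are hypothetical; check the actual JSON file for where the key lives.

```python
def set_resume_state(config, checkpoint_path):
    """Recursively set every "resume_state" key found in a nested dict."""
    found = 0
    for key, value in config.items():
        if key == "resume_state":
            config[key] = checkpoint_path
            found += 1
        elif isinstance(value, dict):
            found += set_resume_state(value, checkpoint_path)
    return found

# Hypothetical config fragment; the real files may nest the key differently.
config = {"name": "sr_demo", "path": {"resume_state": None}}
n = set_resume_state(config, "experiments/my_run/checkpoint/latest")
```

The return value counts how many keys were updated, which makes it easy to notice a config that has no "resume_state" entry at all.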

Data Preparation

New Start

If you do not have the data yet, prepare it with the following steps:

Download the dataset and convert it to LMDB or PNG format using the script.

# Resize to get 16×16 LR_IMGS and 128×128 HR_IMGS, then prepare 128×128 fake SR_IMGS by bicubic interpolation
python prepare.py --path [dataset root] --out [output root] --size 16,128 -l

Then change the datasets config to match your data path and image resolution:

"datasets": {
    "train": {
        "dataroot": "dataset/ffhq_16_128", // [output root] in prepare.py script
        "l_resolution": 16, // low resolution, the input to super-resolution
        "r_resolution": 128, // high resolution
        "datatype": "lmdb" // lmdb or img (a folder of image files)
    },
    "val": {
        "dataroot": "dataset/celebahq_16_128", // [output root] in prepare.py script
    }
},
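
Config snippets like the one above carry // comments, which standard JSON parsers reject. A minimal sketch of loading such text is shown below, assuming that // never appears inside a string value (a URL would be truncated) and that the file has no trailing commas. The function name and the sample text are made up for this example.

```python
import json

def load_commented_json(text):
    """Parse JSON text after dropping everything following // on each line."""
    cleaned = "\n".join(line.split("//")[0] for line in text.splitlines())
    return json.loads(cleaned)

sample = '''
{
    "datasets": {
        "train": {
            "dataroot": "dataset/ffhq_16_128", // output root of prepare.py
            "l_resolution": 16,
            "r_resolution": 128,
            "datatype": "lmdb" // lmdb or img
        }
    }
}
'''
opt = load_commented_json(sample)
print(opt["datasets"]["train"]["r_resolution"])
```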

Own Data

You can also use your own image data with the following steps.

First, organize your images like this:

# folders for the high-resolution, low-resolution, and bicubic-interpolated images
dataset/celebahq_16_128/
├── hr_128
├── lr_16
└── sr_16_128
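
A quick way to check this layout before training is sketched below. The folder names follow the hr_/lr_/sr_ pattern shown above; the helper names are hypothetical and not part of the repository.

```python
import tempfile
from pathlib import Path

def expected_subdirs(l_res, r_res):
    """Folder names for high-res, low-res, and bicubic-upsampled images."""
    return [f"hr_{r_res}", f"lr_{l_res}", f"sr_{l_res}_{r_res}"]

def missing_subdirs(root, l_res, r_res):
    """Return the expected subfolders that do not exist under `root`."""
    root = Path(root)
    return [d for d in expected_subdirs(l_res, r_res) if not (root / d).is_dir()]

# Demo on a temporary directory that contains only two of the three folders.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "hr_128").mkdir()
    (Path(tmp) / "lr_16").mkdir()
    missing = missing_subdirs(tmp, 16, 128)
```

An empty result means all three folders are present for the given resolutions.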

Then change the dataset config to match your data path and image resolution:

"datasets": {
    "train|val": {
        "dataroot": "dataset/celebahq_16_128",
        "l_resolution": 16, // low resolution, the input to super-resolution
        "r_resolution": 128, // high resolution
        "datatype": "img" // lmdb or img (a folder of image files)
    }
},

Training/Resume Training

# Use sr.py and sample.py to train the super resolution task and unconditional generation task, respectively.
# Edit json files to adjust network structure and hyperparameters
python sr.py -p train -c config/sr_sr3.json

Test/Evaluation

# Edit the JSON file to add the pretrained model path, then run the evaluation
python sr.py -p val -c config/sr_sr3.json

Evaluation Alone

# Quantitative evaluation using SSIM/PSNR metrics on given dataset root
python eval.py -p [dataset root]
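
For reference, PSNR is derived from the mean squared error between two images. The sketch below works on flat lists of 8-bit pixel values and only illustrates the metric; it is not the code in eval.py, which may differ in color handling or data range.

```python
import math

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

hr = [52, 55, 61, 59]  # made-up high-resolution pixel values
sr = [54, 55, 60, 57]  # made-up super-resolved pixel values
score = psnr(hr, sr)
```

Higher values mean the estimate is closer to the reference, which matches the (+) marker on the PSNR column in the results table.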

Acknowledgements

Our work is based on the following theoretical works:

We also benefited a lot from the following projects:

Owner

LiangWei Jiang
