Revisiting Global Statistics Aggregation for Improving Image Restoration

Last update: Dec 24, 2022

Xiaojie Chu, Liangyu Chen, Chengpeng Chen, Xin Lu

Paper: https://arxiv.org/pdf/2112.04491.pdf

Introduction

This repository is the official implementation of TLSC. We propose the Test-time Local Statistics Converter (TLSC), which changes the statistics aggregation region from the entire spatial dimension to a local window at test time, to mitigate the inconsistency between training and testing. Our approach requires no retraining or finetuning and incurs only marginal extra cost.

Illustration of training and testing schemes for image restoration. From left to right: image from the dataset; input to the restorer (patches or the entire image, depending on the scheme); aggregating statistics from the feature map. For (a), (b), and (c), statistics are aggregated along the entire spatial dimension. (d) Ours: statistics are aggregated in a local region for each pixel.
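To make the train/test mismatch concrete, here is a minimal sketch in plain Python (it is not the repository's code). The names `global_mean` and `local_mean`, the window size, the toy feature map, and the border handling (windows clipped at the image edge) are all illustrative assumptions. It compares the global mean of a training-sized patch, the global mean of the full image, and a local mean computed on the full image.

```python
def global_mean(fmap):
    """Mean over the entire spatial extent (what global pooling computes)."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def local_mean(fmap, window):
    """Per-pixel mean over a window x window neighbourhood (odd window).

    Assumption: windows are clipped at the image border; the repository
    may treat borders differently.
    """
    h, w = len(fmap), len(fmap[0])
    half = window // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rows = range(max(0, i - half), min(h, i + half + 1))
            cols = range(max(0, j - half), min(w, j + half + 1))
            vals = [fmap[r][c] for r in rows for c in cols]
            out[i][j] = sum(vals) / len(vals)
    return out


# Toy "full image": left half dark (0.0), right half bright (1.0).
full = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
# A training-sized patch cropped from the dark half only.
patch = [row[:3] for row in full[:3]]

print(global_mean(patch))          # statistic seen in training on this patch
print(global_mean(full))           # global statistic on the full test image
print(local_mean(full, 3)[1][1])   # local statistic at a pixel in the dark region
```

In this toy case the local statistic on the full image matches what the patch produced during training, while the global statistic on the full image does not.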

Abstract

Global spatial statistics, which are aggregated along the entire spatial dimensions, are widely used in top-performing image restorers. Examples include the mean and variance in Instance Normalization (IN), which is adopted by HINet, and global average pooling (i.e., the mean) in Squeeze-and-Excitation (SE), which is applied in MPRNet. This paper first shows that statistics aggregated on patch-based features during training and on entire-image-based features during testing may be distributed very differently, leading to performance degradation in image restorers. This issue has been widely overlooked by previous works. To solve it, we propose a simple approach, Test-time Local Statistics Converter (TLSC), that changes the region of the statistics aggregation operation from global to local, only at test time. Without retraining or finetuning, our approach significantly improves the image restorer's performance. In particular, by extending SE with TLSC in state-of-the-art models, MPRNet is boosted by 0.65 dB in PSNR on the GoPro dataset, achieving 33.31 dB and exceeding the previous best result by 0.6 dB. In addition, we apply TLSC to a high-level vision task, i.e., semantic segmentation, and achieve competitive results. Extensive quantitative and qualitative experiments demonstrate that TLSC solves the issue with marginal cost and significant gain.
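The abstract names the mean and variance of Instance Normalization as one of the affected statistics. The following is a rough sketch of what normalizing with local rather than global statistics could look like for a single channel, written in plain Python. The function names, the `eps` value, and the clipped-window border handling are assumptions for illustration, not the repository's implementation.

```python
import math


def window_values(fmap, i, j, window):
    """Values in the window x window neighbourhood of (i, j), clipped at borders."""
    h, w = len(fmap), len(fmap[0])
    half = window // 2
    return [fmap[r][c]
            for r in range(max(0, i - half), min(h, i + half + 1))
            for c in range(max(0, j - half), min(w, j + half + 1))]


def local_instance_norm(fmap, window, eps=1e-5):
    """Normalize each pixel with the mean and (biased) variance of its window."""
    out = []
    for i, row in enumerate(fmap):
        out_row = []
        for j, x in enumerate(row):
            vals = window_values(fmap, i, j, window)
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            out_row.append((x - mean) / math.sqrt(var + eps))
        out.append(out_row)
    return out


fmap = [[1.0, 2.0], [3.0, 4.0]]
# A window large enough to cover the whole map reduces to global statistics.
for row in local_instance_norm(fmap, window=5):
    print(row)
```

When the window covers the whole feature map, every pixel sees the same mean and variance, so the result coincides with ordinary (global) instance normalization without affine parameters.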

Usage

Installation

This implementation is based on BasicSR, which is an open-source toolbox for image/video restoration tasks.

git clone https://github.com/megvii-research/tlsc.git
cd tlsc
pip install -r requirements.txt
python setup.py develop

Quick Start (Single Image Inference)

python basicsr/demo.py -opt options/demo/demo.yml
- modify your input and output paths in the config
- define the network
- set the pretrained model; it should match the defined network.
  - for pretrained models, see the links in the repository (https://github.com/megvii-research/tlsc)

Main Results

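| Method | GoPro PSNR (dB) | GoPro SSIM | HIDE PSNR (dB) | HIDE SSIM | REDS PSNR (dB) | REDS SSIM |
|---|---|---|---|---|---|---|
| HINet | 32.71 | 0.959 | 30.33 | 0.932 | 28.83 | 0.863 |
| HINet-local (ours) | 33.08 | 0.962 | 30.66 | 0.936 | 28.96 | 0.865 |
| MPRNet | 32.66 | 0.959 | 30.96 | 0.939 | - | - |
| MPRNet-local (ours) | 33.31 | 0.964 | 31.19 | 0.942 | - | - |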
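For readers unfamiliar with the metric, PSNR in the table is reported in dB. Below is a minimal sketch of the textbook definition, PSNR = 10 · log10(MAX² / MSE), in plain Python. The function name `psnr` and the default `max_value=255.0` are assumptions; the repository's evaluation code may differ in details such as color space or value range.

```python
import math


def psnr(restored, target, max_value=255.0):
    """Textbook PSNR in dB between two equally sized 2D images."""
    squared_error = 0.0
    count = 0
    for r_row, t_row in zip(restored, target):
        for r, t in zip(r_row, t_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


target = [[10.0, 10.0], [10.0, 10.0]]
restored = [[11.0, 9.0], [10.0, 10.0]]
print(psnr(restored, target))
```

Because the scale is logarithmic, the gain of MPRNet-local over MPRNet on GoPro (33.31 - 32.66 = 0.65 dB) corresponds to a reduction in mean squared error by a factor of about 10^(0.065) ≈ 1.16.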

Evaluation

Image Deblur - GoPro dataset

prepare data
- mkdir ./datasets/GoPro
- download the test set in ./datasets/GoPro/test (refer to MPRNet)
- it should be like:
```
./datasets/
./datasets/GoPro/test/
./datasets/GoPro/test/input/
./datasets/GoPro/test/target/
```
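Before running evaluation, it can help to confirm that the layout above is in place. The following is a small, hypothetical helper (not part of the repository) that checks for the `input/` and `target/` folders and lists files that appear in only one of them. It assumes that paired images share the same filename in both folders.

```python
from pathlib import Path


def check_pairs(test_root):
    """Report missing folders and files lacking a same-named counterpart.

    Assumption: each image in input/ has a target/ image with the same name.
    """
    root = Path(test_root)
    input_dir, target_dir = root / "input", root / "target"
    missing_dirs = [str(d) for d in (input_dir, target_dir) if not d.is_dir()]
    if missing_dirs:
        return {"missing_dirs": missing_dirs, "unpaired": []}
    inputs = {p.name for p in input_dir.iterdir() if p.is_file()}
    targets = {p.name for p in target_dir.iterdir() if p.is_file()}
    # Symmetric difference: names present in exactly one of the two folders.
    return {"missing_dirs": [], "unpaired": sorted(inputs ^ targets)}


report = check_pairs("./datasets/GoPro/test")
print(report)
```

The same helper can be pointed at any other test folder that follows the `input/` and `target/` convention.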
eval
- download pretrained HINet to ./experiments/pretrained_models/HINet-GoPro.pth
- python basicsr/test.py -opt options/test/GoPro/HINetLocal-GoPro.yml
- download pretrained MPRNet to ./experiments/pretrained_models/MPRNet-GoPro.pth
- python basicsr/test.py -opt options/test/GoPro/MPRNetLocal-GoPro.yml

Image Deblur - HIDE dataset

prepare data
- mkdir ./datasets/HIDE
- download the test set in ./datasets/HIDE/test (refer to MPRNet)
- it should be like:
```
./datasets/
./datasets/HIDE/test/
./datasets/HIDE/test/input/
./datasets/HIDE/test/target/
```
eval
- download pretrained HINet to ./experiments/pretrained_models/HINet-GoPro.pth
- python basicsr/test.py -opt options/test/HIDE/HINetLocal-HIDE.yml
- download pretrained MPRNet to ./experiments/pretrained_models/MPRNet-GoPro.pth
- python basicsr/test.py -opt options/test/HIDE/MPRNetLocal-HIDE.yml

Image Deblur - REDS dataset

prepare data
- mkdir ./datasets/REDS
- download the val set from val_blur, val_sharp to ./datasets/REDS/ and unzip them.
- it should be like
```
./datasets/
./datasets/REDS/
./datasets/REDS/val/
./datasets/REDS/val/val_blur_jpeg/
./datasets/REDS/val/val_sharp/
```
- python scripts/data_preparation/reds.py
  - flatten the folders and extract 300 validation images.
eval
- download pretrained HINet to ./experiments/pretrained_models/HINet-REDS.pth
- python basicsr/test.py -opt options/test/REDS/HINetLocal-REDS.yml

Tip: Changing 'fast_imp: false' (naive implementation) to 'fast_imp: true' (faster implementation) in the MPRNetLocal config gives faster inference.
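This README does not describe how the faster implementation works internally, so the following is general background rather than a description of the `fast_imp` option. One standard way to make sliding-window means cheap is a summed-area table (integral image), which gives each window sum in constant time. The function names and the clipped-window border handling are illustrative assumptions.

```python
def integral_image(fmap):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(fmap), len(fmap[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for i in range(h):
        for j in range(w):
            sat[i + 1][j + 1] = (fmap[i][j] + sat[i][j + 1]
                                 + sat[i + 1][j] - sat[i][j])
    return sat


def local_mean_fast(fmap, window):
    """Per-pixel window mean using four table lookups per pixel."""
    h, w = len(fmap), len(fmap[0])
    half = window // 2
    sat = integral_image(fmap)
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            r0, r1 = max(0, i - half), min(h, i + half + 1)
            c0, c1 = max(0, j - half), min(w, j + half + 1)
            total = sat[r1][c1] - sat[r0][c1] - sat[r1][c0] + sat[r0][c0]
            out[i][j] = total / ((r1 - r0) * (c1 - c0))
    return out


fmap = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
print(local_mean_fast(fmap, 3)[1][1])  # mean of all nine values
```

With this layout the cost per pixel no longer depends on the window size, which matters when the window is large relative to the image.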

License

This project is under the MIT license, and it is based on BasicSR which is under the Apache 2.0 license.

Citations

If TLSC helps your research or work, please consider citing TLSC.

@article{chu2021tlsc,
  title={Revisiting Global Statistics Aggregation for Improving Image Restoration},
  author={Chu, Xiaojie and Chen, Liangyu and Chen, Chengpeng and Lu, Xin},
  journal={arXiv preprint arXiv:2112.04491},
  year={2021}
}

Contact

If you have any questions, please contact [email protected] or [email protected].

Owner

MEGVII Research
