A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

Last update: Jan 05, 2023

Jianqi Ma, Zhetong Liang, Lei Zhang
Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China & OPPO Research

Recovering TextZoom samples

Environment:

Other Python packages that may be needed: pyyaml, cv2 (opencv-python), Pillow, and imgaug.
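A quick way to see which of these packages are already installed is sketched below. It assumes the usual import names (`yaml` for pyyaml, `cv2`, `PIL` for Pillow, `imgaug`); the helper `check_packages` is a made-up name for this example and is not part of the repository.

```python
import importlib.util

# Import names for the packages listed above (pyyaml imports as "yaml",
# Pillow as "PIL"). This helper is hypothetical, not code from the repo.
DEFAULT_MODULES = ("yaml", "cv2", "PIL", "imgaug")


def check_packages(modules=DEFAULT_MODULES):
    """Return {module_name: bool}, True when the module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

Anything reported as missing can then be installed with pip before training.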

Main idea

The pipeline

TP Interpreter

Configure your training

Download the pretrained recognizer from:

Aster: https://github.com/ayumiymk/aster.pytorch  
MORAN:  https://github.com/Canjie-Luo/MORAN_v2  
CRNN: https://github.com/meijieru/crnn.pytorch

Unzip the code and change into '$TATT_ROOT$/', then place the recognizer's pretrained weights in '$TATT_ROOT$/'.
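Before training, it can help to confirm that the weight files really ended up in the root directory. The following is a minimal sketch under stated assumptions: the suffix list is a guess at common PyTorch checkpoint extensions, and `find_weight_files` is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path

# Assumed checkpoint suffixes; adjust to whatever the downloaded files use.
WEIGHT_SUFFIXES = (".pth", ".pth.tar", ".pt")


def find_weight_files(root, suffixes=WEIGHT_SUFFIXES):
    """Return sorted names of files directly under `root` with a listed suffix."""
    root = Path(root)
    return sorted(
        p.name for p in root.iterdir()
        if p.is_file() and p.name.endswith(suffixes)
    )


if __name__ == "__main__":
    # "." stands in for the TATT root directory when run from inside it.
    print(find_weight_files("."))
```

An empty list suggests the weights were unpacked somewhere else.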

Download the TextZoom dataset:

https://github.com/JasonBoy1/TextZoom

Train the corresponding model (e.g. TPGSR-TSRN):

chmod a+x train_TATT.sh
./train_TATT.sh

Run the test-prefixed shell script to test the corresponding model.

Add the '--go_test' flag in the shell file.
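A flag such as '--go_test' is typically a boolean switch that selects evaluation instead of training. The sketch below only illustrates that pattern with the standard `argparse` module; the parser, the help text, and the `select_mode` helper are assumptions for illustration, and the repository's actual option handling may differ.

```python
import argparse


def build_parser():
    # Hypothetical parser: shows how a boolean switch like --go_test is
    # commonly declared; it is not copied from the repository.
    parser = argparse.ArgumentParser(description="train/test switch sketch")
    parser.add_argument("--go_test", action="store_true",
                        help="run evaluation instead of training")
    return parser


def select_mode(argv):
    """Return "test" when --go_test is present, otherwise "train"."""
    args = build_parser().parse_args(argv)
    return "test" if args.go_test else "train"


if __name__ == "__main__":
    import sys
    print(select_mode(sys.argv[1:]))
```

With `action="store_true"`, the flag takes no value: its presence alone switches the mode.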

Cite this paper:

@article{ma2021text,
title={A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution},
author={Ma, Jianqi and Liang, Zhetong and Zhang, Lei},
journal={},
year={2022}
}

Owner

MA Jianqi, shiki
