This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Last update: Dec 22, 2022

Overview

Gated Recurrent Convolution Neural Network for OCR

This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: https://papers.nips.cc/paper/6637-gated-recurrent-convolution-neural-network-for-ocr.pdf

Update

The journal version of GRCNN has been accepted by T-PAMI 2021, and the code is available at:

https://github.com/Jianf-Wang/GRCNN

Build

The GRCNN is built upon the CRNN. The requirements are:

Ubuntu 14.04
CUDA 7.5
CUDNN 5

To simplify compilation, we provide the dependencies here: https://pan.baidu.com/s/1c21zl1e#list/path=%2F

It is more convenient to use the nvidia-docker image (supplied by @rremani): https://hub.docker.com/r/rremani/cuda_crnn_torch/

After installing the dependencies, go to src/ and execute build_cpp.sh to build the C++ code. If successful, a file named libcrnn.so should be produced in the src/ directory.

Inference

We provide the pretrained model from here. Put the downloaded model file into the model/GRCL/ directory. We also provide the IC03 dataset in the "./data/IC03" directory. Before running the evaluation, change the directories listed in "test.txt" so that they point to the images on your machine. "test_label.txt" contains the ground truth for each image, and "lexicon_50.txt" is the IC03 lexicon.
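Editing the directories in "test.txt" by hand is tedious for a long list, so the change can be scripted. The sketch below assumes that each line of the file starts with an image path. That layout is an assumption rather than something documented here, and `rewrite_prefix`, `rewrite_list_file` and the example prefixes are hypothetical names used only for illustration.

```python
from pathlib import Path


def rewrite_prefix(lines, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix on every line that starts with it.

    Lines that do not start with old_prefix are returned unchanged.
    """
    rewritten = []
    for line in lines:
        if line.startswith(old_prefix):
            line = new_prefix + line[len(old_prefix):]
        rewritten.append(line)
    return rewritten


def rewrite_list_file(list_path, old_prefix, new_prefix):
    """Apply rewrite_prefix to a list file in place (assumed: one entry per line)."""
    path = Path(list_path)
    lines = path.read_text().splitlines()
    path.write_text("\n".join(rewrite_prefix(lines, old_prefix, new_prefix)) + "\n")


if __name__ == "__main__":
    # Hypothetical prefixes: replace them with the directory currently written
    # in test.txt and the directory where the IC03 images live on your machine.
    rewrite_list_file("data/IC03/test.txt", "/old/root/", "/new/root/")
```

Keep a copy of the original "test.txt" before running the script, because the file is overwritten in place.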

"src/evaluation.lua": Lexicon-free evaluation

"src/evaluation_lex.lua" Lexicon-based evaluation

The evaluation code will output the recognition accuracy.
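To make the two evaluation modes concrete, here is a minimal Python sketch of how recognition accuracy is commonly computed for scene-text benchmarks. It is not a port of the Lua scripts. It assumes case-insensitive exact matching, and for the lexicon-based mode it assumes each prediction is replaced by the lexicon word with the smallest edit distance before comparison. The function names and the per-image lexicon layout are illustrative assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def snap_to_lexicon(prediction, lexicon):
    """Return the lexicon word closest to the prediction in edit distance."""
    return min(lexicon, key=lambda word: edit_distance(prediction, word))


def accuracy(predictions, labels, lexicons=None):
    """Fraction of predictions that equal their label, ignoring case.

    lexicons: optional list with one word list per image (assumed layout).
    When given, each prediction is first snapped to its lexicon.
    """
    correct = 0
    for index, (pred, label) in enumerate(zip(predictions, labels)):
        pred = pred.lower()
        if lexicons is not None:
            words = [word.lower() for word in lexicons[index]]
            pred = snap_to_lexicon(pred, words)
        if pred == label.lower():
            correct += 1
    return correct / len(labels)


if __name__ == "__main__":
    preds = ["hel1o", "world", "cat"]
    truth = ["hello", "world", "dog"]
    lex = [["hello", "help"], ["word", "world"], ["dog", "dot"]]
    print(accuracy(preds, truth))       # lexicon-free
    print(accuracy(preds, truth, lex))  # lexicon-based
```

In this toy example the lexicon corrects "hel1o" to "hello", which is why lexicon-based accuracy is typically at least as high as lexicon-free accuracy on the same predictions.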

Train a new model

Follow these steps to train a new model on your own dataset.

Create a new LMDB dataset with src/create_own_dataset.py (you need to pip install lmdb first).
You can modify the configuration in model/GRCL/GRCL_LSTM_pretrain.lua
Go to src/ and execute th main_train.lua ../model/GRCL/ ../model/saved_model. Model snapshots will be saved into ../model/saved_model.
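For orientation, the first step can be sketched as follows. This is not the contents of src/create_own_dataset.py. It assumes the key layout commonly used by CRNN-style dataset tools ("image-%09d", "label-%09d" and "num-samples", counted from 1), so check the provided script before relying on it. `build_cache` and `write_lmdb` are hypothetical helper names.

```python
def build_cache(samples):
    """Map (image_bytes, label) pairs to LMDB key/value pairs.

    Assumed key layout (CRNN-style, 1-based): image-000000001,
    label-000000001, ..., plus num-samples holding the sample count.
    """
    cache = {}
    count = 0
    for image_bytes, label in samples:
        count += 1
        cache[f"image-{count:09d}".encode("ascii")] = image_bytes
        cache[f"label-{count:09d}".encode("ascii")] = label.encode("utf-8")
    cache[b"num-samples"] = str(count).encode("ascii")
    return cache


def write_lmdb(output_dir, samples, map_size=1 << 30):
    """Write the samples to an LMDB environment in output_dir.

    map_size is the maximum database size in bytes (1 GiB here, an
    arbitrary choice); raise it for larger datasets.
    """
    import lmdb  # third-party package: pip install lmdb

    env = lmdb.open(output_dir, map_size=map_size)
    with env.begin(write=True) as txn:  # commits when the block exits cleanly
        for key, value in build_cache(samples).items():
            txn.put(key, value)
    env.close()


if __name__ == "__main__":
    # Hypothetical input: pairs of image file path and ground-truth text.
    pairs = [("images/word_1.jpg", "hello"), ("images/word_2.jpg", "world")]
    samples = []
    for image_path, text in pairs:
        with open(image_path, "rb") as handle:
            samples.append((handle.read(), text))
    write_lmdb("my_dataset_lmdb", samples)
```

The images are stored as raw encoded bytes (for example, the JPEG file contents), so the training code can decode them when it reads the database.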

Visualization

We visualize the RCNN, DenseNet and GRCNN to verify the dynamic receptive fields in GRCNN for OCR. There are clear gaps between different characters, and for each character, the unrelated parts do not produce a strong signal.

Citation

@inproceedings{jianfeng2017deep,
 author    = {Wang, Jianfeng and Hu, Xiaolin},
 title     = {Gated Recurrent Convolution Neural Network for OCR},
 booktitle = {Advances in Neural Information Processing Systems},
 year      = {2017}
}
