PyTorch implementation of PSEnet with a Pyramid Attention Network as the feature extractor

Last update: Oct 10, 2022

Overview

Scene Text-Spotting based on PSEnet+CRNN

PyTorch implementation of an end-to-end text spotter that combines a PSEnet text detector with a CRNN text recognizer. We plan to grow this repository into an open research platform for multi-lingual text detection and recognition in natural scene images, targeted at low-resource languages.
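
The two-stage flow described above (detect text regions, then recognize each region) can be sketched as follows. This is only an illustration of the general detector-plus-recognizer pattern, not the code in this repository: the names `crop`, `spot_text`, `dummy_detect` and `dummy_recognize` are hypothetical, the image is a plain nested list, and regions are simplified to axis-aligned boxes, whereas a real detector such as PSEnet may return arbitrary polygons.

```python
from typing import Callable, List, Tuple

# Hypothetical, simplified types: an axis-aligned box and a grayscale image.
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), max is exclusive
Image = List[List[int]]           # rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of `image`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def spot_text(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Run the detector once, then the recognizer on every detected region."""
    return [(box, recognize(crop(image, box))) for box in detect(image)]


# Stand-ins for the real models, used only to make the sketch runnable.
def dummy_detect(image: Image) -> List[Box]:
    return [(0, 0, 2, 1), (1, 1, 3, 3)]


def dummy_recognize(patch: Image) -> str:
    # Report the patch size as "widthxheight" instead of real text.
    return "{}x{}".format(len(patch[0]), len(patch))


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
results = spot_text(image, dummy_detect, dummy_recognize)
print(results)
```

Passing the detector and recognizer in as callables keeps the two stages independent, so either model could be swapped without touching the pipeline function.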

Requirements

Python 3.6.5
PyTorch 1.2
pyclipper
Polygon 3.0.8
OpenCV 3.4.1
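
As a quick sanity check before running the demo, a small script along these lines can report whether the listed packages are importable. It is a minimal sketch: the import names `torch`, `pyclipper`, `Polygon` and `cv2` are the usual module names for these packages, and the helper `check_environment` is hypothetical, not part of this repository. It does not verify exact version numbers.

```python
import importlib.util
import sys

# Import name -> label from the requirements list above.
REQUIRED_MODULES = {
    "torch": "Pytorch",
    "pyclipper": "pyclipper",
    "Polygon": "Polygon",
    "cv2": "OpenCV",
}


def check_environment():
    """Return {label: True/False} telling whether each module can be found."""
    status = {}
    for module, label in REQUIRED_MODULES.items():
        try:
            status[label] = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            # Treat any lookup problem as "not available".
            status[label] = False
    return status


if __name__ == "__main__":
    print("Python %d.%d" % sys.version_info[:2])
    for label, found in check_environment().items():
        print("%-10s %s" % (label, "found" if found else "MISSING"))
```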

Demo

Download the trained CRNN and PSEnet models from the link in the Pre-trained Models section below.
Copy the paths of the downloaded models into params.py.
Run end-end.py:

python end-end.py --img [path to image] --e2e_config_name [end to end config name]
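
For readers unfamiliar with how a script receives such flags, the sketch below shows one way the two options in the command above could be parsed with `argparse`. It is an assumption for illustration and may differ from what end-end.py actually does: the function `build_parser`, the help strings, and marking both flags as required are hypothetical, and `samples/scene.jpg` and `my_config` are placeholder values.

```python
import argparse


def build_parser():
    """Build a parser for the two flags used by the demo command (hypothetical)."""
    parser = argparse.ArgumentParser(description="End-to-end text spotting demo")
    parser.add_argument("--img", required=True,
                        help="path to the input image")
    parser.add_argument("--e2e_config_name", required=True,
                        help="name of the end-to-end configuration to use")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--img", "samples/scene.jpg", "--e2e_config_name", "my_config"]
)
print(args.img, args.e2e_config_name)
```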

Pre-trained Models

Both PSEnet and CRNN pre-trained models can be found here: gdrive

The PSEnet model is a multi-lingual text detector trained on MLT 2019. It works quite well.
The CRNN model recognizes Hindi, Bangla, Malayalam, Kannada, Tamil, Telugu, Odia, Sanskrit, and Marathi.

Download the models into the models/ directory and update the paths in params.py if required.

Training instructions

To train your own detection model refer to this file.
To train your own recognition model refer to this file.

Samples

Contributors

Azhar Shaikh, PES University LinkedIn
Nishant Sinha, OffNote Labs

This work was done as part of an internship with OffNote Labs.

References

If this repository helps you, please star it. Thank you!
