TensorFlow Implementation of FOTS, Fast Oriented Text Spotting with a Unified Network.

Last update: Nov 11, 2022

Overview

FOTS: Fast Oriented Text Spotting with a Unified Network

I am still working on this repo. Updates and detailed instructions are coming soon!

Table of Contents

TensorFlow Versions
Other Requirements
Trained Models
Datasets
Train
- Pre-train with SynthText
- Finetune with ICDAR 2015, ICDAR 2017 MLT or ICDAR 2013
Test
References

TensorFlow Versions

As of now, the pre-training code has been tested on TensorFlow 1.12, 1.14, and 1.15. I may implement a 2.x version in the future.

Other Requirements

GCC >= 6

Trained Models

Temporary pre-trained model
Trained model coming soon

Datasets

Pre-training
Synth800k (this dataset is available only for non-commercial research and educational purposes)
Fine-tuning
ICDAR 2015, ICDAR 2017 MLT, ICDAR 2013

Train

Pre-train with SynthText

Download the pre-trained ResNet-50 from the TensorFlow-Slim image classification model library page and place it in the ckpt/resnet_v1_50/ directory.

cd ckpt/resnet_v1_50
wget http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz
tar -zxvf resnet_v1_50_2016_08_28.tar.gz
rm resnet_v1_50_2016_08_28.tar.gz

Download the Synth800k dataset and place it in the data/SynthText/ directory to pre-train the whole network.
Transform (pre-process) the SynthText data into the ICDAR data format.

python data_provider/SynthText2ICDAR.py
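
As a rough illustration of what this conversion involves, the sketch below turns one image's SynthText word boxes into ICDAR 2015-style lines (`x1,y1,x2,y2,x3,y3,x4,y4,transcription`). It assumes the `wordBB` layout of SynthText's `gt.mat` (2 x 4 x number of words) and that each `txt` entry may hold several whitespace-separated words. The function name `synthtext_to_icdar_lines` is hypothetical, and this is not the code in `data_provider/SynthText2ICDAR.py`.

```python
def synthtext_to_icdar_lines(word_bb, txt):
    """Convert one image's SynthText annotation into ICDAR-style lines.

    word_bb: indexable as word_bb[axis][corner][word] (axis 0 = x, 1 = y),
             i.e. the 2 x 4 x N layout assumed for SynthText's `wordBB`.
             Nested lists and NumPy arrays both work.
    txt:     list of strings; each entry may contain several words.
    """
    # Flatten the text entries into individual words, preserving order.
    words = [w for entry in txt for w in entry.split()]
    lines = []
    for i, word in enumerate(words):
        coords = []
        for corner in range(4):
            # Round to integer pixel coordinates, x first, then y.
            coords.append(str(int(round(word_bb[0][corner][i]))))
            coords.append(str(int(round(word_bb[1][corner][i]))))
        lines.append(",".join(coords + [word]))
    return lines


# Hypothetical annotation with two words in one text entry.
example_bb = [
    [[10.2, 50.0], [30.7, 90.0], [30.7, 90.0], [10.2, 50.0]],  # x per corner
    [[5.0, 5.0], [5.0, 5.0], [15.4, 15.0], [15.4, 15.0]],      # y per corner
]
for line in synthtext_to_icdar_lines(example_bb, ["Hello\nWorld"]):
    print(line)
```

Each returned line would be written to one ground-truth text file per image.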

Train with SynthText for 10 epochs (with 1 GPU).

python train.py \
  --max_steps=715625 \
  --gpu_list='0' \
  --checkpoint_path=ckpt/synthText_10eps/ \
  --pretrained_model_path=ckpt/resnet_v1_50/resnet_v1_50.ckpt \
  --training_img_data_dir=data/SynthText/ \
  --training_gt_data_dir=data/SynthText/ \
  --icdar=False
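
The value of --max_steps can be sanity-checked with a little arithmetic. The sketch below assumes the SynthText release contains 858,750 images and a batch size of 12; the batch size is not stated in this README and is only inferred from the step count, so adjust both numbers to your setup. The helper name `steps_for_epochs` is hypothetical.

```python
def steps_for_epochs(num_images, epochs, batch_size):
    """Number of training steps needed to see `num_images` images `epochs` times."""
    return (num_images * epochs) // batch_size


SYNTHTEXT_IMAGES = 858750  # assumed image count of the SynthText (Synth800k) release
ASSUMED_BATCH_SIZE = 12    # assumption: inferred from the step count, not from the repo

print(steps_for_epochs(SYNTHTEXT_IMAGES, 10, ASSUMED_BATCH_SIZE))  # 715625
```

If you train with a different batch size or number of GPUs, recompute the step count the same way.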

Visualize pre-training progress with TensorBoard.

tensorboard --logdir=ckpt/synthText_10eps/

Finetune with ICDAR 2015, ICDAR 2017 MLT or ICDAR 2013

(If you are using the pre-trained model, place all of its files in ckpt/synthText_10eps/.)

Combine ICDAR data before training.
1. Place ICDAR data under the tmp/ folder.
2. Run the following script to combine the data.
```
python combine_ICDAR_data.py --year [year of ICDAR to train (13, 15, or 17)]
```
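
For orientation, the sketch below parses one ground-truth line in the ICDAR 2015 layout (`x1,y1,x2,y2,x3,y3,x4,y4,transcription`). It is only an illustration under that assumption and not the logic of `combine_ICDAR_data.py`; other ICDAR years use different layouts (for example, ICDAR 2017 MLT adds a script field), so the function name `parse_icdar15_line` and its return shape are hypothetical.

```python
def parse_icdar15_line(line):
    """Parse one `x1,y1,x2,y2,x3,y3,x4,y4,transcription` annotation line.

    Returns (quad, text, is_dont_care), where quad is a list of four (x, y) tuples.
    """
    # Annotation files may start with a UTF-8 byte-order mark; strip it.
    parts = line.strip().lstrip("\ufeff").split(",")
    coords = [int(v) for v in parts[:8]]
    # The transcription may itself contain commas, so re-join the remainder.
    text = ",".join(parts[8:])
    quad = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    # "###" marks illegible ("do not care") regions in ICDAR 2015.
    return quad, text, text == "###"


quad, text, ignore = parse_icdar15_line("\ufeff377,117,463,117,465,130,378,130,Genaxis Theatre\n")
print(quad, text, ignore)
```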

ICDAR 2017 MLT/pre-finetune for ICDAR 2013 or ICDAR 2015 (text detection task only)

Train the pre-trained model with 9,000 images from the ICDAR 2017 MLT training and validation datasets (with 1 GPU).

python train.py \
  --gpu_list='0' \
  --checkpoint_path=ckpt/ICDAR17MLT/ \
  --pretrained_model_path=ckpt/synthText_10eps/ \
  --train_stage=0 \
  --training_img_data_dir=data/ICDAR17MLT/imgs/ \
  --training_gt_data_dir=data/ICDAR17MLT/gts/

ICDAR 2015

Train the model with 1,000 images from the ICDAR 2015 training dataset and 229 images from the ICDAR 2013 training dataset (with 1 GPU).

python train.py \
  --gpu_list='0' \
  --checkpoint_path=ckpt/ICDAR15/ \
  --pretrained_model_path=ckpt/ICDAR17MLT/ \
  --training_img_data_dir=data/ICDAR15+13/imgs/ \
  --training_gt_data_dir=data/ICDAR15+13/gts/

ICDAR 2013 (horizontal text only)

Train the model with 229 images from the ICDAR 2013 training dataset (with 1 GPU).

python train.py \
  --gpu_list='0' \
  --checkpoint_path=ckpt/ICDAR13/ \
  --pretrained_model_path=ckpt/ICDAR17MLT/ \
  --training_img_data_dir=data/ICDAR13/imgs/ \
  --training_gt_data_dir=data/ICDAR13/gts/

Test

Place some images in the test_imgs/ directory and specify a trained checkpoint path to see the test results.

python test.py --test_data_path test_imgs/ --checkpoint_path [checkpoint path]

Owner

Masao Taketani
