Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection

Last update: Dec 21, 2022

Overview

This repository contains the supplementary code for the paper accepted at ICDAR 2021.

We highly recommend using the Docker image, because our model contains custom operations that depend on the framework and CUDA versions.
We provide trained models: final_checkpoint_ch8 for ICDAR 2017 and 2013, and final_checkpoint_ch4 for ICDAR 2015.
This code is mainly focused on inference. Training our model requires a training-class GPU such as the V100; please see our paper for details.

REQUIREMENT

Nvidia-docker
Tensorflow 1.14
Minimum GPU requirement: NVIDIA GTX 1080 Ti
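
If you run outside the provided Docker image, a quick check that the installed TensorFlow matches the 1.14 requirement can save a failed custom-operation build. The snippet below is a minimal sketch, not part of this repository: the helper names tf_version_ok and installed_tf_version are hypothetical, and it only compares version strings.

```python
import importlib

# Required version taken from the requirement list above.
REQUIRED_TF_PREFIX = "1.14"


def tf_version_ok(version, required_prefix=REQUIRED_TF_PREFIX):
    """Return True for '1.14' or any patch release such as '1.14.0'."""
    return version == required_prefix or version.startswith(required_prefix + ".")


def installed_tf_version():
    """Return tensorflow.__version__, or None if TensorFlow cannot be imported."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return None
    return tf.__version__
```

For example, tf_version_ok(installed_tf_version() or "") is False when TensorFlow is missing or is a different release, in which case the Docker image is the safer route.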

INSTALLATION

Build the Docker image and create the container:

docker build --tag rbimage ./dockerfile
docker run --runtime=nvidia --name rbcontainer -v /rotated-box-is-back-path:/rotated-box-is-back -i -t rbimage /bin/bash

Build the custom operations inside the container:

cd /rotated-box-is-back/nms 
cmake ./
make
./shell.sh

SAMPLE IMAGE INFERENCE

cd /rotated-box-is-back/
python viz.py --test_data_path=./sample --checkpoint_path=./final_checkpoint_ch8 --output_dir=./sample_result  --thres 0.6 --min_size=1600 --max_size=2000

ICDAR 2017 INFERENCE

Replace icdar_testset_path with the path to your ICDAR 2017 test set folder.

python viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch8 --output_dir=./ic17  --thres 0.6 --min_size=1600 --max_size=2000

ICDAR 2015 INFERENCE

Replace icdar_testset_path with the path to your ICDAR 2015 test set folder.
To convert the results to the evaluation format, run text_postprocessing.py on the result text files as shown below.

python viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch4 --output_dir=./ic15  --thres 0.7 --min_size=1100 --max_size=2000
python text_postprocessing.py -i=./ic15/ -o=./ic15_format/ -e True

ICDAR 2013 INFERENCE

Replace icdar_testset_path with the path to your ICDAR 2013 test set folder.
To convert the results to the evaluation format, run text_postprocessing.py on the result text files as shown below.

python viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch8 --output_dir=./ic13  --thres 0.55 --min_size=700 --max_size=900
python text_postprocessing.py -i=./ic13/ -o=./ic13_format/ -e True -m rec
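
The three benchmark commands above differ only in checkpoint, threshold, and image size limits. The sketch below collects those values in one table and builds the viz.py argument list from it. It is an illustrative helper, not part of this repository: SETTINGS and build_viz_command are hypothetical names, and the values are copied from the commands in this README.

```python
# Per-dataset inference settings copied from the commands in this README.
SETTINGS = {
    "ic17": {"checkpoint": "./final_checkpoint_ch8", "thres": "0.6",
             "min_size": 1600, "max_size": 2000},
    "ic15": {"checkpoint": "./final_checkpoint_ch4", "thres": "0.7",
             "min_size": 1100, "max_size": 2000},
    "ic13": {"checkpoint": "./final_checkpoint_ch8", "thres": "0.55",
             "min_size": 700, "max_size": 900},
}


def build_viz_command(dataset, test_data_path):
    """Return the viz.py invocation for one dataset as an argument list."""
    s = SETTINGS[dataset]
    return [
        "python", "viz.py",
        "--test_data_path=" + test_data_path,
        "--checkpoint_path=" + s["checkpoint"],
        "--output_dir=./" + dataset,
        "--thres", s["thres"],
        "--min_size=%d" % s["min_size"],
        "--max_size=%d" % s["max_size"],
    ]


# Print the ICDAR 2015 command; pass the list to subprocess.run to execute it.
print(" ".join(build_viz_command("ic15", "icdar_testset_path")))
```

The list form can be handed to subprocess.run(cmd, check=True) from /rotated-box-is-back/. The text_postprocessing.py step for ICDAR 2015 and 2013 still has to be run afterwards.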

EVALUATION TABLE

Dataset	Precision (P)	Recall (R)	F-score (F)
IC13	95.9	89.1	92.4
IC15	89.7	84.2	86.9
IC17	83.4	68.2	75.0
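
As a sanity check on the table, the F values are consistent with the usual harmonic mean of P and R, F = 2PR / (P + R). The README does not state the formula, so treat that as an assumption. The sketch below recomputes F from the reported P and R; f_score and REPORTED are illustrative names.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (P, R, F) triples copied from the evaluation table.
REPORTED = {
    "IC13": (95.9, 89.1, 92.4),
    "IC15": (89.7, 84.2, 86.9),
    "IC17": (83.4, 68.2, 75.0),
}

for name, (p, r, f) in REPORTED.items():
    # Print the recomputed F next to the reported one for comparison.
    print(name, round(f_score(p, r), 1), f)
```

Rounded to one decimal place, the recomputed values agree with the reported F column for all three datasets.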

TRAINING

The model can be trained with the command below:

python train_refine_estimator.py --input_size=1024 --batch_size=2 --checkpoint_path=./finetuning --training_data_path=your-image-path --training_gt_path=your-gt-path  --learning_rate=0.00001 --max_epochs=500  --save_summary_steps=1000 --warmup_path=./final_checkpoint_ch8

ACKNOWLEDGEMENT

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 1711125972, Audio-Visual Perception for Autonomous Rescue Drones).

CITATION

If you find this work helpful for your research, please cite:

Lee J., Lee J., Yang C., Lee Y., Lee J. (2021) Rotated Box Is Back: An Accurate Box Proposal Network for Scene Text Detection. In: Lladós J., Lopresti D., Uchida S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_4

Owner

NCSOFT
