A unofficial pytorch implementation of PAN(PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Overview

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Requirements

  • pytorch 1.1+
  • torchvision 0.3+
  • pyclipper
  • opencv3
  • gcc 4.9+

Download

PAN_resnet18_FPEM_FFM and PAN_resnet18_FPEM_FFM on icdar2015:

the updated model(resnet18:78.8,shufflenetv2: 72.4,lr:le-3) is not the best model

google drive

Data Preparation

train: prepare a text in the following format, use '\t' as a separator

/path/to/img.jpg path/to/label.txt
...

val: use a folder

img/ store img
gt/ store gt file

Train

  1. config the train_data_path,val_data_pathin config.json
  2. use following script to run
python3 train.py

Test

eval.py is used to test model on test dataset

  1. config model_path, img_path, gt_path, save_path in eval.py
  2. use following script to test
python3 eval.py

Predict

predict.py is used to inference on single image

  1. config model_path, img_path, in predict.py
  2. use following script to predict
python3 predict.py

The project is still under development.

Performance

ICDAR 2015

only train on ICDAR2015 dataset

Method image size (short size) learning rate Precision (%) Recall (%) F-measure (%) FPS
paper(resnet18) 736 x x x 80.4 26.1
my (ShuffleNetV2+FPEM_FFM+pse扩张) 736 1e-3 81.72 66.73 73.47 24.71 (P100)
my (resnet18+FPEM_FFM+pse扩张) 736 1e-3 84.93 74.09 79.14 21.31 (P100)
my (resnet50+FPEM_FFM+pse扩张) 736 1e-3 84.23 76.12 79.96 14.22 (P100)
my (ShuffleNetV2+FPEM_FFM+pse扩张) 736 1e-4 75.14 57.34 65.04 24.71 (P100)
my (resnet18+FPEM_FFM+pse扩张) 736 1e-4 83.89 69.23 75.86 21.31 (P100)
my (resnet50+FPEM_FFM+pse扩张) 736 1e-4 85.29 75.1 79.87 14.22 (P100)
my (resnet18+FPN+pse扩张) 736 1e-3 76.50 74.70 75.59 14.47 (P100)
my (resnet50+FPN+pse扩张) 736 1e-3 71.82 75.73 73.72 10.67 (P100)
my (resnet18+FPN+pse扩张) 736 1e-4 74.19 72.34 73.25 14.47 (P100)
my (resnet50+FPN+pse扩张) 736 1e-4 78.96 76.27 77.59 10.67 (P100)

examples

todo

  • MobileNet backbone

  • ShuffleNet backbone

reference

  1. https://arxiv.org/pdf/1908.05900.pdf
  2. https://github.com/WenmuZhou/PSENet.pytorch

If this repository helps you,please star it. Thanks.

Owner
zhoujun
深度学习工程师,最近准备做端侧
zhoujun
Pytorch reimplementation of the Mixer (MLP-Mixer: An all-MLP Architecture for Vision)

MLP-Mixer Pytorch reimplementation of Google's repository for the MLP-Mixer (Not yet updated on the master branch) that was released with the paper ML

Eunkwang Jeon 18 Dec 08, 2022
Shape Matching of Real 3D Object Data to Synthetic 3D CADs (3DV project @ ETHZ)

Real2CAD-3DV Shape Matching of Real 3D Object Data to Synthetic 3D CADs (3DV project @ ETHZ) Group Member: Yue Pan, Yuanwen Yue, Bingxin Ke, Yujie He

24 Jun 22, 2022
Platform-agnostic AI Framework 🔥

🇬🇧 TensorLayerX is a multi-backend AI framework, which can run on almost all operation systems and AI hardwares, and support hybrid-framework progra

TensorLayer Community 171 Jan 06, 2023
MAGMA - a GPT-style multimodal model that can understand any combination of images and language

MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning Authors repo (alphabetical) Constantin (CoEich), Mayukh (Mayukh

Aleph Alpha GmbH 331 Jan 03, 2023
Neural Network to colorize grayscale images

#colornet Neural Network to colorize grayscale images Results Grayscale Prediction Ground Truth Eiji K used colornet for anime colorization Sources Au

Pavel Hanchar 3.6k Dec 24, 2022
[NeurIPS 2021] PyTorch Code for Accelerating Robotic Reinforcement Learning with Parameterized Action Primitives

Robot Action Primitives (RAPS) This repository is the official implementation of Accelerating Robotic Reinforcement Learning via Parameterized Action

Murtaza Dalal 55 Dec 27, 2022
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •

Pytorch Lightning 21.1k Jan 08, 2023
A library for answering questions using data you cannot see

A library for computing on data you do not own and cannot see PySyft is a Python library for secure and private Deep Learning. PySyft decouples privat

OpenMined 8.5k Jan 02, 2023
Code for "Adversarial Training for a Hybrid Approach to Aspect-Based Sentiment Analysis

HAABSAStar Code for "Adversarial Training for a Hybrid Approach to Aspect-Based Sentiment Analysis". This project builds on the code from https://gith

1 Sep 14, 2020
PyTorch code for ICLR 2021 paper Unbiased Teacher for Semi-Supervised Object Detection

Unbiased Teacher for Semi-Supervised Object Detection This is the PyTorch implementation of our paper: Unbiased Teacher for Semi-Supervised Object Detection

Facebook Research 366 Dec 28, 2022
This repo tries to recognize faces in the dataset you created

YÜZ TANIMA SİSTEMİ Bu repo oluşturacağınız yüz verisetlerini tanımaya çalışan ma

Mehdi KOŞACA 2 Dec 30, 2021
Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation

Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation Woncheol Shin1, Gyubok Lee1, Jiyoung Lee1, Joonseok Lee2,3, Edward Ch

Woncheol Shin 7 Sep 26, 2022
“英特尔创新大师杯”深度学习挑战赛 赛道3:CCKS2021中文NLP地址相关性任务

ccks2021-track3 CCKS2021中文NLP地址相关性任务-赛道三-冠军方案 团队:我的加菲鱼- wodejiafeiyu 初赛第二/复赛第一/决赛第一 前言 19年开始,陆陆续续参加了一些比赛,拿到过一些top,比较懒一直都没分享过,这次比较幸运又拿了top1,打算分享下 分类的任务

shaochenjie 131 Dec 31, 2022
Code for the paper SphereRPN: Learning Spheres for High-Quality Region Proposals on 3D Point Clouds Object Detection, ICIP 2021.

SphereRPN Code for the paper SphereRPN: Learning Spheres for High-Quality Region Proposals on 3D Point Clouds Object Detection, ICIP 2021. Authors: Th

Thang Vu 15 Dec 02, 2022
Model Zoo of BDD100K Dataset

Model Zoo of BDD100K Dataset

ETH VIS Group 200 Dec 27, 2022
[CVPR 2021] MiVOS - Mask Propagation module. Reproduced STM (and better) with training code :star2:. Semi-supervised video object segmentation evaluation.

MiVOS (CVPR 2021) - Mask Propagation Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang [arXiv] [Paper PDF] [Project Page] [Papers with Code] This repo impleme

Rex Cheng 106 Jan 03, 2023
Predictive Modeling on Electronic Health Records(EHR) using Pytorch

Predictive Modeling on Electronic Health Records(EHR) using Pytorch Overview Although there are plenty of repos on vision and NLP models, there are ve

81 Jan 01, 2023
NP DRAW paper released code

NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation This repo contains the official implementation for the NP-DRAW paper.

ZENG Xiaohui 22 Mar 13, 2022
PyTorch module to use OpenFace's nn4.small2.v1.t7 model

OpenFace for Pytorch Disclaimer: This codes require the input face-images that are aligned and cropped in the same way of the original OpenFace. * I m

Pete Tae-hoon Kim 176 Dec 12, 2022
The source code of "SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation", accepted to WACV 2022.

SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation The source code of our work "SIDE: Center-based Stereo 3D Detecto

10 Dec 18, 2022