Key information extraction from invoice document with Graph Convolution Network

Last update: Dec 16, 2022

Overview

Key Information Extraction from Scanned Invoices

Related blog post from my Viblo account: https://viblo.asia/p/djeZ1yPGZWz

Models

Background subtraction: U2Net
Image alignment: based on the text-detection output, using cv2
Text detection: CRAFT and an in-house text-detection model
Text recognition: VietOCR and an in-house text-recognition model
KIE: Graph Convolution
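
To make the KIE step more concrete, here is a minimal NumPy sketch of one graph-convolution layer applied to detected text boxes. It is not the model used in this repository: the k-nearest-neighbour graph construction, the function names build_knn_adjacency and gcn_layer, the feature size, and the random weights are all assumptions chosen for illustration.

```python
import numpy as np


def build_knn_adjacency(centers, k=2):
    """Connect every text box to its k nearest boxes (symmetric, no self-loops).

    `centers` holds one (x, y) box centre per detected text region.
    This graph construction is an assumption, not the repository's method.
    """
    centers = np.asarray(centers, dtype=float)
    n = len(centers)
    # Pairwise Euclidean distances between box centres.
    dists = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a box is never its own neighbour
    adj = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(dists[i])[:k]:
            if j != i:
                adj[i, j] = adj[j, i] = 1.0
    return adj


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)


# Toy example: four text boxes laid out on a 2 x 2 grid.
centers = [(10, 10), (120, 10), (10, 40), (120, 40)]
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))   # hypothetical per-box text/layout embedding
weight = rng.normal(size=(8, 5))     # hypothetical projection to 5 field classes

adj = build_knn_adjacency(centers, k=2)
node_scores = gcn_layer(adj, features, weight)
print(node_scores.shape)
```

In a trained model the weights would be learned and several such layers would be stacked before a per-node classifier assigns each text box to a field.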

Currently, I don't have an invoice-direction classifier model. However, you can develop a model that rotates the image back to upright when it is turned sideways or upside down.
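
As an illustration of the rotation step described above, the sketch below undoes a rotation once some classifier has predicted it. The classifier itself is not included; the label set ORIENTATIONS and the function name undo_rotation are hypothetical, and the sketch assumes the prediction is the clockwise rotation of the image in degrees.

```python
import numpy as np

# Hypothetical label set: how far the input image is rotated clockwise.
ORIENTATIONS = (0, 90, 180, 270)


def undo_rotation(image, predicted_degrees):
    """Return the image rotated back to upright.

    `image` is an H x W or H x W x C array; `predicted_degrees` is the
    clockwise rotation predicted by an orientation classifier (assumed).
    """
    if predicted_degrees not in ORIENTATIONS:
        raise ValueError(f"unsupported orientation: {predicted_degrees}")
    # np.rot90 turns counter-clockwise, so k quarter-turns undo a
    # clockwise rotation of k * 90 degrees.
    return np.rot90(image, k=predicted_degrees // 90)


# Simulate an invoice photo that was turned 90 degrees clockwise.
upright = np.arange(24).reshape(2, 3, 4)
rotated = np.rot90(upright, k=-1)
restored = undo_rotation(rotated, 90)
print(np.array_equal(restored, upright))
```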

Pretrained model

Google Drive

Data

MC-OCR, a Vietnamese receipts dataset: https://aihub.vn/competitions/1
Preprocessed data: Google Drive

Pipeline

TODO

Command

Create virtual environment using conda or virtualenv

# with virtualenv
virtualenv -p python3 invoice_env
# activate environment
source invoice_env/bin/activate
# install prerequisite libraries
pip install -r requirements.txt

# 1st command, run API
make serve
# 2nd command, run web-gui with streamlit
make runapp

Then open the local server in a browser at http://0.0.0.0:7778 (or http://localhost:7778).
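
If the page does not load, a quick way to tell whether anything is listening on that port is a plain TCP check. This is only a sketch: the helper name is_port_open is hypothetical, and it checks the port number given above without assuming anything about the routes the server exposes.

```python
import socket


def is_port_open(host="127.0.0.1", port=7778, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("server reachable:", is_port_open())
```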

Preview

TODO

Add a data-preprocessing script

Reference

MC-OCR dataset: https://aihub.vn/competitions/1
U2Net: https://github.com/xuebinqin/U-2-Net
CRAFT: https://github.com/clovaai/CRAFT-pytorch
VietOCR: https://github.com/pbcquoc/vietocr
Benchmarking GNNs: https://github.com/graphdeeplearning/benchmarking-gnns
PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR

Owner

Phan Hoang
