Bangla handwritten document digitization

Last update: Dec 10, 2021

Overview

This repo addresses the problem of digitizing handwritten documents in Bangla. The documents have fixed fields that hold specific information. We target these fields and crop the corresponding regions.

We focus only on extracting the amount (in currency), which is important in tax returns. Our approach first segments the characters and separates digits from non-digit characters. The per-character classification results are then merged to obtain the full amount.
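
The merging step could look roughly like the sketch below. It is an illustration only, not the repo's actual code: the function names, the `NON_DIGIT` label, and the assumption that the classifier emits one integer label per character in left-to-right order are all hypothetical.

```python
# Hypothetical sketch: names and the label scheme are assumptions,
# not the repository's actual API.

BANGLA_DIGITS = "০১২৩৪৫৬৭৮৯"  # Unicode U+09E6..U+09EF, index = digit value
NON_DIGIT = -1  # assumed label for characters classified as non-numbers


def merge_amount(predictions):
    """Merge left-to-right per-character labels into an integer amount.

    `predictions` is a list of integer labels: 0-9 for digits and
    NON_DIGIT for everything else. Returns None if no digit is present.
    """
    # Drop every character the classifier marked as a non-number.
    digits = [p for p in predictions if p != NON_DIGIT]
    if not digits:
        return None
    # Accumulate the digits positionally, most significant first.
    amount = 0
    for d in digits:
        amount = amount * 10 + d
    return amount


def to_bangla(amount):
    """Render a non-negative integer using Bangla digit characters."""
    return "".join(BANGLA_DIGITS[int(ch)] for ch in str(amount))


if __name__ == "__main__":
    preds = [NON_DIGIT, 1, 2, 5, 0, 0, NON_DIGIT]
    amount = merge_amount(preds)
    print(amount)             # 12500
    print(to_bangla(amount))  # ১২৫০০
```

This sketch simply discards non-digit characters. A real pipeline might instead treat some of them (for example a decimal separator) as meaningful, which this example does not attempt.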

Result

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Owner

Mushfiqur Rahman

