spaCy plugin for Transformers, Udify, ELMo, etc.

Last update: Nov 21, 2022

Overview

Camphr - spaCy plugin for Transformers, Udify, ELMo, etc.

Camphr is a Natural Language Processing library that integrates a wide variety of techniques, from state-of-the-art to conventional ones, into spaCy. You can use Transformers, Udify, ELMo, etc. in a spaCy pipeline.

Check the documentation for more information.

(For Japanese: https://qiita.com/tamurahey/items/53a1902625ccaac1bb2f)

Features

A spaCy plugin - Easy integration of a wide variety of methods
Transformers with spaCy - Fine-tuning pretrained models with Hydra; embedding vectors
Udify - BERT-based multitask model in 75 languages
ELMo - Deep contextualized word representations
Rule-based matching with Aho-Corasick, Regex
(for Japanese) KNP
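
To make the Aho-Corasick feature concrete, here is a minimal sketch of multi-pattern search in plain Python. It is an independent illustration of the algorithm, not Camphr's implementation or API; the names `build_automaton` and `find_all` are hypothetical and exist only for this example.

```python
from collections import deque


def build_automaton(patterns):
    """Build trie transitions, failure links, and outputs for the patterns."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix
    out = [[]]    # out[state] -> patterns that end at this state

    # Phase 1: insert every pattern into the trie.
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)

    # Phase 2: breadth-first pass to fill in the failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the suffix state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, patterns):
    """Return (start, end, pattern) for every match; end is exclusive."""
    goto, fail, out = build_automaton(patterns)
    matches = []
    state = 0
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists (or we hit the root).
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            matches.append((i - len(pattern) + 1, i + 1, pattern))
    return matches


print(find_all("ushers", ["he", "she", "his", "hers"]))
```

The text is scanned once regardless of how many patterns are registered, which is why this approach suits dictionary-style rule matching; overlapping matches such as "she" and "he" are both reported.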

License

Camphr is licensed under Apache 2.0.

Comments

NER Problem

Hello!

First of all, I would like to thank you for the great work on Camphr. It's been very useful to me! Can you help me with a question? I used the library to train a named entity recognition (NER) model, but when I load the model with nlp = spacy.load("~/outputs/2020-04-30/22-28-36/models/9") and pass it a text (doc = nlp("I live in Brazil")), no entities are recognized (doc.ents returns ()). Could you tell me why this is happening?

opened by gabrielluz07 9

Gender and number subtags generation

I was comparing the default morpho-syntactic tags generated by camphr-udify and https://github.com/Hyperparticle/udify.

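import spacy
import stanza
from spacy_conll import ConllFormatter

# Load the en_udify model and append the CoNLL formatter as the last pipe (spaCy 2.x API).
nlp = spacy.load("en_udify")
conllformatter = ConllFormatter(nlp)
nlp.add_pipe(conllformatter, last=True)

# Parse one sentence and print it in CoNLL format.
doc = nlp("Mother Teresa devoted her entire life to helping others")
print(doc._.conll_str)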

1	Mother	Mother	PROPN		_	2	compound	_	_
2	Teresa	Teresa	PROPN		_	3	nsubj	_	_
3	devoted	devote	VERB		_	0	root	_	_
4	her	her	PRON		_	6	nmod:poss	_	_
5	entire	entire	ADJ		_	6	amod	_	_
6	life	life	NOUN		_	3	obj	_	_
7	to	to	SCONJ		_	8	mark	_	_
8	helping	help	VERB		_	3	advcl	_	_
9	others	other	NOUN		_	8	obj	_	SpaceAfter=No

Tags returned by https://github.com/Hyperparticle/udify, for the same input.

prediction:
1	Mother	Mother	PROPN	_	Number=Sing	2	compound	_	_
2	Teresa	Teresa	PROPN	_	Number=Sing	3	nsubj	_	_
3	devoted	devote	VERB	_	Mood=Ind|Tense=Past|VerbForm=Fin	0	root	_	_
4	her	her	PRON	_	Gender=Fem|Number=Sing|Person=3|Poss=Yes|PronType=Prs	6	nmod:poss	_	_
5	entire	entire	ADJ	_	Degree=Pos	6	amod	_	_
6	life	life	NOUN	_	Number=Sing	3	obj	_	_
7	to	to	SCONJ	_	_	8	mark	_	_
8	helping	help	VERB	_	VerbForm=Ger	3	advcl	_	_
9	others	other	NOUN	_	Number=Plur	8	obj	_	_

Gender and number subtags are missing in camphr-udify. Could we have those included by default please?
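
The comparison above can be checked mechanically. The following is a minimal sketch in plain Python, assuming both outputs are tab-separated 10-column CoNLL-U strings (FEATS is the sixth column). The helper names `feats_by_token` and `missing_feats` are hypothetical and belong to neither camphr nor udify; the two sample strings are copied from the outputs above.

```python
def feats_by_token(conll_str):
    """Map (token id, form) -> FEATS column for each CoNLL-U token line."""
    feats = {}
    for line in conll_str.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed 10-column token line
        feats[(cols[0], cols[1])] = cols[5]
    return feats


def missing_feats(candidate, reference):
    """Return tokens whose FEATS are '_' in candidate but set in reference."""
    cand = feats_by_token(candidate)
    ref = feats_by_token(reference)
    return {
        key: ref_feats
        for key, ref_feats in ref.items()
        if ref_feats != "_" and cand.get(key, "_") == "_"
    }


# Two token lines from each output above (camphr first, udify second).
camphr_out = (
    "4\ther\ther\tPRON\t\t_\t6\tnmod:poss\t_\t_\n"
    "6\tlife\tlife\tNOUN\t\t_\t3\tobj\t_\t_"
)
udify_out = (
    "4\ther\ther\tPRON\t_\tGender=Fem|Number=Sing|Person=3|Poss=Yes|PronType=Prs\t6\tnmod:poss\t_\t_\n"
    "6\tlife\tlife\tNOUN\t_\tNumber=Sing\t3\tobj\t_\t_"
)
print(missing_feats(camphr_out, udify_out))
```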

thanks, Ranjita

enhancement

opened by ranjita-naik 6

Camphr+KNP returns an incorrect dependency tag when using a specific adposition.
Hello. I am reporting a problem that occurs when analyzing universal dependencies in Japanese text using KNP. When I use the adposition “から”, camphr returns a wrong result: it shows the conj dependency tag on NOUN→VERB, whereas the expected result is the obl dependency tag on VERB→NOUN.

(Note that "再結晶" and "留去" are words I added manually, but other VERB words that exist in the original dictionary, such as "除去" and "撹拌", produce similarly incorrect results.) The same problem sometimes occurs with the adposition "と".

With other adpositions, such as “より” and “にて”, camphr returns a correct result.

Environment:

Docker(python:3.7-buster)

spacy = 2.3.2

camphr = 0.6.5

pyknp = 0.4.5

Juman++ ver.1.02

KNP ver.4.19
opened by undermakingbook 6
Python 3.8

Camphr is currently pinned at python < 3.8. Is there a specific reason for this, and if so, what can we do to help?

Edit: sorry, I just saw #19, still, what can we do to help?

opened by Evpok 5
Support multi labels textcat pipe for transformers
closes #9

Add TrfForMultiLabelSequenceClassification for multi-label text classification.

pipe name: transformers_multilabel_sequence_classifier

Add docs for fine-tuning the multi-label textcat pipe

https://github.com/PKSHATechnology-Research/camphr/blob/feature%2Fmulti-textcat/docs/source/notes/finetune_transformers.rst#multilabel-text-classification

enhancement
opened by tamuhey 5
unofficial-udify, allennlp, and transformers conflicting dependencies

I'm trying to install udify on WSL as shown below.

$ pip install unofficial-udify==0.3.0 [email protected]://github.com/PKSHATechnology-Research/camphr_models/releases/download/0.7.0/en_udify-0.7.tar.gz

ERROR: Cannot install unofficial-udify and unofficial-udify==0.3.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    unofficial-udify 0.3.0 depends on transformers<3.0.0 and >=2.3.0
    allennlp 1.3.0 depends on transformers<4.1 and >=4.0
    unofficial-udify 0.3.0 depends on transformers<3.0.0 and >=2.3.0
    allennlp 1.2.2 depends on transformers<3.6 and >=3.4
    unofficial-udify 0.3.0 depends on transformers<3.0.0 and >=2.3.0
    allennlp 1.2.1 depends on transformers<3.5 and >=3.1
    unofficial-udify 0.3.0 depends on transformers<3.0.0 and >=2.3.0
    allennlp 1.2.0 depends on transformers<3.5 and >=3.1
    unofficial-udify 0.3.0 depends on transformers<3.0.0 and >=2.3.0
    allennlp 1.1.0 depends on transformers<3.1 and >=3.0
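
The error boils down to empty intersections between version ranges: no transformers release can satisfy both packages at once. A minimal sketch of that check, using only the standard library and a simplified numeric version parser (it ignores pre-releases and other PEP 440 details, so it is not a substitute for pip's resolver). The helper names are hypothetical; the ranges are copied from the error message.

```python
def parse(version, length=3):
    """Turn '3.0' into (3, 0, 0); numeric release segments only."""
    parts = tuple(int(part) for part in version.split("."))
    return parts + (0,) * (length - len(parts))


def ranges_overlap(a, b):
    """Each range is (lower_inclusive, upper_exclusive) as version strings."""
    lower = max(parse(a[0]), parse(b[0]))
    upper = min(parse(a[1]), parse(b[1]))
    return lower < upper


# unofficial-udify 0.3.0 requires transformers>=2.3.0,<3.0.0
udify = ("2.3.0", "3.0.0")

# transformers ranges required by each allennlp release listed in the error
allennlp = {
    "1.3.0": ("4.0", "4.1"),
    "1.2.2": ("3.4", "3.6"),
    "1.2.1": ("3.1", "3.5"),
    "1.2.0": ("3.1", "3.5"),
    "1.1.0": ("3.0", "3.1"),
}

compatible = [v for v, rng in allennlp.items() if ranges_overlap(udify, rng)]
print(compatible)  # prints [] because every allennlp range starts at 3.0 or later
```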

Is this a known issue? Could you suggest a workaround please?
bug

opened by ranjita-naik 3
Missing tag information

I noticed that the spaCy tag field is empty. Is this a known issue? It looks like Udify supports some level of UFeats tagging (https://universaldependencies.org/u/feat/index.html). I wonder whether I'm supposed to be getting any of this in spaCy and have a bug in my setup, or whether it just isn't implemented yet. Would it be stored in token.tag as I'm thinking (if it does exist)?

I also noticed that displacy doesn't render the POS info. I am wondering if that is related?

BTW, just have to say that this is awesome.

opened by tslater 3
ImportError: cannot import name 'load_udify' from 'camphr.pipelines' following the example
I followed the example here: https://camphr.readthedocs.io/en/latest/notes/udify.html

I only saw the 0.7.0 model, so I went with that instead. The German and English examples work great, but the Japanese one gives me this error:

>>> from camphr.pipelines import load_udify
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'load_udify' from 'camphr.pipelines' (/home/tyler/camphr/env/lib/python3.8/site-packages/camphr/pipelines/__init__.py)
opened by tslater 3
doc.ents empty, doc.is_nered == False

I followed the documentation to fine-tune the bert-base-cased (en) NER model and then made a spaCy doc with the text "Bob Jones and Barack Obama went up the hill in Wisconsin.", but the resulting doc has doc.ents = () and doc.is_nered = False.

Am I missing something?

Thank you!

opened by jack-rory-staunton 3
Improvement for サ変 of KNP

Inside _get_child_dep(c), the pos for 名詞,サ変名詞 (noun, sa-irregular verbal noun) is changed to VERB when it is followed by AUX. So I think that _get_dep(tag[0]) should be called after _get_child_dep(c).

opened by KoichiYasuoka 3
Bump transformers from 3.0.2 to 4.1.1
Bumps transformers from 3.0.2 to 4.1.1.

Release notes

Sourced from transformers's releases.

Patch release: better error message & invalid trainer attribute

This patch release introduces:

A better error message when trying to instantiate a SentencePiece-based tokenizer without having SentencePiece installed. #8881

Fixes an incorrect attribute in the trainer. #8996

Transformers v4.0.0: Fast tokenizers, model outputs, file reorganization

Transformers v4.0.0-rc-1: Fast tokenizers, model outputs, file reorganization

Breaking changes since v3.x

Version v4.0.0 introduces several breaking changes that were necessary.

1. AutoTokenizers and pipelines now use fast (rust) tokenizers by default.

The python and rust tokenizers have roughly the same API, but the rust tokenizers have a more complete feature set. The main breaking change is the handling of overflowing tokens between the python and rust tokenizers.

How to obtain the same behavior as v3.x in v4.x

The pipelines now contain additional features out of the box. See the token-classification pipeline with the grouped_entities flag.

The auto-tokenizers now return rust tokenizers. In order to obtain the python tokenizers instead, the user may use the use_fast flag by setting it to False:

In version v3.x:

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("xxx")

to obtain the same in version v4.x:

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("xxx", use_fast=False)

2. SentencePiece is removed from the required dependencies

The requirement on the SentencePiece dependency has been lifted from the setup.py. This is done so that we may have a channel on anaconda cloud without relying on conda-forge. This means that the tokenizers that depend on the SentencePiece library will not be available with a standard transformers installation.

This includes the slow versions of:

XLNetTokenizer

AlbertTokenizer

CamembertTokenizer

MBartTokenizer

PegasusTokenizer

T5Tokenizer

ReformerTokenizer

XLMRobertaTokenizer
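
One way to see whether an environment is affected is to check for the `sentencepiece` package before constructing one of these tokenizers. A minimal sketch using only the standard library; the helper name `has_module` is hypothetical, and the messages assume that installing `sentencepiece` alongside transformers is what makes the slow tokenizers available again, which should be verified against the upstream release notes.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


if has_module("sentencepiece"):
    print("sentencepiece found: SentencePiece-based slow tokenizers should load")
else:
    print("sentencepiece missing: install it before using e.g. T5Tokenizer")
```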

How to obtain the same behavior as v3.x in v4.x

Commits

bfa4ccf Release: v4.1.1

e0790cc Fix TAPAS doc

6d2e864 Put all models in the constants (#9170)

f83d9c8 v4.1.0 docs

f5438ab Release: v4.1.0

ac2c7e3 Remove erroneous character

77d6941 Fix gradient clipping for Sharded DDP (#9168)

1aca3d6 Add disclaimer to TAPAS rst file (#9167)

dc9f245 Torch scatter with torch 1.7.0

9a67185 Experimental support for fairscale ShardedDDP (#9139)

Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR

@dependabot recreate will recreate this PR, overwriting any edits that have been made to it

@dependabot merge will merge this PR after your CI passes on it

@dependabot squash and merge will squash and merge this PR after your CI passes on it

@dependabot cancel merge will cancel a previously requested merge and block automerging

@dependabot reopen will reopen this PR if it is closed

@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot use these labels will set the current labels as the default for future PRs for this repo and language

@dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language

@dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language

@dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

@dependabot badge me will comment on this PR with code to add a "Dependabot enabled" badge to your readme

Additionally, you can set the following in your Dependabot dashboard:

Update frequency (including time of day and day of week)

Pull request limits (per update run and/or open at any time)

Out-of-range updates (receive only lockfile updates, if desired)

Security updates (receive only security updates, if desired)

dependencies
opened by dependabot-preview[bot] 2
Bump certifi from 2021.5.30 to 2022.12.7 in /packages/camphr_pattern_search
Bumps certifi from 2021.5.30 to 2022.12.7.

Commits

9e9e840 2022.12.07

b81bdb2 2022.09.24

939a28f 2022.09.14

aca828a 2022.06.15.2

de0eae1 Only use importlib.resources's new files() / Traversable API on Python ≥3.11 ...

b8eb5e9 2022.06.15.1

47fb7ab Fix deprecation warning on Python 3.11 (#199)

b0b48e0 fixes #198 -- update link in license

9d514b4 2022.06.15

4151e88 Add py.typed to MANIFEST.in to package in sdist (#196)

Additional commits viewable in compare view


dependencies
opened by dependabot[bot] 0
Bump numpy from 1.21.0 to 1.22.0 in /packages/camphr_pattern_search
Bumps numpy from 1.21.0 to 1.22.0.

Release notes

Sourced from numpy's releases.

v1.22.0

NumPy 1.22.0 Release Notes

NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.

A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across applications such as CuPy and JAX.

NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.

New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.

A new configurable allocator for use by downstream projects.

These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

Expired deprecations

Deprecated numeric style dtype strings have been removed

Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

(gh-19539)

Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

(gh-19615)

... (truncated)

Commits

4adc87d Merge pull request #20685 from charris/prepare-for-1.22.0-release

fd66547 REL: Prepare for the NumPy 1.22.0 release.

125304b wip

c283859 Merge pull request #20682 from charris/backport-20416

5399c03 Merge pull request #20681 from charris/backport-20954

f9c45f8 Merge pull request #20680 from charris/backport-20663

794b36f Update armccompiler.py

d93b14e Update test_public_api.py

7662c07 Update init.py

311ab52 Update armccompiler.py

Additional commits viewable in compare view


dependencies
opened by dependabot[bot] 0

Releases(0.7.0)

0.7.0(Aug 21, 2020)
[dependencies] Bump pyknp from 0.4.4 to 0.4.5 #80

[dependencies] Bump spacy from 2.2.4 to 2.3.2 #81

[dependencies] Bump torch from 1.5.1 to 1.6.0 #82

[closed] move allennlp to camphr_allennlp #79

[dependencies] Bump hypothesis from 5.23.11 to 5.23.12 #73

[dependencies] Bump pytest from 5.4.3 to 6.0.1 #66

[closed] fix get_doc_char_span and covering span #78

[closed] fix index error #77

[closed] add lemma search to PatternSearch #76

[dependencies] Bump pytextspan from 0.2.2 to 0.3.0 #74

[closed] improve beamsearch performance for k ==1 #75

[closed] use pyknp #71

[closed] add normalizer to pattern search #70

[closed] Pattern searcher becomes able to search with lemma and lower #65

[closed] 形容詞接頭辞 into PART #63

[closed] fix deps #62

0.6.0(Jul 9, 2020)
[dependencies] Bump scikit-learn from 0.22.2.post1 to 0.23.1 #61

[dependencies] Bump pytest from 5.3.2 to 5.4.3 #60

[closed] support allennlp v1 #59

[closed] Improvement for サ変 of KNP #56

[closed] refactor #55

0.5.22(Apr 24, 2020)
[bug] fix transformers eval batchsize failure #50

0.5.21(Apr 22, 2020)
[bug] Proper treatment of PUNCTs for KNP #48

0.5.20(Apr 14, 2020)
[enhancement] dependency improvement for KNP #47

Thanks for contributing, @KoichiYasuoka!
0.5.19(Apr 13, 2020)
[enhancement] update transformers dependency #46

[CI] Skip slow ci if unnecessary #45

[enhancement] Refactor/knp dependency parser #44

[enhancement] Tentative dependencies for KNP #43

Thanks for contributing, @KoichiYasuoka!
0.5.18(Apr 10, 2020)
[enhancement] juman TAG_MAP tentative support #41

[bug] Fix misuse Vocab() in Language instantiation #42

0.5.17(Apr 9, 2020)
[enhancement] Revert sentencepiece lang from v0.4 #40

0.5.16(Apr 9, 2020)
[enhancement] add functools.lru_cache to knp extensions. #39

0.5.15.dev0(Apr 8, 2020)

0.5.15(Apr 8, 2020)

No changelog for this release.
0.5.14(Apr 8, 2020)
[enhancement] tag and bunsetsu can be directly got from token #38

[enhancement] Feature/knp para noun chunks #37

[bug] fix noun chunker for para phrase #36

[enhancement][refactor] Refactor/knp noun chunker #35

0.5.13(Apr 6, 2020)
Bug fix

Separate parallel clause in noun chunks into two or more chunks #34

0.5.12(Apr 6, 2020)
New Features

Support knp noun chunker and knp dependency parser #33

0.5.11(Mar 27, 2020)
New features

It is now possible to retrieve KNP result from spacy.doc (#31)

0.5.10(Mar 18, 2020)

Removed the version restriction python<3.8. This allows users to install camphr with Python 3.8, but installation will fail for macOS users. See #29 for details.
0.5.9(Mar 3, 2020)
Improvements

juman and knp now accept longer text (#23)

0.5.8(Mar 3, 2020)
Bug fix

fix transformers requirements (#24)

0.5.7(Feb 21, 2020)
Bug fix

fix camphr.utils.get_requirements_line

0.5.5(Feb 21, 2020)
New features

Multi labels textcat pipe for transformers (#14)

0.5.3(Feb 17, 2020)
New Features

Computing val loss in TorchLanguage.evaluate #13


Owner

GitHub Repository https://camphr.readthedocs.io/en/latest/

