Quantifiers and Negations in RE Documents

Last update: Feb 01, 2022

Overview

Quantifiers-and-Negations-in-RE-Documents

This project was part of my work for a seminar at the Technical University of Munich (TUM) during my bachelor studies in 2019. The Python project finds quantifiers and negations in documents and searches for problematic findings, that is, sentences that combine quantifiers and negations in a way that allows multiple valid interpretations. For example, "All lights are not on" can mean that no light is on or that not every light is on. The project extracts such sentences and reports them.

Motivation:

Ambiguous sentences should be avoided because they can cause problems that are hard to find and possibly hard to fix. This is especially true for technical specifications and similar documents. In this project we compare two different approaches to finding ambiguous sentences:

String-based search
NLP-based search

We want to find out whether NLP gives better results than standard string-based search methods, and whether the improvement justifies its computational overhead.
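
As a rough illustration of the string-based approach, the sketch below flags every sentence that contains at least one quantifier and at least one negation. The word lists, the naive sentence splitter, and the function names are assumptions made for this example; they are not taken from the project's code.

```python
import re

# Illustrative word lists (assumption): the project's actual lists are not
# given in this README.
QUANTIFIERS = {"all", "every", "each", "any", "some"}
NEGATIONS = {"not", "no", "never", "none"}


def split_sentences(text):
    """Naively split text at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_candidates(text):
    """Return sentences containing at least one quantifier and one negation."""
    hits = []
    for sentence in split_sentences(text):
        # Lowercase word tokens; keep contractions such as "doesn't" whole.
        tokens = set(re.findall(r"[a-z]+(?:'t)?", sentence.lower()))
        has_negation = bool(tokens & NEGATIONS) or any(
            token.endswith("n't") for token in tokens
        )
        if tokens & QUANTIFIERS and has_negation:
            hits.append(sentence)
    return hits


if __name__ == "__main__":
    sample = (
        "All users must log in. All lights are not on. "
        "The system shall never crash."
    )
    print(find_candidates(sample))  # ['All lights are not on.']
```

A search like this is cheap but only checks for co-occurrence. It cannot tell whether the negation actually interacts with the quantifier, which is the kind of information a parser-based approach could add.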

Features:

Detect quantifiers and negations in .xml or .txt documents
Search either by a string-based search or by an NLP-based search (using Stanford's CoreNLP library [1])
Extract possibly ambiguous sentences
Compare string search results with NLP search results
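
One simple way to compare the two result sets is to treat each as a set of sentences and look at the overlap and the differences. The function name and the shape of the returned dictionary below are hypothetical; the README does not describe how the project performs the comparison.

```python
def compare_findings(string_hits, nlp_hits):
    """Group flagged sentences by which search method(s) reported them."""
    string_set, nlp_set = set(string_hits), set(nlp_hits)
    return {
        "both": sorted(string_set & nlp_set),
        "string_only": sorted(string_set - nlp_set),
        "nlp_only": sorted(nlp_set - string_set),
    }


if __name__ == "__main__":
    # Hand-written placeholder inputs, not real project output.
    report = compare_findings(
        ["All lights are not on.", "No user sees all pages."],
        ["All lights are not on."],
    )
    for group, sentences in report.items():
        print(group, sentences)
```

Sentences reported by only one method are the interesting cases to inspect by hand when judging whether the NLP overhead pays off.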

Prerequisites:

Java 8 or higher
Python 3.6 or higher as project interpreter
Stanford CoreNLP library: https://stanfordnlp.github.io/CoreNLP/download.html
Environment variable "CORENLP_HOME" set to where the CoreNLP library is stored
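
The prerequisites above can be checked before running the project with a short script such as the following sketch. The function name is hypothetical, and the check only confirms that a java executable is on the PATH; it does not verify that the Java version is 8 or higher.

```python
import os
import shutil
import sys


def check_prerequisites(environ=None):
    """Return a list of human-readable problems; an empty list means all good."""
    environ = os.environ if environ is None else environ
    problems = []

    if sys.version_info < (3, 6):
        problems.append("Python 3.6 or higher is required")

    # Only checks that Java is installed, not which version it is.
    if shutil.which("java") is None:
        problems.append("no 'java' executable found on PATH")

    corenlp_home = environ.get("CORENLP_HOME")
    if not corenlp_home:
        problems.append("CORENLP_HOME is not set")
    elif not os.path.isdir(corenlp_home):
        problems.append("CORENLP_HOME does not point to a directory")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```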

References:

[1] Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. The Stanford CoreNLP natural language processing toolkit. In Association for Computational Linguistics (ACL) System Demonstrations, pages 55–60, 2014.


Owner

Nicolas Ruscher
