AMUSE

AMUSE - financial summarization

First, unzip data.zip.

Train a new model:
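Where no unzip tool is available, the same step could be done from Python. This is a minimal sketch: it assumes data.zip sits in the current directory and that extracting it there yields the data/ tree used by the commands in this README. The function name extract_data is hypothetical and not part of the repository.

```python
import zipfile
from pathlib import Path


def extract_data(archive="data.zip", dest="."):
    """Extract the archive into dest and return its top-level entry names."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        names = zf.namelist()
    # Collect the top-level entries so the resulting layout can be checked by eye.
    return sorted({Path(name).parts[0] for name in names if name})


# Usage (assumes data.zip is in the current directory):
#   print(extract_data())
```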

python FinAnalyze.py --task train --start 0 --count --modelpath data/models/new_model.h5 --train data/train --gold data/gold

data/train = directory containing the text files
data/gold = directory containing the gold summaries

This trains a new AMUSE prediction model on the given files and stores it in an .h5 file.
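Before training, it can help to confirm that both input directories exist and are not empty. The sketch below only counts regular files sitting directly inside each directory. It makes no assumption about how text files are paired with gold summaries, since that naming convention is not documented here. The helper names count_files and check_training_layout are hypothetical.

```python
from pathlib import Path


def count_files(directory):
    """Return the number of regular files directly inside directory."""
    path = Path(directory)
    if not path.is_dir():
        raise FileNotFoundError(f"missing directory: {path}")
    return sum(1 for entry in path.iterdir() if entry.is_file())


def check_training_layout(train_dir="data/train", gold_dir="data/gold"):
    """Return file counts for both directories, failing early if either is empty."""
    counts = {"train": count_files(train_dir), "gold": count_files(gold_dir)}
    for name, n in counts.items():
        if n == 0:
            raise ValueError(f"{name} directory contains no files")
    return counts
```

Calling check_training_layout() with no arguments checks the default paths used in the training command.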

Generate summaries with an existing model:

python FinAnalyze.py --task generate-summaries --start 0 --count --modelpath data/models/new_model.h5 --test data/test/ --summarydir data/summaries
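The training and generation commands share most of their flags, so a script that runs them repeatedly could assemble the argument list in one place. The sketch below uses only flags that appear in this README. The commands as written give --count no value: passing count=None reproduces that exactly, while passing an integer assumes --count accepts a number of files, which is not confirmed here. The function name build_command is hypothetical.

```python
import sys


def build_command(task, modelpath, start=0, count=None, **dirs):
    """Build an argv list for FinAnalyze.py.

    dirs maps flag names without dashes (e.g. test, summarydir) to paths.
    """
    cmd = [sys.executable, "FinAnalyze.py", "--task", task,
           "--start", str(start), "--count"]
    if count is not None:
        # Assumption: --count takes an integer value.
        cmd.append(str(count))
    cmd += ["--modelpath", modelpath]
    for flag, path in dirs.items():
        cmd += [f"--{flag}", path]
    return cmd


generate = build_command(
    "generate-summaries",
    "data/models/new_model.h5",
    test="data/test/",
    summarydir="data/summaries",
)
# To run it: subprocess.run(generate, check=True)
```

The training command can be built the same way with task "train" and the keyword arguments train="data/train" and gold="data/gold".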

A pretrained model, trained on 3000 files, is also stored under the name model.training.muse.3000.all.h5.

If you use this code, please cite:

Litvak M, Vanetik N. Summarization of financial reports with AMUSE. In Proceedings of the 3rd Financial Narrative Processing Workshop 2021 (pp. 31-36).

@inproceedings{litvak2021summarization,
  title={Summarization of financial reports with AMUSE},
  author={Litvak, Marina and Vanetik, Natalia},
  booktitle={Proceedings of the 3rd Financial Narrative Processing Workshop},
  pages={31--36},
  year={2021}
}
