Automatic voice-synthesised summaries of the latest research papers on arXiv

Last update: Dec 20, 2022

PaperWhisperer

PaperWhisperer is a Python application that keeps you up to date with research papers. It retrieves the latest articles on a topic from arXiv by performing a keyword-based search. It then creates vocal summaries of the articles using Text-To-Speech and saves them to disk.

Installation

To install the package, move to the root of the repo and type in the console:

$ pip install .

If you plan to develop the package further, install it in editable mode together with the extra packages needed to run the unit tests:

$ pip install -e .[test]

Testing

To run the unit tests, issue the following command from the root of the repo:

$ pytest

Package structure

The package is divided into 2 sub-packages:

retrieval
tts

retrieval contains the data structures and facilities needed to retrieve articles from arXiv. Under the hood, the app uses arxiv, a Python package that wraps the free arXiv API.
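To give a feel for what a keyword-based search against the arXiv API looks like, here is a minimal sketch that only builds the query URL and does not send any request. It is not PaperWhisperer's code: the app goes through the arxiv package, and the function name, the phrase-style `all:` query, and the newest-first sorting are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Public endpoint of the arXiv API.
ARXIV_API_URL = "http://export.arxiv.org/api/query"


def build_arxiv_query_url(keywords: str, max_results: int) -> str:
    """Build an arXiv API URL for a keyword search (hypothetical helper)."""
    params = {
        # Assumption: search all fields and treat the keywords as one phrase.
        "search_query": f'all:"{keywords}"',
        "start": 0,
        "max_results": max_results,
        # Assumption: newest submissions first, since the app wants recent papers.
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_arxiv_query_url("generate chord progressions", 100))
```

The arxiv package hides this URL handling and also parses the Atom feed the API returns, which is presumably why the app relies on it rather than on raw HTTP calls.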

tts has facilities to generate speech renditions of text-based article summaries. The summary of an article consists of its title, authors, and abstract. Speech synthesis is performed using Google Cloud Text-To-Speech.
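A summary, as described above, combines an article's title, authors, and abstract. The sketch below shows one way such a text could be assembled before it is handed to a speech engine. The `Article` class, the `build_summary_text` function, and the exact wording of the output are hypothetical and may differ from what the package does.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Article:
    """Hypothetical container holding only the fields a summary needs."""
    title: str
    authors: List[str]
    abstract: str


def build_summary_text(article: Article) -> str:
    """Join title, authors, and abstract into a single text to be spoken."""
    # Abstracts often contain hard line breaks; collapse all whitespace runs.
    abstract = " ".join(article.abstract.split())
    authors = ", ".join(article.authors)
    return f"{article.title}. By {authors}. {abstract}"


if __name__ == "__main__":
    demo = Article(
        title="A Title",
        authors=["Ann Lee", "Bo Chen"],
        abstract="First line\nsecond line.",
    )
    print(build_summary_text(demo))
```

Keeping the text assembly separate from the speech call makes it easy to check what will be read aloud without contacting the Text-To-Speech service.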

Setting up Google Cloud Text-To-Speech

PaperWhisperer uses Google Cloud Text-To-Speech to synthesise speech.

To use this service, you should:

create an account on Google Cloud,
create a Cloud Platform project,
enable the Text-To-Speech API in the project,
set up authentication,
download a JSON private key.

More info on how to set up Google Cloud Text-To-Speech

Environment variables

The app uses an environment variable called GOOGLE_APPLICATION_CREDENTIALS to connect to Google Cloud Text-To-Speech safely.

In config.yml, set GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON private key you downloaded while setting up the Google service.

Without this step, you won't be able to connect to Google Cloud Text-To-Speech, and the app will throw an error.
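If you want to catch a missing or wrong key path before any speech request is made, a small check like the following can help. It is a sketch and not part of the package: the function name `check_google_credentials` is hypothetical, and it only inspects the environment variable and the file system without contacting Google Cloud.

```python
import os
from pathlib import Path


def check_google_credentials(env=os.environ) -> Path:
    """Return the key path held in GOOGLE_APPLICATION_CREDENTIALS, or raise."""
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    key_path = Path(value)
    # The variable must point to an existing file (the JSON private key).
    if not key_path.is_file():
        raise FileNotFoundError(f"No key file found at {key_path}")
    return key_path


if __name__ == "__main__":
    try:
        print(f"Using credentials at {check_google_credentials()}")
    except (RuntimeError, FileNotFoundError) as error:
        print(f"Credentials problem: {error}")
```

The check does not validate the contents of the key, so a malformed or revoked key would still fail later, when the service is actually called.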

How to create summaries

To create summaries for a keyword search, use the create_summaries entry point. This is the only console script of the package and the main entry point of the application.

Below is an example of how you can run the script:

$ create_summaries "generate chord progressions" 100 /save/dir 40

The script takes 4 positional arguments:

keywords used to search for articles (several keywords can be passed as one quoted string)
maximum number of articles to retrieve
directory in which to store the vocal summaries
maximum age of the articles to retrieve, in days (an integer)
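The four positional arguments listed above could be parsed and applied roughly as follows. This is an illustrative sketch built on argparse, not the script's actual source: the argument names (`keywords`, `max_articles`, `save_dir`, `max_age_days`) and the helper `is_recent_enough` are assumptions.

```python
import argparse
from datetime import datetime, timedelta, timezone


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the four positional arguments (names are hypothetical)."""
    parser = argparse.ArgumentParser(prog="create_summaries")
    parser.add_argument("keywords", help="keywords used to search for articles")
    parser.add_argument("max_articles", type=int,
                        help="maximum number of articles to retrieve")
    parser.add_argument("save_dir",
                        help="directory in which to store the vocal summaries")
    parser.add_argument("max_age_days", type=int,
                        help="maximum age of the articles, in days")
    return parser.parse_args(argv)


def is_recent_enough(published: datetime, max_age_days: int, now=None) -> bool:
    """Return True if the article is at most max_age_days old."""
    now = now or datetime.now(timezone.utc)
    return now - published <= timedelta(days=max_age_days)


if __name__ == "__main__":
    args = parse_args(["generate chord progressions", "100", "/save/dir", "40"])
    published = datetime.now(timezone.utc) - timedelta(days=10)
    print(args.keywords, args.max_articles, args.save_dir, args.max_age_days)
    print(is_recent_enough(published, args.max_age_days))
```

Declaring the two numeric arguments with `type=int` lets argparse reject a non-integer value up front, before any article is retrieved.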

Dependencies

PaperWhisperer depends on the following packages:

arxiv==1.2.0
google-cloud-texttospeech
python-dotenv

YouTube video

Learn more about PaperWhisperer in this project presentation video on The Sound of AI YouTube channel.

Owner

Valerio Velardo
