Fast and Simple Neural Vocoder, the Multiband RNNMS

Last update: Jan 11, 2022

Overview

Multiband RNN_MS

Fast and Simple vocoder, Multiband RNN_MS.

Demo
Quick Training
How to Use
System Details
Results
References

Demo

ToDo: Link super great impressive high-quality audio demo.

Quick Training

Jump to ☞ , then Run. That's all!

How to Use

1. Install

# pip install "torch==1.10.0" -q      # Based on your environment (validated with v1.10)
# pip install "torchaudio==0.10.0" -q # Based on your environment
pip install git+https://github.com/tarepan/MultibandRNNMS

2. Data & Preprocessing

"Batteries Included".
RNNMS transparently downloads the corpus and preprocesses it for you 😉

3. Train

python -m mbrnnms.main_train

For arguments, check ./mbrnnms/config.py

Advanced: Other datasets

You can switch datasets with arguments.
All of speechcorpusy's preset corpora are supported.

# LJSpeech corpus
python -m mbrnnms.main_train data.data_name=LJ
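The override above uses a dotted key (`data.data_name`) to address a nested configuration field. How the package parses such arguments is not described here, so the following is only a minimal illustration of the idea in plain Python. The function `apply_override` and the default value are hypothetical and are not taken from ./mbrnnms/config.py.

```python
import copy


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of `config` with one `dotted.key=value` override applied.

    Hypothetical helper for illustration; not the package's own parser.
    """
    dotted_key, _, value = override.partition("=")
    keys = dotted_key.split(".")
    updated = copy.deepcopy(config)  # leave the caller's defaults untouched
    node = updated
    for key in keys[:-1]:
        # Walk down (or create) the intermediate sections, e.g. "data"
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return updated


# Hypothetical default; the real defaults live in ./mbrnnms/config.py
defaults = {"data": {"data_name": "PLACEHOLDER"}}
print(apply_override(defaults, "data.data_name=LJ"))
```

Returning a modified copy rather than editing the defaults in place keeps the default configuration reusable across runs.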

Advanced: Custom dataset

Copy mbrnnms.main_train and replace the DataModule with your own.

    # datamodule = LJSpeechDataModule(batch_size, ...)
    datamodule = YourSuperCoolDataModule(batch_size, ...)
    # That's all!

System Details

Model

PreNet: GRU
Upsampler: time-directional nearest interpolation
Decoder: Embedding-auto-regressive generative RNN with 10-bit μ-law encoding
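Two of the components listed above can be sketched in a few lines of plain Python: nearest-neighbor upsampling along time, and 10-bit μ-law quantization, which turns each waveform sample into one of 1024 class indices for the decoder to predict. This is a minimal sketch using the standard μ-law companding formula, not the repository's code. The function names and the upsampling factor are assumptions made for illustration.

```python
import math

MU_BITS = 10               # bit depth stated in the model description
N_CLASSES = 2 ** MU_BITS   # 1024 discrete sample classes
MU = N_CLASSES - 1         # mu = 1023


def mulaw_encode(x: float) -> int:
    """Compress a sample in [-1, 1] and quantize it to a class index in [0, MU]."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range samples
    compressed = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((compressed + 1.0) / 2.0 * MU))


def mulaw_decode(index: int) -> float:
    """Map a class index in [0, MU] back to an approximate sample in [-1, 1]."""
    compressed = 2.0 * index / MU - 1.0
    return math.copysign(math.expm1(abs(compressed) * math.log1p(MU)) / MU, compressed)


def upsample_nearest(frames: list, factor: int) -> list:
    """Repeat each conditioning frame `factor` times along the time axis."""
    return [frame for frame in frames for _ in range(factor)]


# Two hypothetical feature frames stretched by a hypothetical factor of 3
print(len(upsample_nearest([[0.1, 0.2], [0.3, 0.4]], 3)))
# Quantize a sample and reconstruct it; the round trip is lossy but close
print(mulaw_encode(0.25), mulaw_decode(mulaw_encode(0.25)))
```

μ-law companding spends more of the 1024 levels on small amplitudes, so quiet passages keep finer resolution than a uniform 10-bit quantizer would give them.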

Results

Output Sample

Demo

Performance

X [iter/sec] @ NVIDIA T4 on Google Colaboratory (AMP+, num_workers=8)

Full training takes about Y days.

References

Acknowledgements

: Basic vocoder concept came from this paper.
bshall/UniversalVocoding: Model and hyperparameters are derived from this repository. All code is rewritten.

Owner

tarepan
