Unofficial PyTorch Implementation of Multi-Singer

Last update: Dec 28, 2022

Overview

Multi-Singer

Unofficial PyTorch Implementation of Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus.

Requirements

See requirements in requirement.txt:

linux
python 3.6
pytorch 1.0+
librosa
json, tqdm, logging

TODO

1026: upload code
1024: implement multi-singer & perceptual loss
1023: implement singer encoder

Getting started

Apply recipe to your own dataset

Put any wav files in data directory
Edit configuration in config/config.yaml
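Before running the recipe, it can help to confirm that the data directory really contains the audio you expect. The sketch below is not part of this repository: the helper name `find_wav_files` is hypothetical, and searching subfolders recursively is an assumption about how you lay out your data.

```python
from pathlib import Path


def find_wav_files(data_dir):
    """Return all .wav files under data_dir, sorted for reproducibility.

    Hypothetical helper: the recursive search and the case-insensitive
    extension match are assumptions, not behaviour of preprocess.py.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"data directory not found: {root}")
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".wav"
    )
```

Calling `find_wav_files("data")` and printing the length of the result gives a quick count of the files the later steps would see.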

1. Pretrain

Pretrain the Singer Embedding Extractor using the linked repository, then set 'enc_model_fpath' in config/config.yaml to point to the pretrained model.

Note: Set the parameters to match those in 'encoder/params_data' and 'encoder/params_model'.

2. Preprocess

Extract mel-spectrogram

python preprocess.py -i data/wavs -o data/feature -c config/config.yaml

-i your audio folder

-o output acoustic feature folder

-c config file
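For readers unfamiliar with the feature being extracted, here is a minimal NumPy sketch of a log-mel-spectrogram. It is an illustration only and is not the code in preprocess.py: the function names and the default values (24 kHz sample rate, FFT size 1024, hop 256, 80 mel bins, HTK-style mel scale) are assumptions, and the real settings live in config/config.yaml.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style mel scale (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=np.float64) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=np.float64) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def log_mel_spectrogram(wav, sr=24000, n_fft=1024, hop=256, n_mels=80):
    """Return an array of shape (n_frames, n_mels) with log10 mel magnitudes."""
    wav = np.asarray(wav, dtype=np.float64)
    if len(wav) < n_fft:
        # Zero-pad very short signals so at least one frame exists.
        wav = np.pad(wav, (0, n_fft - len(wav)))
    n_frames = 1 + (len(wav) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [wav[t * hop: t * hop + n_fft] * window for t in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    mel = magnitude @ mel_filterbank(sr, n_fft, n_mels).T
    # Floor before the log to avoid -inf on silent frames.
    return np.log10(np.maximum(mel, 1e-10))
```

The hop size fixes the ratio between waveform samples and mel frames, which is why the same value has to be used consistently in preprocessing, training and inference.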

3. Train

Training conditioned on mel-spectrogram

python train.py -i data/feature -o checkpoints/ --config config/config.yaml

-i acoustic feature folder

-o directory to save checkpoints

-c, --config config file
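Vocoders conditioned on mel-spectrograms are commonly trained on short, randomly chosen segments in which the mel frames and the waveform samples stay aligned. The sketch below shows that alignment step only. It is an assumption about a typical setup rather than a description of train.py, and `random_aligned_crop`, the hop size of 256 and the segment length of 32 frames are illustrative choices.

```python
import numpy as np


def random_aligned_crop(wav, mel, hop=256, n_frames=32, rng=None):
    """Pick a random window of mel frames and the matching audio samples.

    wav: 1-D array with at least mel.shape[0] * hop samples
    mel: array of shape (total_frames, n_mels)
    Returns (wav_segment, mel_segment) where the audio segment has
    exactly n_frames * hop samples.
    """
    rng = np.random.default_rng() if rng is None else rng
    total_frames = mel.shape[0]
    if total_frames < n_frames:
        raise ValueError("utterance is shorter than the requested segment")
    if len(wav) < total_frames * hop:
        raise ValueError("waveform is shorter than mel length times hop")
    start = int(rng.integers(0, total_frames - n_frames + 1))
    mel_segment = mel[start:start + n_frames]
    wav_segment = wav[start * hop:(start + n_frames) * hop]
    return wav_segment, mel_segment
```

Passing a seeded generator, for example `np.random.default_rng(0)`, makes the crops reproducible when debugging.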

4. Inference

python inference.py -i data/feature -o outputs/ -c checkpoints/*.pkl -g config/config.yaml

-i acoustic feature folder

-o directory to save generated speech

-c checkpoint file

-g config file
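If you want to post-process or re-save generated waveforms yourself, the following sketch writes a floating-point waveform to a 16-bit mono WAV file using only the standard library and NumPy. It is not taken from inference.py: the helper name `save_wav` and the 24 kHz default sample rate are assumptions, so use the rate from your own config.

```python
import wave

import numpy as np


def save_wav(path, samples, sr=24000):
    """Write samples in [-1, 1] to a 16-bit PCM mono WAV file."""
    clipped = np.clip(np.asarray(samples, dtype=np.float64), -1.0, 1.0)
    pcm = (clipped * 32767.0).astype("<i2")  # little-endian int16
    with wave.open(str(path), "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # bytes per sample
        f.setframerate(sr)
        f.writeframes(pcm.tobytes())
```

Clipping before the integer conversion avoids wrap-around artefacts when the generator produces values slightly outside the [-1, 1] range.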

5. Singing Voice Synthesis

For singing voice synthesis:

Use a modified FastSpeech to synthesize the mel-spectrogram.
Feed the synthesized mel-spectrogram to Multi-Singer for waveform synthesis.
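The two stages can be pictured as a simple composition of an acoustic model and a vocoder. The sketch below uses placeholder callables only: `synthesize_singing`, `dummy_acoustic_model` and `dummy_vocoder` are hypothetical stand-ins, not the FastSpeech or Multi-Singer APIs, and the 80 mel bins and hop of 256 are assumed values.

```python
from typing import Callable, List, Sequence

Mel = List[List[float]]  # frames x mel bins


def synthesize_singing(
    score: Sequence[str],
    acoustic_model: Callable[[Sequence[str]], Mel],
    vocoder: Callable[[Mel], List[float]],
    n_mels: int = 80,
) -> List[float]:
    """Stage 1: score -> mel-spectrogram. Stage 2: mel -> waveform."""
    mel = acoustic_model(score)
    if any(len(frame) != n_mels for frame in mel):
        raise ValueError("acoustic model output does not match the vocoder's mel size")
    return vocoder(mel)


def dummy_acoustic_model(score: Sequence[str]) -> Mel:
    # Stand-in: emits 4 silent frames per input token.
    return [[0.0] * 80 for _ in score for _ in range(4)]


def dummy_vocoder(mel: Mel, hop: int = 256) -> List[float]:
    # Stand-in: emits `hop` silent samples per mel frame.
    return [0.0] * (len(mel) * hop)
```

The shape check matters in practice because the vocoder only works on mel-spectrograms produced with the same feature settings it was trained on.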

Acknowledgements

Citation

Please cite this repository using the "Cite this repository" button in the About section (top right of the main page).

Question

Feel free to contact me at [email protected]

Owner

SunMail-hub