The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

Last update: Jan 03, 2023

Overview

WSRGlow

The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution. Audio samples can be found here.

Feel free to create issues or send an email to [email protected] if you have problems running the code.

Before running the code, install the dependencies with pip install -r requirements.txt.

The configuration for the model architecture and training scheme is saved in config.yaml. You can override individual attributes by adding the --hparams flag when running a command. The general way to run a Python script is

python $SRC$ --config $CONFIG$ --hparams $KEY1$=$VALUE1$,$KEY2$=$VALUE2$,...

See hparams.py for more details.
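As a rough illustration of how a KEY=VALUE list such as the one passed to --hparams could be merged into a loaded config, here is a minimal sketch. It is not the code in hparams.py: the function name parse_hparams_overrides, the lr key and the base dictionary are hypothetical, and the type guessing via ast.literal_eval is an assumption about how values might be interpreted.

```python
import ast


def parse_hparams_overrides(spec):
    """Turn 'k1=v1,k2=v2' into a dict (hypothetical helper, not from hparams.py)."""
    overrides = {}
    for item in spec.split(","):
        if not item.strip():
            continue  # tolerate a trailing comma
        key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        try:
            # Numbers, lists and quoted strings are parsed as Python literals
            value = ast.literal_eval(raw.strip())
        except (ValueError, SyntaxError):
            value = raw.strip()  # anything else is kept as a plain string
        overrides[key.strip()] = value
    return overrides


# Stand-in for values loaded from config.yaml; 'lr' is an illustrative key
base = {"raw_data_path": "data/raw", "lr": 0.001}
merged = {**base, **parse_hparams_overrides("lr=0.0005,raw_data_path=data/other")}
```

Values given on the command line take precedence because they are unpacked last when building merged.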

To prepare data

Before training, you need to binarize the data. Put the raw wav files in hparams['raw_data_path']; the binarized data will be written to hparams['binary_data_path'].

Specifically, for the VCTK corpus, the file structure should look like this:

.
|--data
    |--raw
        |--VCTK-Corpus
            |--wav48
                |--$WAVS
|--checkpoints
    |--wsrglow

where the model checkpoints are in checkpoints/wsrglow.

The command to binarize is

python binarizer.py --config config.yaml
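Before running the binarizer, it can help to confirm that the raw files sit where the layout above expects them. The following is a small sketch for that check only. The helper list_raw_wavs is hypothetical and not part of the repository, and the recursive search assumes the wav files may be nested in sub-folders under wav48.

```python
from pathlib import Path


def list_raw_wavs(raw_data_path):
    """Return sorted .wav paths under <raw_data_path>/VCTK-Corpus/wav48.

    Hypothetical sanity check; binarizer.py may discover files differently.
    """
    root = Path(raw_data_path) / "VCTK-Corpus" / "wav48"
    if not root.is_dir():
        raise FileNotFoundError(f"expected a directory at {root}")
    # rglob also finds files nested in sub-folders (an assumption about the layout)
    return sorted(root.rglob("*.wav"))
```

For the tree shown above, calling list_raw_wavs("data/raw") from the project root should return a non-empty list if the corpus is in place.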

To modify the architecture of the model

The current WSRGlow model in model.py is designed for x4 super-resolution and takes waveform, spectrogram and phase information as input.
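To make "spectrogram and phase information" concrete, the sketch below derives magnitude and phase values from a single waveform frame with a naive discrete Fourier transform. It is only an illustration of the two quantities; the feature extraction in model.py is not shown here and may use different frame sizes, windows and libraries. The function name frame_magnitude_and_phase is hypothetical.

```python
import cmath
import math


def frame_magnitude_and_phase(frame):
    """Naive DFT of one frame; returns (magnitudes, phases) for bins 0..N//2.

    Illustrative only: real pipelines use a windowed STFT from an FFT library.
    """
    n = len(frame)
    mags, phases = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))            # one column of a magnitude spectrogram
        phases.append(cmath.phase(acc))  # matching phase, in radians
    return mags, phases


# A cosine that completes exactly two cycles in an 8-sample frame
frame = [math.cos(2 * math.pi * 2 * t / 8) for t in range(8)]
mags, phases = frame_magnitude_and_phase(frame)
```

For this test tone the energy is concentrated in bin 2, so mags[2] is the only value that is not close to zero.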

To train

Run python train.py --config config.yaml on a GPU.

To infer

Edit infer.py to set the path of the checkpoint you want to load and the paths of the wav files you want to use as inputs, then run python infer.py --config config.yaml on a GPU.

Owner

Kexun Zhang
