Implementation of "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis"

Last update: Dec 20, 2022

StrengthNet

This repository implements the model described in the paper "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis" (https://arxiv.org/abs/2110.03156).

Dependencies

Ubuntu 18.04.5 LTS

GPU: Quadro RTX 6000
Driver version: 450.80.02
CUDA version: 11.0

Python 3.5

tensorflow-gpu 2.0.0b1 (cudnn=7.6.0)
scipy
pandas
matplotlib
librosa

Environment set-up

For example,

conda create -n strengthnet python=3.5
conda activate strengthnet
pip install -r requirements.txt
conda install cudnn=7.6.0
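
Once the environment is created, a quick check that the listed packages are importable can save a failed training run later. The snippet below is a minimal sketch and not part of this repository: the helper name missing_modules is hypothetical, and the import names are assumed from the dependency list (tensorflow-gpu is imported as tensorflow). It avoids f-strings so that it also runs on Python 3.5.

```python
import importlib.util
import sys

# Top-level import names assumed from the dependency list
# (the tensorflow-gpu package is imported as "tensorflow").
REQUIRED_MODULES = ["tensorflow", "scipy", "pandas", "matplotlib", "librosa"]


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated strengthnet environment; any name reported as missing needs to be installed before running the scripts below.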

Usage

Run python utils.py to extract the .wav files to .h5 files.
Run python train.py to train a CNN-BLSTM-based StrengthNet.
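
The contents of utils.py are not reproduced here. As a rough illustration of the first half of the extraction step (getting samples out of a .wav file before any features are computed and stored), the following standard-library sketch reads a 16-bit mono PCM file. The function name read_wav_mono16 is hypothetical, and the repository itself presumably relies on librosa, which is listed among the dependencies, so treat this as an assumption-laden stand-in rather than the actual preprocessing.

```python
import array
import sys
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM .wav file; return (sample_rate, samples in [-1, 1))."""
    with wave.open(str(path), "rb") as wav_file:
        if wav_file.getsampwidth() != 2 or wav_file.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        sample_rate = wav_file.getframerate()
        raw = wav_file.readframes(wav_file.getnframes())
    samples = array.array("h")
    samples.frombytes(raw)
    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()
    # Scale signed 16-bit integers to floats.
    return sample_rate, [s / 32768.0 for s in samples]
```

The returned sample list is what a feature extractor would consume next; writing the resulting features to .h5 is left out because it depends on the layout that train.py expects.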

Evaluating new samples

Put the waveforms you wish to evaluate in a folder. For example, / /
Run python test.py --rootdir / /

This script will evaluate all the .wav files in / /, and write the results to / / /StrengthNet_result_raw.txt.

By default, the output/strengthnet.h5 pretrained model is used.
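
The folder-level evaluation described above can be pictured as a loop over the .wav files that collects one score per file and writes them to a text file. The sketch below is only an illustration under stated assumptions: evaluate_folder and score_fn are hypothetical names, score_fn stands in for the pretrained model's prediction, the result file is written into the evaluated folder, and the tab-separated line format is a guess. The real test.py may differ on every one of these points.

```python
from pathlib import Path


def evaluate_folder(rootdir, score_fn, result_name="StrengthNet_result_raw.txt"):
    """Score every .wav file directly under rootdir and write one line per file."""
    root = Path(rootdir)
    wav_paths = sorted(root.glob("*.wav"))
    # score_fn is a placeholder for the model; it takes a path and returns a number.
    results = [(path.name, float(score_fn(path))) for path in wav_paths]
    # Assumed line format: "<file name><TAB><score>"; the real script may differ.
    with open(str(root / result_name), "w", encoding="utf-8") as out:
        for name, score in results:
            out.write("{}\t{:.4f}\n".format(name, score))
    return results
```

For a dry run without a model, pass a constant scorer such as lambda path: 0.0 and inspect the result file to confirm that every waveform in the folder was picked up.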

Citation

If you find this work useful in your research, please consider citing:

@misc{liu2021strengthnet,
      title={StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis}, 
      author={Rui Liu and Berrak Sisman and Haizhou Li},
      year={2021},
      eprint={2110.03156},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}

Resources

The ESD corpus is released by the HLT lab, NUS, Singapore.

The strength scores for the English samples of the ESD corpus are available here.

Acknowledgements

MOSNet: https://github.com/lochenchou/MOSNet

Relative Attributes: Relative Attributes

License

This work is released under the MIT License (see the LICENSE file for details).

Owner

RuiLiu
