This repository contains the data used in the NAACL 2021 paper "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems".

Last update: Dec 04, 2022

Overview

Proteno

This is the data release for the NAACL 2021 paper "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems" (https://arxiv.org/abs/2104.07777).

Security

See CONTRIBUTING for more information.

License

This project is released under CC-BY-NC-4.0 and other licenses, listed by language below:

English: CC-BY-SA
Spanish: CC-BY-SA
Tamil: CC-BY-NC-SA

Citation

If you use our data, please cite the following paper:

@inproceedings{tyagi-etal-2021-proteno,
    title = "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems",
    author = "Tyagi, Shubhi  and
      Bonafonte, Antonio  and
      Lorenzo-Trueba, Jaime  and
      Latorre, Javier",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.naacl-industry.10",
    pages = "72--79",
}
