An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".

Last update: Jun 16, 2022

Related tags

Computer Vision AutoVC

Overview

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

This is an unofficial implementation of AutoVC, based on the official implementation.

The repository is still under construction, so some details may be missing or incomplete.

Preprocessing

python preprocess.py <data_path> <save_path> <encoder_path> [--seg_len seg] [--n_workers workers]
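The usage line above takes three required positional arguments and two optional flags. As a rough illustration only, a parser with that shape could be written with `argparse` as below. This is a sketch, not the repository's actual code: the function name `build_preprocess_parser`, the help strings, and the default values are all assumptions made for the example.

```python
import argparse


def build_preprocess_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the usage line; not the repository's code."""
    parser = argparse.ArgumentParser(prog="preprocess.py")
    # Three required positionals, in the order shown in the usage line.
    # The help strings are guesses at each argument's meaning.
    parser.add_argument("data_path", help="input data location (assumed)")
    parser.add_argument("save_path", help="output location (assumed)")
    parser.add_argument("encoder_path", help="encoder checkpoint (assumed)")
    # Optional flags; these defaults are placeholders, not the real defaults.
    parser.add_argument("--seg_len", type=int, default=128)
    parser.add_argument("--n_workers", type=int, default=4)
    return parser


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_preprocess_parser().parse_args(
    ["data/", "out/", "encoder.pt", "--seg_len", "64"]
)
print(args.data_path, args.save_path, args.encoder_path, args.seg_len, args.n_workers)
```

Omitting a flag falls back to its (placeholder) default, while omitting any positional makes `argparse` exit with a usage error.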

Training

python train.py <config> <data_path> <save_path> [--n_steps steps] [--save_steps save] [--log_steps log] [--batch_size batch] [--seg_len seg]
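The `--n_steps`, `--save_steps`, and `--log_steps` options suggest a step-based training loop that logs and writes checkpoints at fixed intervals. The sketch below shows one common way such options interact. It is an assumption about the general pattern, not a description of what `train.py` actually does, and `training_events` is a hypothetical helper introduced only for this example.

```python
def training_events(n_steps: int, save_steps: int, log_steps: int):
    """Yield (step, event) pairs for a step-based loop (hypothetical helper)."""
    for step in range(1, n_steps + 1):
        # One optimisation step on a batch would run here.
        if step % log_steps == 0:
            yield step, "log"   # e.g. report the running loss
        if step % save_steps == 0:
            yield step, "save"  # e.g. write a checkpoint under <save_path>


# Small values chosen only to keep the schedule easy to read.
events = list(training_events(n_steps=10, save_steps=5, log_steps=2))
print(events)
```

With these small values the loop logs every second step and saves at steps 5 and 10.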

Reference

Please cite the paper if you find it useful.

@InProceedings{pmlr-v97-qian19c,
  title = {{A}uto{VC}: Zero-Shot Voice Style Transfer with Only Autoencoder Loss},
  author = {Qian, Kaizhi and Zhang, Yang and Chang, Shiyu and Yang, Xuesong and Hasegawa-Johnson, Mark},
  pages = {5210--5219},
  year = {2019},
  editor = {Kamalika Chaudhuri and Ruslan Salakhutdinov},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  address = {Long Beach, California, USA},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/qian19c/qian19c.pdf},
  url = {http://proceedings.mlr.press/v97/qian19c.html}
}


Owner

Chien-yu Huang
