Python SDK for working with Voicegain Speech-to-Text

Last update: Dec 14, 2022

Overview

Voicegain Speech-to-Text Python SDK

Python SDK for the Voicegain Speech-to-Text API.

This API allows for large vocabulary speech-to-text transcription as well as grammar-based speech recognition. Both real-time and offline use cases are supported.

You can see the core Voicegain API documentation here.

The complete documentation for the API covered by this SDK is available here. This link requires an account on the Voicegain portal; see below for how to sign up.

Requirements

To use this API you need an account with Voicegain. You can create one by signing up on the Voicegain Portal. No credit card is required to sign up.

You can see pricing here. In short, offline transcription costs 1 cent per minute and real-time transcription costs 1.25 cents per minute. There is a Free Tier of 600 minutes that renews each month.
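
As a rough illustration of the arithmetic, the sketch below estimates a month's cost in cents for a single transcription mode. The function name and constants are made up for this example, and it assumes the free minutes are subtracted from usage before the per-minute rate applies, which the pricing summary does not spell out. Check the pricing page for current rates.

```python
from decimal import Decimal

# Rates and free-tier size as quoted in this README; they may have changed since.
OFFLINE_CENTS_PER_MINUTE = Decimal("1")
REALTIME_CENTS_PER_MINUTE = Decimal("1.25")
FREE_TIER_MINUTES = 600


def estimate_monthly_cost_cents(minutes, cents_per_minute, free_minutes=FREE_TIER_MINUTES):
    """Estimate one month's cost in cents for a single transcription mode.

    Assumption (not confirmed by the pricing summary): free minutes are
    subtracted from usage before the per-minute rate is applied.
    """
    billable_minutes = max(0, minutes - free_minutes)
    return billable_minutes * cents_per_minute


# 1000 minutes in a month, offline vs. real-time
print(estimate_monthly_cost_cents(1000, OFFLINE_CENTS_PER_MINUTE))
print(estimate_monthly_cost_cents(1000, REALTIME_CENTS_PER_MINUTE))
```

Decimal is used instead of float so that the 1.25 cent rate does not pick up rounding noise.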

Installation

From PyPI directly:

pip install voicegain-speech

Examples

sync_transcribe example:

configuration:

from voicegain_speech import ApiClient
from voicegain_speech import Configuration
from voicegain_speech import TranscribeApi
import base64


# configure your JWT token
JWT = "Your JWT token"

configuration = Configuration()
configuration.access_token = JWT

api_client = ApiClient(configuration=configuration)
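
If you would rather not hard-code the token, one option is to read it from an environment variable. The helper below is only a sketch: `load_jwt` and the variable name `VOICEGAIN_JWT` are hypothetical names chosen for this example, and the SDK does not read that variable on its own.

```python
import os


def load_jwt(env=os.environ, var_name="VOICEGAIN_JWT"):
    """Return the JWT stored in the given environment mapping.

    VOICEGAIN_JWT is a hypothetical variable name used for this sketch;
    the SDK itself does not look it up.
    """
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable to your JWT")
    return token
```

With this in place, `configuration.access_token = load_jwt()` would replace the hard-coded `JWT` assignment above.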

transcribe local file:

transcribe_api = TranscribeApi(api_client)
file_path = "Your local file path"

with open(file_path, "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode()

response = transcribe_api.asr_transcribe_post(
    sync_transcription_request={
        "audio": {
            "source": {
                "inline": {
                    "data": audio_base64
                }
            }
        }
    }
)

# take the first alternative, if any were returned
alternatives = response.result.alternatives
if alternatives:
    local_result = alternatives[0].utterance
    print("result from file: ", local_result)
else:
    local_result = None
    print("no transcription")
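
The request-building and result-handling steps above can be factored into two small helpers, sketched here. `build_sync_request` and `first_utterance` are hypothetical names, not part of the SDK, and the `SimpleNamespace` objects only mimic the response attributes accessed in the example (`result.alternatives[i].utterance`) so the sketch runs without network access.

```python
import base64
from types import SimpleNamespace


def build_sync_request(audio_bytes):
    """Wrap raw audio bytes in the inline-audio request body used above."""
    audio_base64 = base64.b64encode(audio_bytes).decode()
    return {"audio": {"source": {"inline": {"data": audio_base64}}}}


def first_utterance(response):
    """Return the first alternative's utterance, or None if there are none."""
    alternatives = response.result.alternatives
    return alternatives[0].utterance if alternatives else None


# Stand-in objects with the same attribute shape as the response used above;
# a real response comes from transcribe_api.asr_transcribe_post(...).
fake_response = SimpleNamespace(
    result=SimpleNamespace(alternatives=[SimpleNamespace(utterance="hello world")])
)
empty_response = SimpleNamespace(result=SimpleNamespace(alternatives=[]))

print(build_sync_request(b"\x00\x01"))
print(first_utterance(fake_response))
print(first_utterance(empty_response))
```

In real use, the dictionary returned by `build_sync_request` would be passed as `sync_transcription_request` and the API response handed to `first_utterance`.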

More examples can be found in the examples folder on our GitHub.

Learn more about Voicegain Platform at www.voicegain.ai

Releases(1.73.0)

1.73.0(Jan 6, 2023)

1.72.0(Dec 15, 2022)

1.71.1(Dec 9, 2022)

1.71.0(Dec 8, 2022)

1.70.2(Nov 23, 2022)

1.70.1(Nov 22, 2022)

1.70.0(Nov 22, 2022)

1.69.0(Nov 17, 2022)

1.68.1(Nov 11, 2022)

1.68.0(Oct 28, 2022)

1.67.0(Oct 25, 2022)

1.66.1(Oct 21, 2022)

1.66.0(Oct 18, 2022)

1.65.0(Sep 27, 2022)

1.64.1(Sep 19, 2022)

1.64.0(Sep 15, 2022)

1.63.0(Sep 7, 2022)

1.62.1(Aug 30, 2022)

1.62.0(Aug 26, 2022)

1.61.0(Aug 18, 2022)

1.60.4(Aug 11, 2022)

1.60.3(Jul 6, 2022)

1.60.2(Jun 30, 2022)

1.60.1(Jun 22, 2022)

1.60.0(Jun 17, 2022)

1.59.2(Jun 15, 2022)

1.59.1(Jun 9, 2022)

1.59.0(Jun 1, 2022)

1.58.1(May 24, 2022)

1.58.0(May 24, 2022)

Owner

Voicegain
