GVT is a generic translation tool for parts of text on the PC screen with Text-to-Speech functionality.

Overview

🎮 🎧 🚀 Generic Visual Translator 🚀 🎧 🎮

GVT is a generic tool that translates parts of the text on the PC screen and reads them aloud with Text-to-Speech. I created it because the existing tools I tried did not satisfy me in ease of use or configuration. I personally used it with Lost Ark (the included example was generated on a 2K monitor) to translate simple quest dialogues into Italian.

📝 Requirements

Tested operating systems: Windows 10/11

Python version: 3.9.6

  • Easynmt
  • OpenCV2
  • Easyocr
  • Numpy
  • Deepl (Unofficial API)
  • Pyttsx3
  • Pywin32
  • WXWidgets
  • Pygame
  • Keyboard

The requirements.txt file lists the versions currently installed on my PC, but GVT may also work with newer or older versions of the same libraries.

Install the requirements with the command pip install -r requirements.txt
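Because the pinned versions reflect a single machine, it can be useful to see which installed packages differ from them. The following is a minimal sketch using only the standard library. The helper names parse_pins and compare_with_installed are hypothetical and not part of GVT, and the sketch assumes requirements.txt uses lines of the form name==version.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package_name: pinned_version} for lines of the form name==version."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0]   # drop trailing comments
        line = line.split(";", 1)[0]       # drop environment markers
        line = line.strip()
        if "==" not in line:
            continue  # skip blank lines and requirements without an exact pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def compare_with_installed(pins, lookup=metadata.version):
    """Return {name: (pinned, installed_or_None)} for packages whose version differs."""
    differences = {}
    for name, pinned in pins.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != pinned:
            differences[name] = (pinned, installed)
    return differences


# Example usage (assumes requirements.txt is in the current folder):
#   with open("requirements.txt", encoding="utf-8") as handle:
#       print(compare_with_installed(parse_pins(handle.read())))
```

A difference reported here does not mean GVT will fail; it only shows where your environment departs from the one the file was generated on.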

💪 How it works

GVT translates a user-defined region of the screen, reads the translation aloud using the Windows 10/11 TTS (not tested on Windows 7), and can show the translated text in place of the text on the screen.

Before using it, you need to configure the config.yaml file in the same folder.

Then you can run GVT using run.bat or with the command python main.py.
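The flow described above can be pictured as a polling loop. This is only a sketch under stated assumptions, not GVT's actual code: run_loop and every callable passed to it (capture, activator_visible, read_text, translate, present) are hypothetical placeholders, and skipping text that was already presented is an assumption made for the illustration.

```python
import time


def run_loop(capture, activator_visible, read_text, translate, present,
             time_between_captures=1.0, max_iterations=None, sleep=time.sleep):
    """Poll the screen and present each newly seen text once.

    Every callable is a placeholder supplied by the caller; none is GVT's real API.
    """
    last_text = None
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        frame = capture()
        if not activator_visible(frame):
            last_text = None  # idle state: forget the previous dialogue
        else:
            text = read_text(frame).strip()
            if text and text != last_text:
                present(translate(text))  # speak and/or show the translation
                last_text = text
        sleep(time_between_captures)
```

Passing the steps in as functions keeps the loop independent of any particular OCR, translation, or TTS library, so it can be tried with simple stand-ins before wiring in real components.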

👀 File description: config.yaml

| Variable name | Type | Description | Recommended |
| --- | --- | --- | --- |
| game_name | string in double quotes | Application name. | |
| source_language | language code of the application (e.g. en, de, ch, jp) | Language of the application. | |
| target_language | language code of the chosen language (e.g. en, de, ch, jp) | Language to translate into. | |
| translation_method | deepl or opus | Translation engine. deepl uses the unofficial API. | deepl |
| translation_internal_method | offline or online | Used only when translation_method selects the internal engine (opus). offline uses the model downloaded into the models\opus-mt folder; the models can be downloaded from https://huggingface.co/Helsinki-NLP. online downloads the model you need automatically. | |
| gpu_enabled | True or False | With True and a supported GPU, reading the text from the screen is much faster. | True |
| time_between_captures | integer | Time that passes before GVT checks for a new element on the screen. | 1 |
| skip_key | string in double quotes, or "None" | If the text can be advanced with a key once it has been read, GVT can advance it automatically when told which key to press. If set to None, GVT does nothing. | |
| show_text | True or False | If set to True, an overlay containing the translated text is shown over the application text. | |
| time_to_wait_for_word | float | If tts_enabled is False and show_text is True, GVT uses this parameter to decide how long to show the overlay text. If tts_enabled is True, this parameter is ignored and the overlay lasts as long as the audio of the text takes to play. | 0.3 |
| tts_enabled | True or False | If enabled, GVT uses Windows Text-to-Speech to read the translated phrase aloud. | |
| tts_voice_number | integer | Use voice_list.py to list all the voices on your system and see which number corresponds to the one you want. | |
| main_region | | Coordinates of the region of the screen where the text to be translated appears. Use GetCoords.py to make your job easier. | |
| main_region > X | integer | Starting point of the region on the X axis. | |
| main_region > Y | integer | Starting point of the region on the Y axis. | |
| main_region > extensionOfX | integer | Number of pixels required to reach the end point of the frame on the X axis. | |
| main_region > extensionOfY | integer | Number of pixels required to reach the end point of the frame on the Y axis. | |
| activator_region | | Coordinates where GVT looks for the activator image that signals text to be translated. Once the image is found, GVT proceeds with the translation. Once it disappears, GVT returns to the idle state. | |
| activator_region > name | string, or "None" | Name of the image that you cut from a screenshot and that signals the appearance of text to be translated in the application. It must be placed in the activators folder. | |
| activator_region > X | integer | Starting point of the region on the X axis. | |
| activator_region > Y | integer | Starting point of the region on the Y axis. | |
| activator_region > extensionOfX | integer | Number of pixels required to reach the end point of the frame on the X axis. | |
| activator_region > extensionOfY | integer | Number of pixels required to reach the end point of the frame on the Y axis. | |
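A few of the constraints listed above can be checked before launching GVT. The sketch below is an illustration only: find_config_problems is a hypothetical helper, not something GVT ships, and it works on an already loaded dictionary. It also assumes that translation_internal_method only matters when translation_method is opus.

```python
TRANSLATION_METHODS = ("deepl", "opus")
INTERNAL_METHODS = ("offline", "online")
BOOLEAN_KEYS = ("gpu_enabled", "show_text", "tts_enabled")
REGION_KEYS = ("X", "Y", "extensionOfX", "extensionOfY")


def find_config_problems(config):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    method = config.get("translation_method")
    if method not in TRANSLATION_METHODS:
        problems.append("translation_method must be deepl or opus")
    # Assumption: the internal method is only relevant for the opus engine.
    if method == "opus" and config.get("translation_internal_method") not in INTERNAL_METHODS:
        problems.append("translation_internal_method must be offline or online")
    for key in BOOLEAN_KEYS:
        if not isinstance(config.get(key), bool):
            problems.append(f"{key} must be True or False")
    for region_name in ("main_region", "activator_region"):
        region = config.get(region_name)
        if not isinstance(region, dict):
            problems.append(f"{region_name} is missing")
            continue
        for key in REGION_KEYS:
            value = region.get(key)
            # bool is a subclass of int in Python, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append(f"{region_name} > {key} must be an integer")
    return problems
```

If PyYAML is installed (an assumption about your environment), the dictionary can be obtained with yaml.safe_load on the contents of config.yaml and passed straight to this function.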

🚀 Getting started

This is an example based on the Lost Ark video game.

  • Clone this repository to your PC, or download the folder, and enter it
  • Launch Lost Ark and reach a dialogue scene
  • Run runCoordHelper.bat or the command python GetCoords.py
  • Press Z on the upper left point of the text box
  • Press Z on the lower right point of the text box
  • Copy the coordinates from the console into the empty fields under main_region in the config.yaml file and close the console
  • Find the dot or icon that appears whenever the text to be translated appears; in the case of Lost Ark it is the Leave button at the bottom right
  • Press Shift + Win + S on Windows 10 or 11, select this image, and save it in the activators folder with a recognizable name
  • Run runCoordHelper.bat again or the command python GetCoords.py
  • Use the same method as above to get the coordinates of a not too narrow box surrounding the activator image in the game
  • Copy the coordinates from the console into the empty fields under activator_region in the config.yaml file and close the console
  • Set source_language to the code of the language you want to translate from, and target_language to the language you want to translate the game into (see https://github.com/ptrstn/deepl-translate for the reference table of languages supported by deepl, or https://huggingface.co/Helsinki-NLP for the opus models)
  • Set the dialogue skip key (skip_key) if desired; otherwise leave it at None. Note: leave it at None if your game has a strict anti-cheat system that does not allow anything other than you to press the keys of your keyboard
  • Set show_text and tts_enabled according to what you want enabled or disabled
  • If you have set tts_enabled to True, run runVoiceList.bat or python voice_list.py to find the number matched to each voice installed on your Windows system (it is the one in square brackets) and set tts_voice_number to the desired number
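The two Z presses in the steps above yield an upper left and a lower right point. The sketch below shows one way those two points could map onto a region block of config.yaml. It is a guess for illustration: region_from_corners and as_yaml_block are hypothetical helpers, and the sketch assumes that extensionOfX and extensionOfY hold the coordinates of the lower right point, which is what the example values in this README (such as 2559 and 1439 on a 2K monitor) appear to suggest. If GetCoords.py prints something different, use its output as-is.

```python
def region_from_corners(upper_left, lower_right):
    """Build a region dictionary from two (x, y) points.

    Assumption: extensionOfX/extensionOfY store the lower right coordinates.
    """
    x1, y1 = upper_left
    x2, y2 = lower_right
    if x2 <= x1 or y2 <= y1:
        raise ValueError("the second point must be below and to the right of the first")
    return {"X": x1, "Y": y1, "extensionOfX": x2, "extensionOfY": y2}


def as_yaml_block(name, region):
    """Format a region dictionary as an indented block for config.yaml."""
    lines = [f"{name}:"]
    lines += [f"  {key}: {value}" for key, value in region.items()]
    return "\n".join(lines)


# Example usage with the main_region values from this README:
#   print(as_yaml_block("main_region", region_from_corners((567, 1304), (2068, 1439))))
```

The check on the point order catches the easy mistake of pressing Z on the lower right corner first.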

Here is an example of the complete file 📋

game_name:  Lost_Ark
source_language: en
target_language: it
translation_method: deepl
translation_internal_method: offline
gpu_enabled: True
time_between_captures: 1
skip_key: "g"
show_text: False
time_to_wait_for_word: 0.3
tts_enabled: True
tts_voice_number: 0

main_region: 
  X: 567
  Y: 1304
  extensionOfX: 2068
  extensionOfY: 1439
activator_region:
  name: "lost_ark.png"
  X: 2
  Y: 1308
  extensionOfX: 2559
  extensionOfY: 1439

  • Execute run.bat

💭 To Do

  • Add the ability to define multiple regions and activators at once
  • Add support for multiple games, selectable from a menu
Owner
Nuked
189 Jan 02, 2023