Python Markov Chain chatbot running on Telegram

Overview

Hanasubot

Hanasubot (Japanese 話すボット, talking bot) is a Python chatbot running on Telegram. The bot is based on Markov Chains so it can learn your word instantly, unlike neural network chatbots which require training. It uses a modified version of markovify library for that purporse. However, the output may not make sense at all, though it can sometimes generate hilarious replies.

In theory, the bot can learn in any languages, but for some languages word segmentation is required. The bot currently supports Chinese and Japanese word segmentation, with pkuseg, CkipTagger and mecab. Language detection relies on pycld2.

Hanasubot has a permission system so you can easily stop the bot learning from naughty kids in your group, while still reply them. Users with admin right can erase lines from bot corpus as well.

The bot is designed for Chinese Telegram groups so there are a lot of messages written in Chinese. I18n will happen in future and any help is welcome.

Installation

Python 3.6+ is required.

VENV_PATH=/path/to/your/venv  # Change this
python3 -m venv $VENV_PATH
source $VENV_PATH/bin/activate

pip3 install -r requirements.txt

If you are using Python 3.6, dataclasses 0.8 is required as well:

pip3 install dataclasses==0.8

For Python 3.7 and up, dataclasses is included so no need to install it.

To use CkipTagger for Traditional Chinese tokenization, you have to download the model file (see CkipTagger readme for a detailed guide):

python3 -c "from ckiptagger import data_utils; data_utils.download_data_gdown('./')"

Then unzip to a folder named ckipdata, in the same directory as the Python scripts.

Optionally, you can initialize the user dict for pkuseg and CkipTagger, before start running the bot:

touch ./pkuseg_dict.txt
touch ./ckip_dict.json

Configuration

Copy config.example.py and fill it out. Please check the comments in config file.

cp config.example.py config.py

After that, simply start the bot:

python3 tgbot.py

Bot commands and usage

Simply reply to the bot and it will say some random words if you have collected enough corpus. The bot will also learn from your message instantly. Special commands are as follows.

Require root

  • /reload_config - Reload config file without restarting the bot. Some entries cannot be dynamically reloaded though, see config.example.py for details.

Require admin

  • /erase - Remove lines from corpus. (Non-admins can only erase lines sent by themselves.)
  • /userweight - Set user weight.
  • /ban - Set user right to -1.
  • /restrict - Set user right to 1.
  • /grantnormal - Set user right to 2.
  • /granttrusted -Set user right to 3.
  • /grantadmin - Set user right to 4. Admins are able to add/remove other admins with above commands. See also the user right levels section.

Require trusted

  • /addword_cn - Add a word into pkuseg user dictionary.
  • /addword_tw - Add a word into CkipTagger user dictionary.
  • /rmword_cn - Remove a word from pkuseg user dictionary.
  • /rmword_tw - Remove a word from CkipTagger user dictionary.

Other commands

  • /clddbg - Test language detection of some texts.
  • /cutdbg - Test tokenization of some texts.
  • /policy - See what data is collected by the bot and so on.
  • /reload - Claim your admin rights after you get Telegram group admin.
  • /source - See the source code.
  • /start - Start chatting, useful when you can't find the bot messages to reply.

Database

Initialize

CREATE TABLE IF NOT EXISTS chat(
    chat_id integer PRIMARY KEY,
    chat_tgid integer NOT NULL UNIQUE,
    chat_name text
);
CREATE TABLE IF NOT EXISTS user(
    user_id integer PRIMARY KEY,
    user_tgid integer NOT NULL UNIQUE,
    user_name text,
    user_right integer DEFAULT 2,
    user_weight real DEFAULT 1.0
);
CREATE TABLE IF NOT EXISTS corpus(
    corpus_id integer PRIMARY KEY,
    corpus_time integer,
    corpus_line text NOT NULL UNIQUE,
    corpus_raw integer REFERENCES raw,
    corpus_chat integer REFERENCES chat,
    corpus_user integer REFERENCES user,
    corpus_weight real DEFAULT 1.0
);
CREATE TABLE IF NOT EXISTS raw(
    raw_id integer PRIMARY KEY,
    raw_text text UNIQUE
);

User right levels

  • 5 - root.
  • 4 - admin, can change user rights (except root users), can erase a line from corpus, and can set user_weight and corpus_weight (WIP).
  • 3 - trusted user, can feed the bot via private messages, and can add words into dictionary (for tokenization purposes).
  • 2 - normal user.
  • 1 - restricted user, bot will not write their messages into database.
  • -1 - banned user, bot will not reply to their messages.

TODOs

  • Let admins set corpus_weight
  • Batch /erase

License

MIT

A Python Library to interface with LinkedIn API, OAuth and JSON responses

#Overview Here's another library based on the LinkedIn API, OAuth and JSON responses. Hope this documentation explains everything you need to get star

Mike Helmick 69 Dec 11, 2022
ClearML - Auto-Magical Suite of tools to streamline your ML workflow. Experiment Manager, MLOps and Data-Management

ClearML - Auto-Magical Suite of tools to streamline your ML workflow Experiment Manager, MLOps and Data-Management ClearML Formerly known as Allegro T

ClearML 3.9k Jan 01, 2023
A telegram bot to interact with a Minecraft Server

telegram-mc-bot A telegram bot to interact with a Minecraft Server It has the following commands: /status - Returns the server status (Online/Offline)

KleynArt 1 Dec 09, 2021
A simple library for interacting with Amazon S3.

BucketStore is a very simple Amazon S3 client, written in Python. It aims to be much more straight-forward to use than boto3, and specializes only in

Jacobi Petrucciani 219 Oct 03, 2022
Minimal API for the COVID Booking System of the Offices at the UniPD Math Dep

Simple and easy to use python BOT for the COVID registration booking system of the math department @ unipd (torre archimede). This API creates an interface with the official website, with more useful

Guglielmo Camporese 4 Dec 24, 2021
Notion API Database Python Implementation

Python Notion Database Notion API Database Python Implementation created only by database from the official Notion API. Installing / Getting started p

minwook 78 Dec 19, 2022
Some 3Commas helper bots, AltRank, GalaxyScore, Watchlist, Auto-Compound

3Commas Cyber Bot Helpers A collection of 3Commas bot helpers I wrote. (collection will grow over time) Disclaimer THE SOFTWARE IS PROVIDED "AS IS", W

Ron Klinkien 176 Jan 02, 2023
Python wrapper to simplify calls to AncestryDNA API.

AncestryDNA API wrapper Ancestry exposes an undocumented REST API for its DNA features. This Python wrapper inventories the available calls, and expos

Matt 2 Jun 10, 2022
Twitter-bot - A Simple Twitter bot with python

twitterbot To use this bot, You will require API Key and Access Key. Signup at h

Bentil Shadrack 8 Nov 18, 2022
A virus/stealer made in py

python-virus A virus/stealer made in py. Features: Discord token stealer, Password stealer, Windows key stealer, Credit-card stealer, Image grab, Anti

SKYNETMARCI 5 Dec 12, 2022
Send OpenWeatherMap alerts (One Call API) to telegram users.

OpenWeatherMap Telegram Alert Send OpenWeatherMap alerts (One Call API) to telegram users. Installation Requirements: $ apt install python3-yaml pytho

Michael Hacker 1 Jun 04, 2022
A Bot To Find Telegram User ID Easily

Telegram ID Bot 🤖 A Bot To Find Telegram User ID Easily Made with Python3 (C) @BXBotz Copyright permission under MIT License License - https://githu

MuFaz-TG 6 Nov 21, 2022
Python SDK for accessing the Hanko Authentication API

Hanko Authentication SDK for Python This package is maintained by Hanko. Contents Introduction Documentation Installation Usage Prerequisites Create a

Hanko.io 3 Mar 08, 2022
Info & tools for reverse engineering the M6 smart fitness band

m6-reveng This repo contains information and tools for reverse engineering the $7 M6 smart fitness band. Hardware The SoC (system-on-a-chip) is a Teli

41 Dec 26, 2022
A python telegram bot to fetch the details of an ipadress with help of ip-api

ipfetcher A python(Pyrogram) oriented telegram bot to fetch the details of an ipadress developed by @riz4d with the API of https://ip-api.com Deployme

Mohamed Rizad 5 Mar 12, 2022
The Simple Google Colab Notebook to Download Files from Direct Link to Google Drive with custom name and bulk link support.

Direct Link to Google Drive (Advanced! 🔥 ) The Most Advanced yet Simple Google Colab Notebook to Download Files from Direct Link to Google Drive. 🆕

Dr.Caduceus 14 Jul 26, 2022
Wrapper for vk_api lib for faster bot buliding

Welcome to VKBotPod repository! Wrapper for vk_api lib for faster bot buliding Features Simple syntax Rich functionality Special thanks to movpushmov

NullPointerException 3 Jan 14, 2022
Spotify playlist anonymizer.

Spotify heavily personalizes auto-generated playlists like Song Radio based on the music you've listened to in the past. But sometimes you want to listen to Song Radio precisely to hear some fresh so

Jakob de Maeyer 9 Nov 27, 2022
This is a cryptocurrency trading bot that analyses Reddit sentiment and places trades on Binance based on reddit post and comment sentiment. If you like this project please consider donating via brave. Thanks.

This is a cryptocurrency trading bot that analyses Reddit sentiment and places trades on Binance based on reddit post and comment sentiment. The bot f

Andrei 157 Dec 15, 2022
A Python wrapper for the WooCommerce API.

WooCommerce API - Python Client A Python wrapper for the WooCommerce REST API. Easily interact with the WooCommerce REST API using this library. Insta

WooCommerce 171 Dec 25, 2022