Genshin Impact Gacha Record Dataset (原神抽卡记录数据集-Genshin Impact gacha data)

Last update: Dec 27, 2022

Related tags

Text Data & NLP genshin-impact

Overview

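Summary

Genshin Impact gacha records are still being collected.

You can export your gacha history as JSON with a gacha record export tool and send the JSON file to [email protected]. I will remove personal information and then add the file here. Either of the two export tools below will do.

A gacha record export tool from sunfkny (usage demo video)

Another gacha record export tool, built on Electron, from lvlvl

The dataset currently contains 195,917 gacha records.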

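Data usage terms

As an individual, you are free to use this project's data to study the gacha mechanics, and you are free to modify and publish my analysis code (although you would probably be better off rewriting it from scratch).

However, do not republish this gacha dataset on other platforms or merge it into them. If that happens, anyone who later combines gacha data from several sources may run into serious duplication problems. Please direct anyone who wants the data to download it from GitHub, or state that the data comes from this project.

Whenever you draw a conclusion from this dataset, ask yourself whether the process was rigorous and whether the conclusion is credible. Do not publish gacha models that are obviously wrong, or that are wrong and could cause harm. If harm results, neither the dataset maintainer nor the players who contributed data bear any responsibility.

After some time studying the data, I have worked out essentially all of the mechanics of Genshin Impact's gacha system:

Summary of all Genshin Impact gacha mechanics

Some tools for analyzing gacha mechanics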

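Data format

The contents of the dataset_02 folder are numbered sequentially starting from 0001.

Each folder holds the gacha records of one account.

gacha100.csv holds pulls from the Beginners' Wish banner (初行者推荐祈愿)

gacha200.csv holds pulls from the standard (permanent) wish banner (常驻祈愿)

gacha301.csv holds pulls from the Character Event Wish banner (角色活动祈愿)

gacha302.csv holds pulls from the Weapon Event Wish banner (武器活动祈愿)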

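Records in the CSV files use the following format

Pull time (抽卡时间)	Name (名称)	Type (类别)	Rarity (星级)
YYYY-MM-DD HH:MM:SS	Full item name	Character (角色) / Weapon (武器)	3/4/5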
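
As a rough illustration of the record format above, the following sketch parses gacha rows into typed records. It is not part of the project's code. The names `Pull` and `read_pulls` are hypothetical, and the sketch assumes that the files are comma-separated, start with a header row, and store the four columns in the order shown above. Check an actual file before relying on any of these assumptions.

```python
import csv
import io
from datetime import datetime
from typing import NamedTuple


class Pull(NamedTuple):
    """One gacha record (hypothetical helper type, not from the project)."""
    time: datetime
    name: str
    kind: str      # item type as stored in the file, e.g. 角色 or 武器
    rarity: int    # 3, 4 or 5


def read_pulls(stream, has_header=True):
    """Parse rows of the form: pull time, name, type, rarity.

    Assumes comma-separated values and an optional header row.
    """
    reader = csv.reader(stream)
    if has_header:
        next(reader, None)  # skip the header row if present
    pulls = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        time_str, name, kind, rarity = row[:4]
        pulls.append(
            Pull(
                time=datetime.strptime(time_str, "%Y-%m-%d %H:%M:%S"),
                name=name,
                kind=kind,
                rarity=int(rarity),
            )
        )
    return pulls


# Made-up sample rows in the documented column order.
sample = (
    "抽卡时间,名称,类别,星级\n"
    "2021-01-01 12:00:00,Item A,武器,3\n"
    "2021-01-01 12:00:05,Item B,角色,4\n"
)
pulls = read_pulls(io.StringIO(sample))
```

To read a real file, pass an open file object instead of the `StringIO`, for example one opened with `open(path, encoding="utf-8", newline="")`. Both the path layout and the UTF-8 encoding are assumptions here.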

Analysis tools

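DataAnalysis.py analyzes the gacha CSV files. The code is still being rewritten and is very awkward to use, so treat it as a reference only. When run, it prints reference statistics and plots distributions. The theoretical values in the plots come from a probability-growth model that I inferred from the actual data and from some of the game files.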
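
One simple reference statistic of this kind is the number of pulls between consecutive 5-star items. The sketch below is not taken from DataAnalysis.py. The function name `pulls_between` is hypothetical, and the input is assumed to be the rarity column of one banner for one account, sorted oldest first.

```python
def pulls_between(rarities, target=5):
    """Return (intervals, pending).

    intervals: pull counts needed for each item of rarity `target`,
               counted from the previous such item (or from the start).
    pending:   pulls made since the last such item, still without a hit.

    Assumes `rarities` is in chronological order, oldest first.
    """
    intervals = []
    count = 0
    for rarity in rarities:
        count += 1
        if rarity == target:
            intervals.append(count)
            count = 0
    return intervals, count


# Made-up example: 5-star items on the 3rd pull and again 2 pulls later.
example = [3, 3, 5, 4, 5, 3]
intervals, pending = pulls_between(example)
mean_interval = sum(intervals) / len(intervals) if intervals else None
```

The first interval is only meaningful if the export covers the account's full history on that banner. Otherwise the pulls before the first recorded item are unknown and that interval should be discarded.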

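DistributionMatrix.py analyzes the pull probabilities and distributions of a designed model with the 4-star and 5-star mechanics coupled. It is a powerful tool for computing a gacha model's overall probability and expected values.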
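
To make "overall probability and expected values" concrete, here is a much simpler single-rarity sketch that ignores the 4-star/5-star coupling handled by DistributionMatrix.py. It is not the project's method. The function names are hypothetical, and the probabilities in the example are toy values, not the game's rates. The input is a list in which entry k is the chance of a hit on pull k, given that all earlier pulls since the last hit missed. The last entry must be 1.0 (a guaranteed hit) for the distribution to sum to 1.

```python
def pull_count_distribution(cond_probs):
    """P(first hit happens exactly on pull k) for k = 1..len(cond_probs).

    cond_probs[k-1] is the hit chance on pull k given no hit so far.
    """
    dist = []
    survive = 1.0  # probability of having missed every pull so far
    for p in cond_probs:
        dist.append(survive * p)
        survive *= 1.0 - p
    return dist


def expected_pulls(dist):
    """Mean number of pulls per hit for a distribution over k = 1, 2, ..."""
    return sum(k * q for k, q in enumerate(dist, start=1))


# Toy model (NOT the game's rates): 50% on the first pull, guaranteed on the second.
toy_probs = [0.5, 1.0]
dist = pull_count_distribution(toy_probs)
mean = expected_pulls(dist)
overall_rate = 1.0 / mean  # long-run fraction of pulls that are hits
```

In the toy model the hit lands on pull 1 or pull 2 with equal probability, so the expected number of pulls per hit is 1.5 and the long-run hit rate is 2/3, higher than the 50% base chance because of the guarantee.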

