SimCSE在中文任务上的简单实验

Related tags

MiscellaneousSimCSE
Overview

SimCSE 中文测试

SimCSE在常见中文数据集上的测试,包含ATECBQLCQMCPAWSXSTS-B共5个任务。

介绍

文件

- utils.py  工具函数
- eval.py  评测主文件

评测

命令格式:

python eval.py [model_type] [pooling] [task_name] [dropout_rate]

使用例子:

python eval.py BERT cls ATEC 0.3

其中四个参数必须传入,含义分别如下:

- model_type: 模型,必须是['BERT', 'RoBERTa', 'WoBERT', 'RoFormer', 'BERT-large', 'RoBERTa-large', 'SimBERT', 'SimBERT-tiny', 'SimBERT-small']之一;
- pooling: 池化方式,必须是['first-last-avg', 'last-avg', 'cls', 'pooler']之一;
- task_name: 评测数据集,必须是['ATEC', 'BQ', 'LCQMC', 'PAWSX', 'STS-B']之一;
- dropout_rate: 浮点数,dropout的比例,如果为0则不dropout;

环境

测试环境:tensorflow 1.14 + keras 2.3.1 + bert4keras 0.10.5,如果在其他环境组合下报错,请根据错误信息自行调整代码。

下载

Google官方的两个BERT模型:

关于语义相似度数据集,可以从数据集对应的链接自行下载,也可以从作者提供的百度云链接下载。

其中senteval_cn目录是评测数据集汇总,senteval_cn.zip是senteval目录的打包,两者下其一就好。

相关

交流

QQ交流群:808623966,微信群请加机器人微信号spaces_ac_cn

Owner
苏剑林(Jianlin Su)
科学爱好者
苏剑林(Jianlin Su)
Generalise Prometheus metrics. takes out server specific, replaces variables and such.

Generalise Prometheus metrics. takes out server specific, replaces variables and such. makes it easier to copy from Prometheus console straight to Grafana.

ziv 5 Mar 28, 2022
Dev-meme - A repository that contains memes just for people like us

A repository that contains memes just for people like us. Coders are constantly

Padmashree Jha 4 Oct 31, 2022
This is friendlist update tools & old idz clon & follower idz clon etc

This is friendlist update tools & old idz clon & follower idz clon etc

MAHADI HASAN AFRIDI 1 Jan 15, 2022
A simple script that can watch a list of directories for change and does some action

plot_watcher A simple script that can watch a list of directories and does some action when a specific kind of change happens In its current implement

Charaf Errachidi 12 Sep 10, 2021
A free and open-source chess improvement app that combines the power of Lichess and Anki.

A free and open-source chess improvement app that combines the power of Lichess and Anki. Chessli Project Activity & Issue Tracking PyPI Build & Healt

93 Nov 23, 2022
Standard mutable string (character array) implementation for Python.

chararray A standard mutable character array implementation for Python.

Tushar Sadhwani 3 Dec 18, 2021
Hacking and Learning consistently for 100 days straight af.

#100DaysOfHacking Hacking and Learning consistently for 100 days straight af. [yes, no breaks except mental-break ones, Obviously.] This Repo is one s

FENIL SHAH 17 Sep 09, 2022
Cross-Encoder-with-Bi-Encoder를 활용한 WebPage 데모

Retrieval_Streamlit_Demo Cross-Encoder-with-Bi-Encoder를 활용한

5 Dec 29, 2021
Safe temperature monitor for baby's room. Made for Raspberry Pi Pico.

Baby Safe Temperature Monitor This project is meant to build a temperature safety monitor for a baby or small child's room. Studies have shown the ris

Jeff Geerling 72 Oct 09, 2022
Script Repository for the ICGM-CNRS FRANCE

Here you will find my Python Work repesitory for the ICGM institute - Montpellier - France.

CABOS Matthieu 1 Apr 13, 2022
fast_bss_eval is a fast implementation of the bss_eval metrics for the evaluation of blind source separation.

fast_bss_eval Do you have a zillion BSS audio files to process and it is taking days ? Is your simulation never ending ? Fear no more! fast_bss_eval i

Robin Scheibler 99 Dec 13, 2022
This is a Python program I wrote to simulate the solar system with 79 lines of code.

Solar System With Python This is a Python program I wrote to simulate the solar system with 79 lines of code. Required modules tkinter, math, time Why

Mehmet Aydoğmuş 1 Oct 26, 2021
Fastest python library for making asynchronous group requests.

FGrequests: Fastest Asynchronous Group Requests Installation Install using pip: pip install fgrequests Documentation Pretty easy to use. import fgrequ

Farid Chowdhury 14 Nov 22, 2022
In this project we will be using OpenCV to virtually drag a rectangle and drop it at a different location. It will be further used for Virtual Mouse.

Virtual Drag & Drog using OpenCV In this project we will be using OpenCV to virtually drag a rectangle and drop it at a different location. It will be

Hassan Shahzad 5 Sep 27, 2021
Check bookings for TUM libraries.

TUM Library Checker Only for educational purposes This repository contains a crawler to save bookings for TUM libraries in a CSV file. Sample data fro

Leon Blumenthal 3 Jan 27, 2022
Building an Investment Portfolio for Day Trade with Python

Montando um Portfólio de Investimentos para Day Trade com Python Instruções: Para reproduzir o projeto no Google Colab, faça o download do repositório

Paula Campigotto 9 Oct 26, 2021
A python tool that creates issues in your repos based on TODO comments in your code

Krypto A neat little sidekick python script to create issues on your repo based on comments left in the code on your behalf Convert todo comments in y

Alex Antoniou 4 Oct 26, 2021
Blender 3.0 Python - Open temporary areas in the Text Editor

PopDrawers When editing text in Blender, it can be handy to have areas like Info, Console, Outliner, etc visible on screen to help with scripting. How

SpectralVectors 7 Nov 16, 2022
AutoMetamon: Simple program to play Metamon automatically

AutoMetamon: Simple program to play Metamon automatically

Ngô Văn Tuấn 2 Sep 13, 2022
DownTime-Score is a Small project aimed to Monitor the performance and the availabillity of a variety of the Vital and Critical Moroccan Web Portals

DownTime-Score DownTime-Score is a Small project aimed to Monitor the performance and the availabillity of a variety of the Vital and Critical Morocca

adnane-tebbaa 5 Apr 30, 2022