Revisiting Self-Training for Few-Shot Learning of Language Model.

Last update: Nov 19, 2022

SFLM

This repository contains the implementation of the paper Revisiting Self-Training for Few-Shot Learning of Language Model. SFLM stands for Self-training for Few-shot learning of Language Model.

Requirements

To run the code, install the dependencies with the following command:

pip install -r requirements.txt

Preprocess

The original data can be obtained from LM-BFF. To generate the data for the few-shot experiments, run the following command:

python tools/generate_data.py

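The original data is expected in ./data/original, and the sampled data is written to ./data/few-shot/$K-$MU-$SEED (the example in the Train section reads from data/few-shot/SST-2/16-4-100, so each task appears to get its own subdirectory). See ./tools/generate_data.py for more options.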
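
To make the $K-$MU-$SEED naming more concrete, here is a minimal sketch of seeded few-shot sampling. It is not the code in tools/generate_data.py. It assumes that K is the number of labeled examples per class and that MU is the ratio of unlabeled to labeled examples per class, which this README does not state. The function names sample_few_shot and few_shot_dir are hypothetical.

```python
import random
from collections import defaultdict


def sample_few_shot(examples, k, mu, seed):
    """Split (text, label) pairs into a labeled and an unlabeled subset.

    Assumption: k labeled examples per class, plus mu * k further examples
    per class whose labels are discarded to form the unlabeled set.
    """
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    labeled, unlabeled = [], []
    for label in sorted(by_label):
        pool = by_label[label][:]
        rng.shuffle(pool)
        labeled.extend(pool[:k])
        # Keep only the text for the unlabeled portion.
        unlabeled.extend(text for text, _ in pool[k:k + mu * k])
    return labeled, unlabeled


def few_shot_dir(k, mu, seed, root="./data/few-shot"):
    """Build the directory name following the $K-$MU-$SEED pattern."""
    return f"{root}/{k}-{mu}-{seed}"
```

Under these assumptions, K=16, MU=4 and SEED=100 on a two-class task would yield 32 labeled and 128 unlabeled examples in a directory ending in 16-4-100.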

Train

The code can be run as in the following example:

python3 run.py \
  --task_name SST-2 \
  --data_dir data/few-shot/SST-2/16-4-100 \
  --do_train \
  --do_eval \
  --do_predict \
  --evaluate_during_training \
  --model_name_or_path roberta-base \
  --few_shot_type prompt-demo \
  --num_k 16 \
  --max_seq_length 256 \
  --per_device_train_batch_size 2 \
  --per_device_eval_batch_size 16 \
  --gradient_accumulation_steps 4 \
  --learning_rate 1e-5 \
  --max_steps 1000 \
  --logging_steps 100 \
  --eval_steps 100 \
  --num_train_epochs 0 \
  --output_dir result/SST-2-16-4-100 \
  --save_logit_dir result/SST-2-16-4-100 \
  --seed 100 \
  --template "*cls**sent_0*_It_was*mask*.*sep+*" \
  --mapping "{'0':'terrible','1':'great'}" \
  --num_sample 16 \
  --threshold 0.95 \
  --lam1 0.5 \
  --lam2 0.1
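
The --mapping value in the command above is written as a Python dict literal that maps each label id to a label word. As an illustration only, such a string could be parsed safely as sketched below. How run.py actually parses it is not shown in this README, and parse_mapping is a hypothetical helper.

```python
import ast


def parse_mapping(mapping_str):
    """Parse a label-to-word mapping given as a Python dict literal."""
    # ast.literal_eval only accepts literals, so arbitrary code is rejected.
    mapping = ast.literal_eval(mapping_str)
    if not isinstance(mapping, dict):
        raise ValueError("mapping must be a dict literal")
    return {str(label): str(word) for label, word in mapping.items()}


label_words = parse_mapping("{'0':'terrible','1':'great'}")
```

With the SST-2 mapping from the example, label "0" maps to the word "terrible" and label "1" to "great", which are the words scored at the *mask* position of the template.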

Most arguments are the same as in LM-BFF, and our experiments use the same manual prompts. The additional arguments used by SFLM are:

threshold: The confidence threshold used to filter out low-confidence samples from the self-training loss
lam1: The weight of the self-training loss
lam2: The weight of the self-supervised loss
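
The three arguments above can be read as parts of one weighted objective. The sketch below is a plain-Python illustration, not the repository's implementation. It assumes a FixMatch-style pseudo-labeling scheme in which the pseudo-label comes from a weakly augmented view and the loss is computed on a strongly augmented view, and it assumes the supervised loss has weight 1. The function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def self_training_loss(weak_logits, strong_logits, threshold):
    """Cross-entropy of strong-view predictions against pseudo-labels.

    Assumption: samples whose weak-view confidence is below `threshold`
    contribute zero, and the sum is averaged over all samples.
    """
    if not weak_logits:
        return 0.0
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        probs = softmax(weak)
        confidence = max(probs)
        if confidence < threshold:
            continue  # low-confidence sample is filtered out
        pseudo_label = probs.index(confidence)
        total += -math.log(softmax(strong)[pseudo_label])
    return total / len(weak_logits)


def total_loss(supervised, self_training, self_supervised, lam1, lam2):
    """Weighted sum: supervised + lam1 * self-training + lam2 * self-supervised."""
    return supervised + lam1 * self_training + lam2 * self_supervised
```

With --threshold 0.95, only unlabeled samples whose most likely class has probability of at least 0.95 would contribute to the self-training term in this sketch.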

Citation

Please cite our paper if you use SFLM in your work:

@inproceedings{chen2021revisit,
    title={Revisiting Self-Training for Few-Shot Learning of Language Model},
    author={Chen, Yiming and Zhang, Yan and Zhang, Chen and Lee, Grandee and Cheng, Ran and Li, Haizhou},
    booktitle={EMNLP},
    year={2021},
}

Acknowledgements

The code is based on LM-BFF. We thank the authors of LM-BFF for making their code public.
