The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Last update: Nov 21, 2022

Overview

Language Models are Few-shot Multilingual Learners

Paper

This is the source code of the paper "Language Models are Few-shot Multilingual Learners" [arXiv] [ACL Anthology].

This code is written in PyTorch. If you use the source code or datasets included in this toolkit in your work, please cite the following paper:

@inproceedings{winata-etal-2021-language,
    title = "Language Models are Few-shot Multilingual Learners",
    author = "Winata, Genta Indra  and
      Madotto, Andrea  and
      Lin, Zhaojiang  and
      Liu, Rosanne  and
      Yosinski, Jason  and
      Fung, Pascale",
    booktitle = "Proceedings of the 1st Workshop on Multilingual Representation Learning",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.mrl-1.1",
    pages = "1--15",
}

Setup Environment

GPU Machine

pip install -r requirements.txt

GPU Machine for Running GPT-J 6B Model

apt install zstd

# the "slim" version contains only bf16 weights and no optimizer parameters, which minimizes bandwidth and memory
wget -c https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd

tar -I zstd -xf step_383500_slim.tar.zstd

pip install -r mesh_transformer_jax/requirements.txt

# jax 0.2.12 is required due to a regression with xmap in 0.2.13
pip install mesh-transformer-jax/ jax==0.2.12

# replace cuda101 with the tag for your CUDA version (cuda101 corresponds to CUDA 10.1)
pip install jaxlib==0.1.67+cuda101 -f https://storage.googleapis.com/jax-releases/jax_releases.html
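The CUDA suffix in the jaxlib requirement follows a simple pattern: the major and minor version digits are concatenated, so CUDA 10.1 becomes cuda101. A minimal sketch of that mapping is below. The helper name jaxlib_requirement is hypothetical and not part of this repository, and the sketch does not check whether a wheel is actually published for the resulting tag.

```python
def jaxlib_requirement(cuda_version: str, jaxlib_version: str = "0.1.67") -> str:
    """Build a pip requirement string from a CUDA version such as '10.1'.

    Hypothetical helper for illustration only. It formats the tag and does
    not verify that a matching wheel exists on the release index.
    """
    parts = cuda_version.strip().split(".")
    # Only the major and minor components are used; a patch level is ignored.
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"expected a CUDA version like '10.1', got {cuda_version!r}")
    major, minor = parts[0], parts[1]
    return f"jaxlib=={jaxlib_version}+cuda{major}{minor}"


# CUDA 10.1 reproduces the requirement used in the install command above.
print(jaxlib_requirement("10.1"))
```

The printed string can be passed to pip install together with the same -f index URL as in the command above.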

How to run

Zero-shot Cross-task

❱❱❱ CUDA_VISIBLE_DEVICES=0 python evaluate.py --dataset snips --model_checkpoint facebook/bart-large-mnli --cuda --length 5 --label_type value --src_lang en --tgt_lang en --seed 42 --use_log_prob --use_confidence --is_cross_task

Finetune

❱❱❱ CUDA_VISIBLE_DEVICES=0 python finetune.py --dataset snips --model_checkpoint bert-base-multilingual-uncased --cuda --label_type value --src_lang en --tgt_lang en --seed 42
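The evaluation and finetuning commands share most of their flags, so repeating a run over several seeds can be scripted. The following is a minimal sketch that only builds and prints the command lines; it does not run them. The helper names (build_command, evaluate_command, finetune_command) and the extra seeds 43 and 44 are assumptions for illustration. The flag names and values are taken from the two commands above. Flags are emitted in a different order than above, which should not matter for a conventional argument parser.

```python
import shlex
from typing import Dict, List, Union

FlagValue = Union[str, int, bool]


def build_command(script: str, options: Dict[str, FlagValue]) -> List[str]:
    """Turn an options dict into an argv list (hypothetical helper).

    Boolean True becomes a bare switch such as --cuda, boolean False is
    omitted, and any other value becomes a '--name value' pair.
    """
    command = ["python", script]
    for name, value in options.items():
        if isinstance(value, bool):
            if value:
                command.append(f"--{name}")
        else:
            command.extend([f"--{name}", str(value)])
    return command


# Flags that appear in both commands of this README.
SHARED: Dict[str, FlagValue] = {
    "dataset": "snips",
    "cuda": True,
    "label_type": "value",
    "src_lang": "en",
    "tgt_lang": "en",
}


def evaluate_command(seed: int) -> List[str]:
    """Zero-shot cross-task evaluation command for one seed."""
    options = dict(
        SHARED,
        model_checkpoint="facebook/bart-large-mnli",
        length=5,
        seed=seed,
        use_log_prob=True,
        use_confidence=True,
        is_cross_task=True,
    )
    return build_command("evaluate.py", options)


def finetune_command(seed: int) -> List[str]:
    """Finetuning command for one seed."""
    options = dict(SHARED, model_checkpoint="bert-base-multilingual-uncased", seed=seed)
    return build_command("finetune.py", options)


# Seed 42 is the one used above; 43 and 44 are illustrative additions.
for seed in (42, 43, 44):
    print(shlex.join(evaluate_command(seed)))
    print(shlex.join(finetune_command(seed)))
```

To execute a command instead of printing it, the argv list can be passed to subprocess.run with CUDA_VISIBLE_DEVICES set in the environment, as in the commands above.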

Owner

Genta Indra Winata
