LAnguage Model Analysis

Related tags

Deep LearningLAMA
Overview

LAMA: LAnguage Model Analysis

LAMA

LAMA is a probe for analyzing the factual and commonsense knowledge contained in pretrained language models.

The dataset for the LAMA probe is available at https://dl.fbaipublicfiles.com/LAMA/data.zip

LAMA contains a set of connectors to pretrained language models.
LAMA exposes a transparent and unique interface to use:

  • Transformer-XL (Dai et al., 2019)
  • BERT (Devlin et al., 2018)
  • ELMo (Peters et al., 2018)
  • GPT (Radford et al., 2018)
  • RoBERTa (Liu et al., 2019)

Actually, LAMA is also a beautiful animal.

Reference:

The LAMA probe is described in the following papers:

@inproceedings{petroni2019language,
  title={Language Models as Knowledge Bases?},
  author={F. Petroni, T. Rockt{\"{a}}schel, A. H. Miller, P. Lewis, A. Bakhtin, Y. Wu and S. Riedel},
  booktitle={In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019},
  year={2019}
}

@inproceedings{petroni2020how,
  title={How Context Affects Language Models' Factual Predictions},
  author={Fabio Petroni and Patrick Lewis and Aleksandra Piktus and Tim Rockt{\"a}schel and Yuxiang Wu and Alexander H. Miller and Sebastian Riedel},
  booktitle={Automated Knowledge Base Construction},
  year={2020},
  url={https://openreview.net/forum?id=025X0zPfn}
}

The LAMA probe

To reproduce our results:

1. Create conda environment and install requirements

(optional) It might be a good idea to use a separate conda environment. It can be created by running:

conda create -n lama37 -y python=3.7 && conda activate lama37
pip install -r requirements.txt

2. Download the data

wget https://dl.fbaipublicfiles.com/LAMA/data.zip
unzip data.zip
rm data.zip

3. Download the models

DISCLAIMER: ~55 GB on disk

Install spacy model

python3 -m spacy download en

Download the models

chmod +x download_models.sh
./download_models.sh

The script will create and populate a pre-trained_language_models folder. If you are interested in a particular model please edit the script.

4. Run the experiments

python scripts/run_experiments.py

results will be logged in output/ and last_results.csv.

Other versions of LAMA

LAMA-UHN

This repository also provides a script (scripts/create_lama_uhn.py) to create the data used in (Poerner et al., 2019).

Negated-LAMA

This repository also gives the option to evalute how pretrained language models handle negated probes (Kassner et al., 2019), set the flag use_negated_probes in scripts/run_experiments.py. Also, you should use this version of the LAMA probe https://dl.fbaipublicfiles.com/LAMA/negated_data.tar.gz

What else can you do with LAMA?

1. Encode a list of sentences

and use the vectors in your downstream task!

pip install -e git+https://github.com/facebookresearch/LAMA#egg=LAMA
import argparse
from lama.build_encoded_dataset import encode, load_encoded_dataset

PARAMETERS= {
        "lm": "bert",
        "bert_model_name": "bert-large-cased",
        "bert_model_dir":
        "pre-trained_language_models/bert/cased_L-24_H-1024_A-16",
        "bert_vocab_name": "vocab.txt",
        "batch_size": 32
        }

args = argparse.Namespace(**PARAMETERS)

sentences = [
        ["The cat is on the table ."],  # single-sentence instance
        ["The dog is sleeping on the sofa .", "He makes happy noises ."],  # two-sentence
        ]

encoded_dataset = encode(args, sentences)
print("Embedding shape: %s" % str(encoded_dataset[0].embedding.shape))
print("Tokens: %r" % encoded_dataset[0].tokens)

# save on disk the encoded dataset
encoded_dataset.save("test.pkl")

# load from disk the encoded dataset
new_encoded_dataset = load_encoded_dataset("test.pkl")
print("Embedding shape: %s" % str(new_encoded_dataset[0].embedding.shape))
print("Tokens: %r" % new_encoded_dataset[0].tokens)

2. Fill a sentence with a gap.

You should use the symbol [MASK] to specify the gap. Only single-token gap supported - i.e., a single [MASK].

python lama/eval_generation.py  \
--lm "bert"  \
--t "The cat is on the [MASK]."

cat_on_the_phone

cat_on_the_phone

source: https://commons.wikimedia.org/wiki/File:Bluebell_on_the_phone.jpg

Note that you could use this functionality to answer cloze-style questions, such as:

python lama/eval_generation.py  \
--lm "bert"  \
--t "The theory of relativity was developed by [MASK] ."

Install LAMA with pip

Clone the repo

git clone [email protected]:facebookresearch/LAMA.git && cd LAMA

Install as an editable package:

pip install --editable .

If you get an error in mac os x, please try running this instead

CFLAGS="-Wno-deprecated-declarations -std=c++11 -stdlib=libc++" pip install --editable .

Language Model(s) options

Option to indicate which language model(s) to use:

  • --language-models/--lm : comma separated list of language models (REQUIRED)

BERT

BERT pretrained models can be loaded both: (i) passing the name of the model and using huggingface cached versions or (ii) passing the folder containing the vocabulary and the PyTorch pretrained model (look at convert_tf_checkpoint_to_pytorch in here to convert the TensorFlow model to PyTorch).

  • --bert-model-dir/--bmd : directory that contains the BERT pre-trained model and the vocabulary
  • --bert-model-name/--bmn : name of the huggingface cached versions of the BERT pre-trained model (default = 'bert-base-cased')
  • --bert-vocab-name/--bvn : name of vocabulary used to pre-train the BERT model (default = 'vocab.txt')

RoBERTa

  • --roberta-model-dir/--rmd : directory that contains the RoBERTa pre-trained model and the vocabulary (REQUIRED)
  • --roberta-model-name/--rmn : name of the RoBERTa pre-trained model (default = 'model.pt')
  • --roberta-vocab-name/--rvn : name of vocabulary used to pre-train the RoBERTa model (default = 'dict.txt')

ELMo

  • --elmo-model-dir/--emd : directory that contains the ELMo pre-trained model and the vocabulary (REQUIRED)
  • --elmo-model-name/--emn : name of the ELMo pre-trained model (default = 'elmo_2x4096_512_2048cnn_2xhighway')
  • --elmo-vocab-name/--evn : name of vocabulary used to pre-train the ELMo model (default = 'vocab-2016-09-10.txt')

Transformer-XL

  • --transformerxl-model-dir/--tmd : directory that contains the pre-trained model and the vocabulary (REQUIRED)
  • --transformerxl-model-name/--tmn : name of the pre-trained model (default = 'transfo-xl-wt103')

GPT

  • --gpt-model-dir/--gmd : directory that contains the gpt pre-trained model and the vocabulary (REQUIRED)
  • --gpt-model-name/--gmn : name of the gpt pre-trained model (default = 'openai-gpt')

Evaluate Language Model(s) Generation

options:

  • --text/--t : text to compute the generation for
  • --i : interactive mode
    one of the two is required

example considering both BERT and ELMo:

python lama/eval_generation.py \
--lm "bert,elmo" \
--bmd "pre-trained_language_models/bert/cased_L-24_H-1024_A-16/" \
--emd "pre-trained_language_models/elmo/original/" \
--t "The cat is on the [MASK]."

example considering only BERT with the default pre-trained model, in an interactive fashion:

python lamas/eval_generation.py  \
--lm "bert"  \
--i

Get Contextual Embeddings

python lama/get_contextual_embeddings.py \
--lm "bert,elmo" \
--bmn bert-base-cased \
--emd "pre-trained_language_models/elmo/original/"

Unified vocabulary

The intersection of the vocabularies for all considered models

Troubleshooting

If the module cannot be found, preface the python command with PYTHONPATH=.

If the experiments fail on GPU memory allocation, try reducing batch size.

Acknowledgements

Other References

  • (Kassner et al., 2019) Nora Kassner, Hinrich Schütze. Negated LAMA: Birds cannot fly. arXiv preprint arXiv:1911.03343, 2019.

  • (Poerner et al., 2019) Nina Poerner, Ulli Waltinger, and Hinrich Schütze. BERT is Not a Knowledge Base (Yet): Factual Knowledge vs. Name-Based Reasoning in Unsupervised QA. arXiv preprint arXiv:1911.03681, 2019.

  • (Dai et al., 2019) Zihang Dai, Zhilin Yang, Yiming Yang, Jaime G. Carbonell, Quoc V. Le, and Ruslan Salakhutdi. Transformer-xl: Attentive language models beyond a fixed-length context. CoRR, abs/1901.02860.

  • (Peters et al., 2018) Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. NAACL-HLT 2018

  • (Devlin et al., 2018) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.04805.

  • (Radford et al., 2018) Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training.

  • (Liu et al., 2019) Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.

Licence

LAMA is licensed under the CC-BY-NC 4.0 license. The text of the license can be found here.

Owner
Meta Research
Meta Research
This is the official repository for our paper: ''Pruning Self-attentions into Convolutional Layers in Single Path''.

Pruning Self-attentions into Convolutional Layers in Single Path This is the official repository for our paper: Pruning Self-attentions into Convoluti

Zhuang AI Group 77 Dec 26, 2022
Refactoring dalle-pytorch and taming-transformers for TPU VM

Text-to-Image Translation (DALL-E) for TPU in Pytorch Refactoring Taming Transformers and DALLE-pytorch for TPU VM with Pytorch Lightning Requirements

Kim, Taehoon 61 Nov 07, 2022
Official code for "On the Frequency Bias of Generative Models", NeurIPS 2021

Frequency Bias of Generative Models Generator Testbed Discriminator Testbed This repository contains official code for the paper On the Frequency Bias

35 Nov 01, 2022
Framework for Spectral Clustering on the Sparse Coefficients of Learned Dictionaries

Dictionary Learning for Clustering on Hyperspectral Images Overview Framework for Spectral Clustering on the Sparse Coefficients of Learned Dictionari

Joshua Bruton 6 Oct 25, 2022
The source code of the paper "SHGNN: Structure-Aware Heterogeneous Graph Neural Network"

SHGNN: Structure-Aware Heterogeneous Graph Neural Network The source code and dataset of the paper: SHGNN: Structure-Aware Heterogeneous Graph Neural

Wentao Xu 7 Nov 13, 2022
Implementing DeepMind's Fast Reinforcement Learning paper

Fast Reinforcement Learning This is a repo where I implement the algorithms in the paper, Fast reinforcement learning with generalized policy updates.

Marcus Chiam 6 Nov 28, 2022
Rocket-recycling with Reinforcement Learning

Rocket-recycling with Reinforcement Learning Developed by: Zhengxia Zou I have long been fascinated by the recovery process of SpaceX rockets. In this

Zhengxia Zou 202 Jan 03, 2023
PyTorch Lightning + Hydra. A feature-rich template for rapid, scalable and reproducible ML experimentation with best practices. ⚡🔥⚡

Lightning-Hydra-Template A clean and scalable template to kickstart your deep learning project 🚀 ⚡ 🔥 Click on Use this template to initialize new re

Łukasz Zalewski 2.1k Jan 09, 2023
Classification Modeling: Probability of Default

Credit Risk Modeling in Python Introduction: If you've ever applied for a credit card or loan, you know that financial firms process your information

Aktham Momani 2 Nov 07, 2022
Effect of Deep Transfer and Multi task Learning on Sperm Abnormality Detection

Effect of Deep Transfer and Multi task Learning on Sperm Abnormality Detection Introduction This repository includes codes and models of "Effect of De

Amir Abbasi 5 Sep 05, 2022
Implementation for Simple Spectral Graph Convolution in ICLR 2021

Simple Spectral Graph Convolutional Overview This repo contains an example implementation of the Simple Spectral Graph Convolutional (S^2GC) model. Th

allenhaozhu 64 Dec 31, 2022
A state-of-the-art semi-supervised method for image recognition

Mean teachers are better role models Paper ---- NIPS 2017 poster ---- NIPS 2017 spotlight slides ---- Blog post By Antti Tarvainen, Harri Valpola (The

Curious AI 1.4k Jan 06, 2023
BridgeGAN - Tensorflow implementation of Bridging the Gap between Label- and Reference-based Synthesis in Multi-attribute Image-to-Image Translation.

Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021) Tensorflow implementation of Bridging the Gap between Label- and Reference-ba

huangqiusheng 8 Jul 13, 2022
Code to run experiments in SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression.

Code to run experiments in SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression. Not an official Google product. Me

Google Research 27 Dec 12, 2022
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdh

Ayan Kumar Bhunia 22 Aug 27, 2022
PyTorch implementation for ComboGAN

ComboGAN This is our ongoing PyTorch implementation for ComboGAN. Code was written by Asha Anoosheh (built upon CycleGAN) [ComboGAN Paper] If you use

Asha Anoosheh 139 Dec 20, 2022
In this project we investigate the performance of the SetCon model on realistic video footage. Therefore, we implemented the model in PyTorch and tested the model on two example videos.

Contrastive Learning of Object Representations Supervisor: Prof. Dr. Gemma Roig Institutions: Goethe University CVAI - Computational Vision & Artifici

Dirk Neuhäuser 6 Dec 08, 2022
Pytorch implementation of YOLOX、PPYOLO、PPYOLOv2、FCOS an so on.

简体中文 | English miemiedetection 概述 miemiedetection是女装大佬咩酱基于YOLOX进行二次开发的个人检测库(使用的深度学习框架为pytorch),支持Windows、Linux系统,以女装大佬咩酱的名字命名。miemiedetection是一个不需要安装的

248 Jan 02, 2023
REGTR: End-to-end Point Cloud Correspondences with Transformers

REGTR: End-to-end Point Cloud Correspondences with Transformers This repository contains the source code for REGTR. REGTR utilizes multiple transforme

Zi Jian Yew 108 Dec 17, 2022
AITUS - An atomatic notr maker for CYTUS

AITUS an automatic note maker for CYTUS. 利用AI根据指定乐曲生成CYTUS游戏谱面。 效果展示:https://www

GradiusTwinbee 6 Feb 24, 2022