Code for layerwise detection of linguistic anomaly paper (ACL 2021)

Last update: Dec 07, 2022

Layerwise Anomaly

This repository contains the source code and data for our ACL 2021 paper: "How is BERT surprised? Layerwise detection of linguistic anomalies" by Bai Li, Zining Zhu, Guillaume Thomas, Yang Xu, and Frank Rudzicz.

Citation

If you use our work in your research, please cite:

Li, B., Zhu, Z., Thomas, G., Xu, Y., and Rudzicz, F. (2021) How is BERT surprised? Layerwise detection of linguistic anomalies. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL).

@inproceedings{li2021layerwise,
  author = "Li, Bai and Zhu, Zining and Thomas, Guillaume and Xu, Yang and Rudzicz, Frank",
  title = "How is BERT surprised? Layerwise detection of linguistic anomalies",
  booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL)",
  publisher = "Association for Computational Linguistics",
  year = "2021",
}

Dependencies

The project was developed with the following library versions. Running with other versions may crash or produce incorrect results.

Python 3.7.5
CUDA Version: 11.0
torch==1.7.1
transformers==4.5.1
numpy==1.19.0
pandas==0.25.3
scikit-learn==0.22
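
Because other versions may give incorrect results, it can help to compare the active environment against the pinned list before running anything. The following is a minimal sketch, not part of the repository: the helper names `find_mismatches` and `installed_versions` are hypothetical, and it assumes the `importlib_metadata` backport is installed when running on Python 3.7.

```python
# Pinned versions copied from the list above; the helpers below are illustrative only.
PINNED = {
    "torch": "1.7.1",
    "transformers": "4.5.1",
    "numpy": "1.19.0",
    "pandas": "0.25.3",
    "scikit-learn": "0.22",
}


def _base(version):
    """Drop a local build suffix such as '+cu110' so '1.7.1+cu110' compares as '1.7.1'."""
    return None if version is None else version.split("+")[0]


def find_mismatches(installed, pinned=PINNED):
    """Return {package: (pinned, installed_or_None)} for every package that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in pinned.items()
        if _base(installed.get(name)) != want
    }


def installed_versions(names):
    """Look up installed versions by distribution name; missing packages map to None."""
    try:
        from importlib import metadata  # standard library on Python 3.8+
    except ImportError:
        import importlib_metadata as metadata  # assumed backport on Python 3.7
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, (want, have) in find_mismatches(installed_versions(PINNED)).items():
        print(f"{name}: pinned {want}, found {have}")
```

An empty result means every pinned library matches. The check does not cover the Python or CUDA versions.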

Setup Instructions

Clone this repo: git clone https://github.com/SPOClab-ca/layerwise-anomaly
Download BNC Baby (4-million-word sample) from this link and extract into data/bnc/
Run BNC preprocessing script: python scripts/process_bnc.py --bnc_dir=data/bnc/download/Texts --to=data/bnc.pkl
Clone BLiMP repo: cd data && git clone https://github.com/alexwarstadt/blimp
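
After the steps above, the experiment commands in this README expect data/bnc.pkl and data/blimp/data/ to exist. A small sketch for confirming that layout from the repository root follows; the function name `missing_paths` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Paths taken from the setup steps and the experiment commands in this README.
EXPECTED = ["data/bnc.pkl", "data/blimp/data"]


def missing_paths(repo_root=".", expected=EXPECTED):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    print("Setup looks complete." if not missing else f"Missing: {missing}")
```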

GMM experiments on BLiMP (Figure 2 and Appendix A)

PYTHONPATH=. time python scripts/blimp_anomaly.py \
  --bnc_path=data/bnc.pkl \
  --blimp_path=data/blimp/data/ \
  --out=blimp_result
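
For readers unfamiliar with density-based anomaly scoring, the general idea can be sketched with scikit-learn: fit a Gaussian mixture on vectors from ordinary text, then treat the negative log-likelihood of a new vector as its surprisal. This is an illustration under stated assumptions, not the code in scripts/blimp_anomaly.py. Random vectors stand in for the model's layer embeddings, and the function names and the choice of `n_components=1` are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_density(train_vectors, n_components=1, seed=0):
    """Fit a full-covariance Gaussian mixture to the training vectors."""
    gmm = GaussianMixture(
        n_components=n_components, covariance_type="full", random_state=seed
    )
    return gmm.fit(train_vectors)


def surprisal(gmm, vectors):
    """Negative log-likelihood per row; larger values mean more anomalous."""
    return -gmm.score_samples(vectors)


rng = np.random.default_rng(0)
train = rng.normal(size=(500, 8))   # stand-in for embeddings of ordinary text
typical = np.zeros((1, 8))          # a vector near the centre of the training data
unusual = np.full((1, 8), 6.0)      # a vector far from the training data

model = fit_density(train)
scores = surprisal(model, np.vstack([typical, unusual]))
print(scores)  # the second score should be much larger than the first
```

In a layerwise setting, one such density model would be fitted per layer, so the same input can be scored at each depth of the network.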

Frequency correlation (Figure 3 and Appendix B)

Run the notebooks/FreqSurprisal.ipynb notebook.

Surprisal gap experiments (Figure 4)

PYTHONPATH=. time python scripts/run_surprisal_gaps.py \
  --bnc_path=data/bnc.pkl \
  --out=surprisal_gaps

Accuracy scores (Table 2)

PYTHONPATH=. time python scripts/run_accuracy.py \
  --model_name=roberta-base \
  --anomaly_model=gmm
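
Minimal-pair benchmarks such as BLiMP are commonly scored by checking, for each pair, whether the acceptable sentence receives the lower anomaly score. The sketch below shows that metric only. Whether scripts/run_accuracy.py computes exactly this is an assumption, and `pairwise_accuracy` is a hypothetical name.

```python
def pairwise_accuracy(good_scores, bad_scores):
    """Fraction of pairs where the acceptable sentence has the lower score.

    Ties count as incorrect.
    """
    if len(good_scores) != len(bad_scores):
        raise ValueError("score lists must have the same length")
    if not good_scores:
        raise ValueError("at least one pair is required")
    correct = sum(g < b for g, b in zip(good_scores, bad_scores))
    return correct / len(good_scores)


# Made-up scores for four sentence pairs, for illustration only.
good = [1.0, 2.0, 5.0, 0.5]
bad = [3.0, 1.0, 6.0, 0.7]
print(pairwise_accuracy(good, bad))  # 0.75: three of the four pairs are ranked correctly
```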

Run unit tests

PYTHONPATH=. pytest tests
