Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021)

Last update: Sep 20, 2022

Skyformer

This repository is the official implementation of Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021).

Requirements

To install requirements in a conda environment:

conda create -n skyformer python=3.6
conda activate skyformer
pip install -r requirements.txt

Note: The additional requirements for data preprocessing are not included in requirements.txt; they are listed under Data Preparation.

Data Preparation

Processed files can be downloaded here, or processed with the following steps:

Requirements

tensorboard>=2.3.0
tensorflow>=2.3.1
tensorflow-datasets>=4.0.1

Download the TFDS files for pathfinder and then set _PATHFINER_TFDS_PATH to the unzipped directory (following https://github.com/google-research/long-range-arena/issues/11)
Download lra_release.gz (7.7 GB).
Unzip lra_release.gz and put the extracted contents under ./data/.

cd data
wget https://storage.googleapis.com/long-range-arena/lra_release.gz
tar zxvf lra_release.gz

Create a directory lra_processed under ./data/.

mkdir lra_processed
cd ..
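
The download, extraction, and directory-creation steps above can also be scripted. The following is a minimal Python sketch, not part of the repository: the function name prepare_data_dir and its parameters are hypothetical, and it assumes the archive is a gzip-compressed tar file (as the tar zxvf command implies).

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the wget command above.
LRA_URL = "https://storage.googleapis.com/long-range-arena/lra_release.gz"


def prepare_data_dir(data_dir="data", archive=None, download=True):
    """Fetch (optionally) and extract the LRA archive, then create lra_processed.

    Hypothetical helper: mirrors the shell steps, nothing more.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    archive = Path(archive) if archive else data_dir / "lra_release.gz"

    if download and not archive.exists():
        # The archive is about 7.7 GB, so this can take a long time.
        urllib.request.urlretrieve(LRA_URL, str(archive))

    # Assumption: despite the .gz suffix, the file is a gzipped tar archive.
    with tarfile.open(str(archive), "r:gz") as tar:
        tar.extractall(str(data_dir))

    processed = data_dir / "lra_processed"
    processed.mkdir(exist_ok=True)
    return processed


# Usage (run from the repository root; triggers the full download):
#   prepare_data_dir("data")
```

Passing download=False with an existing archive path skips the network step, which is useful when the file has already been fetched with wget.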

The directory structure should then be as follows (assuming the root dir is code):

./data/lra_processed
./data/long-range-arena-main
./data/lra_release
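
Before running the preprocessing scripts, it may help to confirm that this layout is in place. The sketch below is an illustration only: missing_dirs is a hypothetical helper, and it uses the name lra_processed, matching the mkdir command above.

```python
from pathlib import Path

# Expected directories, relative to the repository root.
# lra_processed matches the name created by `mkdir lra_processed`.
EXPECTED_DIRS = (
    "data/lra_processed",
    "data/long-range-arena-main",
    "data/lra_release",
)


def missing_dirs(root="."):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


# Usage (from the repository root):
#   problems = missing_dirs(".")
#   if problems:
#       print("Missing directories:", ", ".join(problems))
```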

Create train, dev, and test dataset pickle files for each task.

cd preprocess
python create_pathfinder.py
python create_listops.py
python create_retrieval.py
python create_text.py
python create_cifar10.py

Note: most of the source code comes from the LRA repo.
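
The five preprocessing scripts can also be run in sequence from a small driver, stopping at the first failure. This is a sketch under the assumption that each script takes no arguments, as in the commands above; run_all and build_commands are hypothetical names, not part of the repository.

```python
import subprocess
import sys

# Script names as listed in the commands above.
SCRIPTS = [
    "create_pathfinder.py",
    "create_listops.py",
    "create_retrieval.py",
    "create_text.py",
    "create_cifar10.py",
]


def build_commands(python=sys.executable):
    """Build one command per preprocessing script."""
    return [[python, script] for script in SCRIPTS]


def run_all(preprocess_dir="preprocess", dry_run=False):
    """Run every script inside preprocess_dir; raise on the first failure."""
    commands = build_commands()
    for cmd in commands:
        print("Running:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run(cmd, cwd=preprocess_dir, check=True)
    return commands


# Usage (from the repository root):
#   run_all("preprocess")
```

Calling run_all(dry_run=True) only prints the commands, which is a cheap way to check the order before starting the long-running jobs.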

Run

Modify the configuration in config.py and run:

python main.py --mode train --attn skyformer --task lra-text

mode: train, eval
attn: softmax, nystrom, linformer, reformer, performer, informer, bigbird, kernelized, skyformer
task: lra-listops, lra-pathfinder, lra-retrieval, lra-text, lra-image
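
To compare several attention variants across tasks, the command line above can be generated for each combination. The sketch below only builds and prints the commands; sweep_commands is a hypothetical helper, and the attention list is a small subset of the values listed above, chosen for illustration.

```python
import itertools

# A subset of the --attn values listed above, chosen for illustration.
ATTENTIONS = ["softmax", "nystrom", "skyformer"]
# The --task values listed above.
TASKS = ["lra-listops", "lra-pathfinder", "lra-retrieval", "lra-text", "lra-image"]


def sweep_commands(mode="train", attentions=ATTENTIONS, tasks=TASKS):
    """Return one main.py command (as an argument list) per (attn, task) pair."""
    if mode not in ("train", "eval"):
        raise ValueError("mode must be 'train' or 'eval'")
    return [
        ["python", "main.py", "--mode", mode, "--attn", attn, "--task", task]
        for attn, task in itertools.product(attentions, tasks)
    ]


# Print the commands so they can be reviewed or pasted into a job script.
for cmd in sweep_commands():
    print(" ".join(cmd))
```

Each argument list can be passed to subprocess.run to launch the job directly; printing first makes it easy to review the sweep before committing compute to it.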

Reference

@inproceedings{Skyformer,
    title={Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method}, 
    author={Yifan Chen and Qi Zeng and Heng Ji and Yun Yang},
    booktitle={NeurIPS},
    year={2021}
}

Owner

Qi Zeng
