An implementation of the efficient attention module.

Last update: Dec 15, 2022

Overview

Efficient Attention

Description

Efficient attention is an attention mechanism that substantially reduces memory and computational cost while retaining the same expressive power as conventional dot-product attention. Its costs grow linearly with the number of input positions rather than quadratically. The illustration above compares the two types of attention. The efficient attention module is a drop-in replacement for the non-local module (Wang et al., 2018), and it:

uses fewer resources to achieve the same accuracy;
achieves higher accuracy with the same resource constraints (by allowing more insertions); and
is applicable in domains and models where the non-local module is not (due to resource constraints).
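
To make the resource difference concrete, here is a minimal NumPy sketch of the two orderings. It is not the repository's code: the function names, the toy sizes, and the choice to normalize queries over channels and keys over positions are assumptions made for illustration, and the dot-product version omits any scaling factor. The point is that dot-product attention materializes an n-by-n attention map, while efficient attention only materializes a d_k-by-d_v context matrix.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(q, k, v):
    # Builds an (n, n) attention map: quadratic in the number of positions.
    attention_map = softmax(q @ k.T, axis=1)
    return attention_map @ v

def efficient_attention(q, k, v):
    # Assumed normalization: queries over channels, keys over positions.
    q_norm = softmax(q, axis=1)
    k_norm = softmax(k, axis=0)
    # Builds a (d_k, d_v) context matrix: independent of n in size.
    context = k_norm.T @ v
    return q_norm @ context

rng = np.random.default_rng(0)
n, d_k, d_v = 1024, 16, 32  # toy sizes, chosen arbitrarily
q = rng.standard_normal((n, d_k))
k = rng.standard_normal((n, d_k))
v = rng.standard_normal((n, d_v))

out_dot = dot_product_attention(q, k, v)  # intermediate: 1024 x 1024
out_eff = efficient_attention(q, k, v)    # intermediate: 16 x 32
```

Under these normalizations, the attention map that efficient attention applies implicitly (`q_norm @ k_norm.T`) still has rows that sum to one, even though it is never formed.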

Resources

YouTube:

Presentation: https://youtu.be/_wnjhTM04NM

bilibili (for users in Mainland China):

Presentation: https://www.bilibili.com/video/BV1tK4y1f7Rm
Presentation in Chinese: https://www.bilibili.com/video/bv1Gt4y1Y7E3

Implementation details

This repository implements the efficient attention module with softmax normalization, output reprojection, and residual connection.
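
The three pieces named above can be sketched without any framework as follows. This is an illustrative NumPy version under stated assumptions, not the module shipped in this repository: the input is taken to be already flattened to one row per position, the projections are plain matrices without biases, and the names `efficient_attention_module`, `w_q`, `w_k`, `w_v`, and `w_r` are hypothetical.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def efficient_attention_module(x, w_q, w_k, w_v, w_r):
    """x has shape (n, c): n positions, c channels. All weights are hypothetical."""
    # Softmax normalization: queries over channels, keys over positions.
    q = softmax(x @ w_q, axis=1)   # (n, d_k)
    k = softmax(x @ w_k, axis=0)   # (n, d_k)
    v = x @ w_v                    # (n, d_v)
    context = k.T @ v              # (d_k, d_v) global context
    attended = q @ context         # (n, d_v)
    reprojected = attended @ w_r   # output reprojection back to c channels
    return x + reprojected         # residual connection

rng = np.random.default_rng(0)
n, c, d_k, d_v = 64, 8, 4, 6  # toy sizes, chosen arbitrarily
x = rng.standard_normal((n, c))
w_q = rng.standard_normal((c, d_k))
w_k = rng.standard_normal((c, d_k))
w_v = rng.standard_normal((c, d_v))
w_r = rng.standard_normal((d_v, c))

out = efficient_attention_module(x, w_q, w_k, w_v, w_r)
```

Because of the residual connection, the output has the same shape as the input, and a reprojection of all zeros makes the module an identity mapping.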

Features not in the paper

This repository additionally implements the multi-head mechanism, which was not in the paper. To learn more about the mechanism, refer to Vaswani et al. (2017).
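
One plausible way to combine multiple heads with efficient attention is sketched below in NumPy. It is an assumption-laden illustration rather than this repository's implementation: the channels of the queries, keys, and values are split evenly into `head_count` groups, each group is attended independently, and the results are concatenated. The function names and toy sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def efficient_attention(q, k, v):
    # Queries normalized over channels, keys over positions (assumed).
    return softmax(q, axis=1) @ (softmax(k, axis=0).T @ v)

def multi_head_efficient_attention(q, k, v, head_count):
    if q.shape[1] % head_count or v.shape[1] % head_count:
        raise ValueError("channel counts must be divisible by head_count")
    # Split the channel axis into equal groups, one group per head.
    q_heads = np.split(q, head_count, axis=1)
    k_heads = np.split(k, head_count, axis=1)
    v_heads = np.split(v, head_count, axis=1)
    heads = [
        efficient_attention(qh, kh, vh)
        for qh, kh, vh in zip(q_heads, k_heads, v_heads)
    ]
    # Concatenate head outputs back along the channel axis.
    return np.concatenate(heads, axis=1)

rng = np.random.default_rng(0)
n, d_k, d_v, head_count = 32, 8, 12, 4  # toy sizes, chosen arbitrarily
q = rng.standard_normal((n, d_k))
k = rng.standard_normal((n, d_k))
v = rng.standard_normal((n, d_v))

out = multi_head_efficient_attention(q, k, v, head_count)
```

With `head_count=1` this reduces to the single-head computation, so the multi-head variant is a strict generalization in this sketch.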

Citation

The paper will appear at WACV 2021. If you use, compare with, or refer to this work, please cite

@inproceedings{shen2021efficient,
    author = {Zhuoran Shen and Mingyuan Zhang and Haiyu Zhao and Shuai Yi and Hongsheng Li},
    title = {Efficient Attention: Attention with Linear Complexities},
    booktitle = {WACV},
    year = {2021},
}

Owner

Shen Zhuoran
