The code for two papers: Feedback Transformer and Expire-Span.

Last update: Dec 25, 2022

transformer-sequential

This repo contains the code for two papers:

Feedback Transformer
Expire-Span

The training code is structured for modeling long sequences with Transformer-like architectures.

Requirements

You will need a CUDA-enabled GPU to run the code.
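
Before launching an experiment, it can help to confirm that a GPU is actually visible. The snippet below is a minimal sketch, assuming the code base runs on PyTorch and that `torch` is installed via requirements.txt. The helper name `cuda_status` is hypothetical and not part of this repo.

```python
import importlib.util


def cuda_status() -> str:
    """Report whether PyTorch can see a CUDA device (hypothetical helper)."""
    # Assumption: the training code depends on PyTorch.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    if torch.cuda.is_available():
        return "cuda available"
    return "no cuda device visible"


if __name__ == "__main__":
    print(cuda_status())
```

If this prints anything other than "cuda available", the experiment scripts are unlikely to run on the current machine.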

Setup

Run the following:

pip install -r requirements.txt

Feedback Transformer

Introduced in "Addressing Some Limitations of Transformers with Feedback Memory".

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Feedback Transformer	77M	0.984	0.962

Numbers are bits per character (BPC); lower is better.

bash experiments/feedback/enwik8.sh
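
Bits per character is the average per-character cross-entropy expressed in base 2. As a rough sketch, assuming a training loop reports the mean negative log-likelihood in nats (the usual convention for cross-entropy computed with natural logarithms), the conversion is a division by ln 2. The function names below are illustrative and do not come from this repo.

```python
import math


def nats_to_bpc(mean_nll_nats: float) -> float:
    """Convert a mean per-character loss in nats to bits per character."""
    return mean_nll_nats / math.log(2)


def bpc_to_nats(bpc: float) -> float:
    """Inverse conversion: bits per character back to nats."""
    return bpc * math.log(2)


if __name__ == "__main__":
    # Assumption: loss is measured in nats. The 0.962 test BPC from the
    # table above corresponds to this mean loss in nats:
    print(bpc_to_nats(0.962))
    # A loss of exactly ln(2) nats per character is one bit per character.
    print(nats_to_bpc(math.log(2)))
```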

Algorithmic

Model	3 Variable	5 Variable
Transformer	33.7	37.5
Feedback Transformer	99.1	92.6

Numbers are test accuracy (%).

bash experiments/feedback/algorithmic_3var.sh
bash experiments/feedback/algorithmic_5var.sh

Expire-Span

Introduced in "Not All Memories are Created Equal: Learning to Expire".

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Expire-Span 12L	38M	1.014	0.994

Numbers are bits per character (BPC); lower is better.

bash experiments/expire_span/enwik8.sh

Object Collision

Model	Maximum Span	Test Error (%)
Expire-Span	16k	52.2
Expire-Span	32k	36.7
Expire-Span	64k	26.7

bash experiments/expire_span/object_collision_16k.sh
bash experiments/expire_span/object_collision_32k.sh
bash experiments/expire_span/object_collision_64k.sh

License

The code is licensed under the CC-BY-NC license. See the LICENSE file for more details.

Owner

Facebook Research
