A simple, unofficial implementation of MAE using pytorch-lightning

Last update: Dec 03, 2022

Overview

Masked Autoencoders in PyTorch

A simple, unofficial implementation of MAE (Masked Autoencoders are Scalable Vision Learners) using pytorch-lightning.

It currently supports training on CUB and StanfordCars, and can be extended to other image datasets.

Setup

# Clone the repository
git clone https://github.com/catalys1/mae-pytorch.git
cd mae-pytorch

# Install required libraries (inside a virtual environment preferably)
pip install -r requirements.txt

# Set up .env for path to data
echo "DATADIR=/path/to/data" > .env
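The repository reads this file through python-dotenv (see Dependencies). As a rough illustration of what that step amounts to, here is a minimal standard-library stand-in, not the project's actual loading code. The helper name `read_env_file` is hypothetical, and the parser handles only plain `KEY=VALUE` lines.

```python
import tempfile
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Hypothetical helper: a simplified stand-in for what python-dotenv does.
    """
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demo against a throwaway file so the sketch never touches a real .env.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("DATADIR=/path/to/data\n")
    settings = read_env_file(env_path)

# The data root that the dataset code would then look under.
data_dir = Path(settings["DATADIR"])
print(data_dir)
```

In the real project, point `DATADIR` at the directory that holds the downloaded datasets.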

Usage

MAE training

Training options are provided through configuration files, handled by LightningCLI. See configs/ for examples.

Train an MAE model on the CUB dataset:

python train.py fit --config=configs/mae.yaml --config=configs/data/cub_mae.yaml

Using multiple GPUs:

python train.py fit --config=configs/mae.yaml --config=configs/data/cub_mae.yaml --config=configs/multigpu.yaml
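The commands above pass several `--config` files. With LightningCLI, these are applied in order, so values from later files take precedence over earlier ones. The sketch below mimics that layering with plain dictionaries. It assumes a simple recursive merge and is not LightningCLI's actual implementation. The config contents (`max_epochs`, `devices`, `strategy`, `batch_size` and their values) are hypothetical and are not taken from the files in configs/.

```python
import copy


def merge_configs(base, override):
    """Recursively merge `override` into a copy of `base`; override wins on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical stand-ins for mae.yaml, cub_mae.yaml, and multigpu.yaml.
model_cfg = {"trainer": {"max_epochs": 100, "devices": 1}}
data_cfg = {"data": {"batch_size": 64}}
multigpu_cfg = {"trainer": {"devices": 4, "strategy": "ddp"}}

# Apply the files left to right, as they appear on the command line.
final = {}
for cfg in (model_cfg, data_cfg, multigpu_cfg):
    final = merge_configs(final, cfg)

print(final["trainer"])  # {'max_epochs': 100, 'devices': 4, 'strategy': 'ddp'}
```

Under this layering, a multi-GPU file only needs to hold the settings that differ from the single-GPU run.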

Fine-tuning

Not yet implemented.

Implementation

The default model uses ViT-Base for the encoder, and a small ViT (depth=4, width=192) for the decoder. This is smaller than the model used in the paper.
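To give a sense of scale, the following back-of-the-envelope estimate compares transformer block weights for the encoder and decoder. It assumes standard ViT blocks with an MLP ratio of 4 and counts only the attention and MLP weight matrices (no biases, norms, or embeddings), so the numbers are approximate. ViT-Base is taken as depth 12, width 768. The paper's default decoder (depth 8, width 512) is included for comparison.

```python
def block_weight_count(depth, width, mlp_ratio=4):
    """Approximate weight count for a stack of transformer blocks.

    Per block: 4 * width^2 for attention (q, k, v, output projection)
    plus 2 * mlp_ratio * width^2 for the two MLP layers.
    Biases, layer norms, and embeddings are ignored.
    """
    per_block = 4 * width * width + 2 * mlp_ratio * width * width
    return depth * per_block


encoder = block_weight_count(depth=12, width=768)       # ViT-Base encoder
decoder = block_weight_count(depth=4, width=192)        # this repo's default decoder
paper_decoder = block_weight_count(depth=8, width=512)  # paper's default decoder

print(f"encoder:       {encoder:,}")        # encoder:       84,934,656
print(f"decoder:       {decoder:,}")        # decoder:       1,769,472
print(f"paper decoder: {paper_decoder:,}")  # paper decoder: 25,165,824
```

By this estimate the default decoder here is roughly 14 times smaller than the paper's default decoder, and accounts for about 2% of the block weights in the full model.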

Dependencies

Configuration and training are handled entirely by pytorch-lightning.
The MAE model uses the VisionTransformer from timm.
FGVC datasets are accessed through fgvcdata.
Environment variables are configured through python-dotenv.

Results

Image reconstructions of CUB validation set images after training with the following command:

python train.py fit --config=configs/mae.yaml --config=configs/data/cub_mae.yaml --config=configs/multigpu.yaml

Owner

Connor Anderson
