MLP-Mixer: An all-MLP Architecture for Vision

This repo contains a PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision.

Usage :

import importlib

import torch

# "mlp-mixer" contains a hyphen, so a plain `from mlp-mixer import ...` is a
# syntax error. Load the module by name instead (assumes the file is mlp-mixer.py).
MLPMixer = importlib.import_module("mlp-mixer").MLPMixer

# Dummy batch: one RGB image of 224 x 224 pixels.
img = torch.ones([1, 3, 224, 224])

model = MLPMixer(in_channels=3, image_size=224, patch_size=16, num_classes=1000,
                 dim=512, depth=8, token_dim=256, channel_dim=2048)

# Count trainable parameters, reported in millions.
parameters = sum(p.numel() for p in model.parameters() if p.requires_grad) / 1_000_000
print('Trainable Parameters: %.3fM' % parameters)

out_img = model(img)

print("Shape of out :", out_img.shape)  # expected [B, num_classes] for a classifier
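The image_size and patch_size arguments above determine how many patches (tokens) the model mixes over. The following is a minimal sketch of that arithmetic, assuming the usual non-overlapping square patches described in the paper; patch_grid is a hypothetical helper for illustration and is not part of this repo.

```python
def patch_grid(image_size: int, patch_size: int, in_channels: int):
    """Return (number of patches, raw values per flattened patch).

    Hypothetical helper: assumes a square image split into
    non-overlapping square patches.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size                       # patches along one edge
    num_patches = side * side                             # tokens per image
    patch_values = in_channels * patch_size * patch_size  # values in one patch
    return num_patches, patch_values


num_patches, patch_values = patch_grid(image_size=224, patch_size=16, in_channels=3)
print(num_patches, patch_values)  # 196 768
```

With the settings used above, each image becomes 196 tokens of 768 raw values, which the model would then project to dim=512 features per token.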

Citation :

@misc{tolstikhin2021mlpmixer,
      title={MLP-Mixer: An all-MLP Architecture for Vision}, 
      author={Ilya Tolstikhin and Neil Houlsby and Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Thomas Unterthiner and Jessica Yung and Daniel Keysers and Jakob Uszkoreit and Mario Lucic and Alexey Dosovitskiy},
      year={2021},
      eprint={2105.01601},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement :

Some components are borrowed from the ViT code in @lucidrains's repo: https://github.com/lucidrains/vit-pytorch

Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision

Owner

Rishikesh (ऋषिकेश)
