Implementation of ConvMixer ("Patches Are All You Need?") in TensorFlow and Keras

Last update: Oct 03, 2022

Overview

Patches Are All You Need? - ConvMixer

ConvMixer is an extremely simple model that is similar in spirit to the ViT and the even more basic MLP-Mixer: it operates directly on patches as input, separates the mixing of spatial and channel dimensions, and maintains equal size and resolution throughout the network. In contrast, however, the ConvMixer uses only standard convolutions to achieve the mixing steps. Despite its simplicity, the paper's authors report that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.

Official GitHub Link: https://github.com/tmp-iclr/convmixer

Paper Link: https://openreview.net/pdf?id=TVHS5Y4dNvM

Note: The paper is under review for ICLR 2022.

Model Architecture
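
The structure described in the overview (patch embedding, then repeated blocks that mix spatial and channel dimensions separately at a constant resolution) can be sketched in Keras as follows. This is a minimal illustration, not the repository's code: the function names and all default hyperparameters (image size, width, depth, kernel size, patch size, number of classes) are assumptions chosen for the example, and the built-in Keras "gelu" activation is used here in place of the TensorFlow Addons one.

```python
import tensorflow as tf
from tensorflow.keras import layers


def activation_block(x):
    """GELU followed by batch normalization, applied after each convolution."""
    x = layers.Activation("gelu")(x)
    return layers.BatchNormalization()(x)


def conv_mixer_block(x, filters, kernel_size):
    """One ConvMixer block: depthwise (spatial) mixing, then pointwise (channel) mixing."""
    residual = x
    # Depthwise convolution mixes spatial locations within each channel.
    x = layers.DepthwiseConv2D(kernel_size=kernel_size, padding="same")(x)
    x = activation_block(x)
    # Residual connection around the spatial-mixing step.
    x = layers.Add()([x, residual])
    # Pointwise (1x1) convolution mixes channels at each spatial location.
    x = layers.Conv2D(filters, kernel_size=1)(x)
    return activation_block(x)


def build_conv_mixer(image_size=32, filters=256, depth=8,
                     kernel_size=5, patch_size=2, num_classes=10):
    """Build a small ConvMixer classifier (all defaults are illustrative assumptions)."""
    # Inputs are assumed to be already scaled to a suitable range.
    inputs = tf.keras.Input(shape=(image_size, image_size, 3))
    # Patch embedding: a convolution whose kernel size and stride equal the patch size.
    x = layers.Conv2D(filters, kernel_size=patch_size, strides=patch_size)(inputs)
    x = activation_block(x)
    # The feature map keeps the same size and channel count through every block.
    for _ in range(depth):
        x = conv_mixer_block(x, filters, kernel_size)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)


model = build_conv_mixer()
```

Calling model.summary() on the result shows that every block operates on a feature map of the same spatial size and width, which is the "equal size and resolution" property mentioned in the overview.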

Installation

pip install -q tensorflow-addons

Note: TensorFlow Addons is used for the AdamW optimizer and the GELU activation function.

Results

TensorBoard Link: https://tensorboard.dev/experiment/bkhqOz0RQ1Cv5dwrDQySMQ/

Note: Trained for 25 epochs, reaching a top-5 accuracy of 64.41%.
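
For readers unfamiliar with the metric, top-5 accuracy counts a prediction as correct when the true label is among the five highest-scoring classes. The sketch below shows the calculation in plain Python; the function name top_k_accuracy and the toy data are hypothetical, and this is not the repository's evaluation code.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example with three classes and two samples (made-up scores).
toy_scores = [[0.1, 0.5, 0.4], [0.7, 0.2, 0.1]]
toy_labels = [2, 0]
top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

In Keras, tf.keras.metrics.SparseTopKCategoricalAccuracy(k=5) computes the same quantity during training; whether the repository uses that metric class is not stated here.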

Future Work

Train for 150 epochs
Train the model on the ImageNet dataset

Citation

@inproceedings{
anonymous2022patches,
title={Patches Are All You Need?},
author={Anonymous},
booktitle={Submitted to The Tenth International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=TVHS5Y4dNvM},
note={under review}
}

License

MIT License

Copyright (c) 2021 Sayan Nath

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Releases (0.0.1)

0.0.1 (Oct 15, 2021)

Note: Trained for 25 epochs.
Source code (tar.gz)
Source code (zip)
convmixer-model.h5 (6.94 MB)
convmixer.zip (6.21 MB)
train-logs.csv (2.94 KB)

Owner

Sayan Nath