Segformer - Pytorch

Implementation of SegFormer, an Attention + MLP neural network for semantic segmentation, in PyTorch.

Install

$ pip install segformer-pytorch

Usage

For example, MiT-B0:

import torch
from segformer_pytorch import Segformer

model = Segformer(
    patch_size = 4,                 # patch size
    dims = (32, 64, 160, 256),      # dimensions of each stage
    heads = (1, 2, 5, 8),           # heads of each stage
    ff_expansion = (8, 8, 4, 4),    # feedforward expansion factor of each stage
    reduction_ratio = (8, 4, 2, 1), # reduction ratio of each stage for efficient attention
    num_layers = 2,                 # num layers of each stage
    decoder_dim = 256,              # decoder dimension
    num_classes = 4                 # number of segmentation classes
)

x = torch.randn(1, 3, 256, 256)
pred = model(x) # (1, 4, 64, 64) - segmentation logits at (H/4, W/4) resolution, one channel per class
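
The prediction comes out at a quarter of the input resolution. If full-resolution masks are needed, one common approach (and what the paper does at inference) is to bilinearly upsample the logits; a minimal sketch, reusing x and pred from above:

import torch.nn.functional as F

# upsample the (H/4, W/4) logits back to the input resolution
pred_full = F.interpolate(pred, size = x.shape[-2:], mode = 'bilinear', align_corners = False)
# pred_full: (1, 4, 256, 256)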

Make sure each per-stage keyword argument is a tuple of at most 4 values, as this repository hard-codes the MiT backbone to 4 stages, as done in the paper.

Citations

@misc{xie2021segformer,
    title   = {SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers}, 
    author  = {Enze Xie and Wenhai Wang and Zhiding Yu and Anima Anandkumar and Jose M. Alvarez and Ping Luo},
    year    = {2021},
    eprint  = {2105.15203},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}
Comments
  • Something is wrong with your implementation.

    Hello!

    First of all, I really like the repo. The implementation is clean and so much easier to understand than the official repo. But after doing some digging, I realized that the number of parameters and layers (especially conv2d) is quite different from the official implementation. This is the case for all variants I have tested (B0 and B5).

    Check out the README in my repo here, and you'll see what I mean. I also included images of the execution graphs of the two different implementations in the 'src' folder, which could help to debug.

    I don't quite have time to dig into the source of the problem, but I just thought I'd share my observations with you.
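
    For anyone who wants to reproduce the comparison, a quick sanity check is to count trainable parameters (here with the MiT-B0 configuration from the Usage section above):

    import torch
    from segformer_pytorch import Segformer

    model = Segformer(
        patch_size = 4,
        dims = (32, 64, 160, 256),
        heads = (1, 2, 5, 8),
        ff_expansion = (8, 8, 4, 4),
        reduction_ratio = (8, 4, 2, 1),
        num_layers = 2,
        decoder_dim = 256,
        num_classes = 4
    )

    # total trainable parameters, for comparison against the official repo
    print(sum(p.numel() for p in model.parameters() if p.requires_grad))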

    opened by camlaedtke 0
  • Model weights + model output HxW

    Hi,

    Could you please add the model weights so we can start training from them?

    Also, why did you choose to train the models with an output of size (H/4, W/4) rather than the original (H, W) size?

    Great job on the paper, very interesting :)

    opened by isega24 2
  • The model configurations for all the SegFormer B0 ~ B5

    Hello, how are you? Thanks for contributing to this project. Is the MiT-B0 model configuration in the README correct? I ask because the total number of params for the model is 36M. Could you provide all the model configurations for SegFormer B0 ~ B5?
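
    For reference, a hedged sketch of what an MiT-B5-style configuration might look like with this repo's constructor. The dims (64, 128, 320, 512), per-stage depths (3, 6, 40, 3), and decoder width 768 follow Table 1 of the paper; the remaining values are simply carried over from the B0 example above and may not match the official code:

    import torch
    from segformer_pytorch import Segformer

    model = Segformer(
        patch_size = 4,
        dims = (64, 128, 320, 512),     # from Table 1 of the paper
        heads = (1, 2, 5, 8),
        ff_expansion = (8, 8, 4, 4),    # assumed, copied from the B0 example
        reduction_ratio = (8, 4, 2, 1),
        num_layers = (3, 6, 40, 3),     # per-stage depths from Table 1
        decoder_dim = 768,              # from the paper
        num_classes = 4
    )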

    opened by rose-jinyang 5
  • a question about kv reshape in Efficient Self-Attention

    Thanks for sharing your work; your code is so elegant and it inspired me a lot. Here is a question about the implementation of Efficient Self-Attention.

    It seems you use a "mean op" to reshape k and v, while the official implementation uses a (learnable) linear mapping to reshape k and v.

    May I ask whether this difference significantly matters in your experiments?

    In your code:

    k, v = map(lambda t: reduce(t, 'b c (h r1) (w r2) -> b c h w', 'mean', r1 = r, r2 = r), (k, v))
    

    The original implementation uses:

    self.kv = nn.Linear(dim, dim * 2, bias=qkv_bias)                      # learned projection producing both k and v
    self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)  # strided conv: spatial reduction by sr_ratio
    self.norm = nn.LayerNorm(dim)
    
    x_ = x.permute(0, 2, 1).reshape(B, C, H, W)          # (B, N, C) -> (B, C, H, W)
    x_ = self.sr(x_).reshape(B, C, -1).permute(0, 2, 1)  # downsample, flatten back to (B, N', C)
    x_ = self.norm(x_)
    kv = self.kv(x_).reshape(B, -1, 2, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
    k, v = kv[0], kv[1]
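
    For illustration, a minimal sketch (shapes only; the conv here is randomly initialized) showing that both reductions downsample the key/value map by the same factor, the difference being that only the conv has learnable parameters:

    import torch
    from torch import nn
    from einops import reduce

    B, C, H, W, r = 1, 64, 32, 32, 8
    x = torch.randn(B, C, H, W)

    # this repo: parameter-free mean pooling over r x r windows
    k_mean = reduce(x, 'b c (h r1) (w r2) -> b c h w', 'mean', r1 = r, r2 = r)

    # official style: learned strided convolution with the same downsampling factor
    sr = nn.Conv2d(C, C, kernel_size = r, stride = r)
    k_conv = sr(x)

    print(k_mean.shape, k_conv.shape) # both torch.Size([1, 64, 4, 4])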
    
    opened by masszhou 1
Releases(0.0.6)

Owner: Phil Wang (Working with Attention)