Bottleneck Transformers for Visual Recognition

Last update: Jan 03, 2023

Overview

Bottleneck Transformers for Visual Recognition

Experiments

Model	Params (M)	Acc (%)
ResNet50 baseline (ref)	23.5M	93.62
BoTNet-50	18.8M	95.11%
BoTNet-S1-50	18.8M	95.67%
BoTNet-S1-59	27.5M	95.98%
BoTNet-S1-77	44.9M	wip

Summary

Usage (example)

Model

from model import Model

model = ResNet50(num_classes=1000, resolution=(224, 224))
x = torch.randn([2, 3, 224, 224])
print(model(x).size())

Module

from model import MHSA

resolution = 14
mhsa = MHSA(planes, width=resolution, height=resolution)

Reference

Paper link
Author: Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani
Organization: UC Berkeley, Google Research

Owner

Myeongjun Kim

Computer Vision Research using Deep Learning

GitHub Repository

PCGNN - Procedural Content Generation with NEAT and Novelty

PCGNN - Procedural Content Generation with NEAT and Novelty Generation Approach — Metrics — Paper — Poster — Examples PCGNN - Procedural Content Gener

8 Dec 10, 2022

This repository contains code for the paper "Disentangling Label Distribution for Long-tailed Visual Recognition", published at CVPR' 2021

Disentangling Label Distribution for Long-tailed Visual Recognition (CVPR 2021) Arxiv link Blog post This codebase is built on Causal Norm. Install co

85 Oct 18, 2022

A collection of semantic image segmentation models implemented in TensorFlow

A collection of semantic image segmentation models implemented in TensorFlow. Contains data-loaders for the generic and medical benchmark datasets.

16 Dec 06, 2019

Implementation of UNET architecture for Image Segmentation.

Semantic Segmentation using UNET This is the implementation of UNET on Carvana Image Masking Kaggle Challenge About the Dataset This dataset contains

4 Dec 21, 2021

Categorizing comments on YouTube into different categories.

Youtube Comments Categorization This repo is for categorizing comments on a youtube video into different categories. negative (grievances, complaints,

5 Nov 26, 2022

The implementation of the CVPR2021 paper "Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes"

STAR-FC This code is the implementation for the CVPR 2021 paper "Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes" 🌟 🌟 . 🎓 Re

87 Dec 28, 2022

Multi-Stage Progressive Image Restoration

Multi-Stage Progressive Image Restoration Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, and Ling Sh

859 Dec 22, 2022

Learning To Have An Ear For Face Super-Resolution

Learning To Have An Ear For Face Super-Resolution [Project Page] This repository contains demo code of our CVPR2020 paper. Training and evaluation on

50 Nov 16, 2022

3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks

3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks Introduction This repository contains the code and models for the follo

124 Jan 06, 2023

Source code of article "Towards Toxic and Narcotic Medication Detection with Rotated Object Detector"

Towards Toxic and Narcotic Medication Detection with Rotated Object Detector Introduction This is the source code of article: Towards Toxic and Narcot

3 Oct 29, 2022

Implementation of MeMOT - Multi-Object Tracking with Memory - in Pytorch

MeMOT - Pytorch (wip) Implementation of MeMOT - Multi-Object Tracking with Memory - in Pytorch. This paper is just one in a line of work, but importan

15 May 09, 2022

This is an official implementation of the paper "Distance-aware Quantization", accepted to ICCV2021.

PyTorch implementation of DAQ This is an official implementation of the paper "Distance-aware Quantization", accepted to ICCV2021. For more informatio

36 Nov 04, 2022

This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | Project Page | Paper | PyTorch implementation for the paper "AD-NeRF: Audio

551 Dec 29, 2022

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning Tensorflow code and models for the paper: Large Scale Fine-Grained Categ

187 Oct 01, 2022

An LSTM for time-series classification

Update 10-April-2017 And now it works with Python3 and Tensorflow 1.1.0 Update 02-Jan-2017 I updated this repo. Now it works with Tensorflow 0.12. In

391 Dec 27, 2022

Extracting knowledge graphs from language models as a diagnostic benchmark of model performance.

Interpreting Language Models Through Knowledge Graph Extraction Idea: How do we interpret what a language model learns at various stages of training?

9 Oct 25, 2022

PyJokes - Joking around with Python library pyjokes

Hi, it's Muhaimin again 👋 This is something unorthodox but cool. Don't forget t

1 Feb 02, 2022

Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU

Cross-modal Retrieval using Transformer Encoder Reasoning Networks This project reimplements the idea from "Transformer Reasoning Network for Image-Te

5 Nov 05, 2022

Extending JAX with custom C++ and CUDA code

Extending JAX with custom C++ and CUDA code This repository is meant as a tutorial demonstrating the infrastructure required to provide custom ops in

237 Dec 23, 2022

Code for CVPR2021 paper "Robust Reflection Removal with Reflection-free Flash-only Cues"

Robust Reflection Removal with Reflection-free Flash-only Cues (RFC) Paper | To be released: Project Page | Video | Data Tensorflow implementation for

162 Jan 05, 2023

Bottleneck Transformers for Visual Recognition

Related tags

Overview

Bottleneck Transformers for Visual Recognition

Experiments

Summary

Usage (example)

Reference

Owner

Myeongjun Kim

PCGNN - Procedural Content Generation with NEAT and Novelty

This repository contains code for the paper "Disentangling Label Distribution for Long-tailed Visual Recognition", published at CVPR' 2021

A collection of semantic image segmentation models implemented in TensorFlow

Implementation of UNET architecture for Image Segmentation.

Categorizing comments on YouTube into different categories.

The implementation of the CVPR2021 paper "Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes"

Multi-Stage Progressive Image Restoration

Learning To Have An Ear For Face Super-Resolution

3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks

Source code of article "Towards Toxic and Narcotic Medication Detection with Rotated Object Detector"

Implementation of MeMOT - Multi-Object Tracking with Memory - in Pytorch

This is an official implementation of the paper "Distance-aware Quantization", accepted to ICCV2021.

This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

An LSTM for time-series classification

Extracting knowledge graphs from language models as a diagnostic benchmark of model performance.

PyJokes - Joking around with Python library pyjokes

Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU

Extending JAX with custom C++ and CUDA code

Code for CVPR2021 paper "Robust Reflection Removal with Reflection-free Flash-only Cues"