A Novel Plug-in Module for Fine-grained Visual Classification

Last update: Dec 20, 2022

paper url: https://arxiv.org/abs/2202.03822

We propose a novel plug-in module that can be integrated into many common backbones, both CNN-based and Transformer-based, to provide strongly discriminative regions. The plug-in module outputs pixel-level feature maps and fuses filtered features to enhance fine-grained visual classification. Experimental results show that the proposed plug-in module outperforms state-of-the-art approaches and significantly improves accuracy to 92.77% and 92.83% on CUB200-2011 and NABirds, respectively.
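The abstract does not say how features are filtered or fused. As a rough illustration only, the sketch below assumes one plausible reading: each spatial position (or token) gets its own class scores, the positions with the most confident prediction are kept, and the kept features are combined. The function names `select_top_k` and `fuse_mean` are hypothetical, and the averaging is a stand-in chosen for brevity, not the fusion used in the paper or in this repository.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_top_k(features, logits, k):
    """Keep the k positions whose class prediction is most confident.

    features: one feature vector per spatial position (or token).
    logits:   one class-score vector per position, in the same order.
    Returns the kept indices (most confident first) and their features.
    This selection rule is an assumption, not the repository's code.
    """
    if len(features) != len(logits):
        raise ValueError("features and logits must align")
    confidence = [max(softmax(row)) for row in logits]
    order = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)
    kept = order[:k]
    return kept, [features[i] for i in kept]


def fuse_mean(selected):
    """Hypothetical fusion step: average the selected feature vectors."""
    dim = len(selected[0])
    return [sum(vec[d] for vec in selected) / len(selected) for d in range(dim)]
```

With three positions and `k=2`, the position whose scores are uniform (least confident) is dropped and the remaining two feature vectors are averaged.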

1. Environment setting

Install the requirements.
Replace the timm/ folder with our timm/ folder (for ViT or Swin-T).

Prepare dataset

In this paper, we use two large bird datasets: CUB200-2011 and NABirds.

Our pretrained model

Download the pretrained model from this url: https://drive.google.com/drive/folders/1ivMJl4_EgE-EVU_5T8giQTwcNQ6RPtAo?usp=sharing

backup/ is the path to our pretrained models.
resnet50_miil_21k.pth and vit_base_patch16_224_miil_21k.pth are ImageNet-21K pretrained models (place these files under models/). Thanks to https://github.com/Alibaba-MIIL/ImageNet21K/blob/main/MODEL_ZOO.md for providing them.
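Before training, it may be worth confirming that the ImageNet-21K weight files are where the code expects them. The following is a minimal sketch, not part of the repository: the file names and the models/ folder come from the text above, while the helper name `missing_weights` is hypothetical.

```python
from pathlib import Path

# File names as listed in this README; they are expected under models/.
REQUIRED_WEIGHTS = (
    "resnet50_miil_21k.pth",
    "vit_base_patch16_224_miil_21k.pth",
)


def missing_weights(models_dir="models"):
    """Return the required weight files that are not present in models_dir."""
    root = Path(models_dir)
    return [name for name in REQUIRED_WEIGHTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing under models/:", ", ".join(missing))
    else:
        print("All ImageNet-21K weight files found.")
```

If only one backbone is used, the corresponding entry can be removed from `REQUIRED_WEIGHTS`.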

OS

Windows10
Ubuntu20.04
macOS

2. Train

configuration file: config.py

python train.py --train_root "./CUB200-2011/train/" --val_root "./CUB200-2011/test/"
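A quick sanity check of the two dataset roots can catch path mistakes before a long training run. The sketch below assumes one subfolder per class under each root, which this README does not confirm; `summarize_split` and `check_splits` are hypothetical helpers, not functions from the repository.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_split(root):
    """Map each class folder under root to its image count.

    Assumes (not confirmed by the README) one subfolder per class.
    """
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset root not found: {root}")
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def check_splits(train_root, val_root):
    """Return class names that appear in only one of the two splits."""
    train = summarize_split(train_root)
    val = summarize_split(val_root)
    return sorted(set(train) ^ set(val))
```

For example, `check_splits("./CUB200-2011/train/", "./CUB200-2011/test/")` returning an empty list would indicate that both splits contain the same class folders.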

3. Evaluation

configuration file: config_eval.py

python eval.py --pretrained_path "./backup/CUB200/best.pth" --val_root "./CUB200-2011/test/"
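The figures quoted in the abstract are classification accuracies. For reference, top-1 accuracy can be computed as in the sketch below; this is a generic illustration and is not taken from eval.py, and the function name `top1_accuracy` is hypothetical.

```python
def top1_accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class matches the label.

    logits: one list of class scores per sample.
    labels: one integer class index per sample.
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)
```

Multiplying the result by 100 gives a percentage comparable to the numbers reported above.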

4. Visualization

configuration file: config_plot.py

python plot_heat.py --pretrained_path "./backup/CUB200/best.pth" --img_path "./img/001.png"

Acknowledgment

Thanks to timm for the PyTorch implementation.
This work was financially supported by the National Taiwan Normal University (NTNU) within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan, sponsored by the Ministry of Science and Technology, Taiwan, R.O.C., under Grant nos. MOST 110-2221-E-003-026, 110-2634-F-003-007, and 110-2634-F-003-006. In addition, we thank the National Center for High-performance Computing (NCHC) for providing computational and storage resources.

Owner

ChouPoYung
