Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

Overview

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning

Tensorflow code and models for the paper:

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie
CVPR 2018

This repository contains code and pre-trained models used in the paper and 2 demos to demonstrate: 1) the importance of pre-training data on transfer learning; 2) how to calculate domain similarity between source domain and target domain.

Notice that we used a mini validation set (./inat_minival.txt) contains 9,697 images that are randomly selected from the original iNaturalist 2017 validation set. The rest of valdiation images were combined with the original training set to train our model in the paper. There are 665,473 training images in total.

Dependencies:

Preparation:

  • Clone the repo with recursive:
git clone --recursive https://github.com/richardaecn/cvpr18-inaturalist-transfer.git
  • Install dependencies. Please refer to TensorFlow, pyemd, scikit-learn and scikit-image official websites for installation guide.
  • Download data and feature and unzip them into the same directory as the cloned repo. You should have two folders './data' and './feature' in the repo's directory.

Datasets (optional):

In the paper, we used data from 9 publicly available datasets:

We provide a download link that includes the entire CUB-200-2011 dataset and data splits for the rest of 8 datasets. The provided link contains sufficient data for this repo. If you would like to use other 8 datasets, please download them from the official websites and put them in the corresponding subfolders under './data'.

Pre-trained Models (optional):

The models were trained using TensorFlow-Slim. We implemented Squeeze-and-Excitation Networks (SENet) under './slim'. The pre-trained models can be downloaded from the following links:

Network Pre-trained Data Input Size Download Link
Inception-V3 ImageNet 299 link
Inception-V3 iNat2017 299 link
Inception-V3 iNat2017 448 link
Inception-V3 iNat2017 299 -> 560 FT1 link
Inception-V3 ImageNet + iNat2017 299 link
Inception-V3 SE ImageNet + iNat2017 299 link
Inception-V4 iNat2017 448 link
Inception-V4 iNat2017 448 -> 560 FT2 link
Inception-ResNet-V2 ImageNet + iNat2017 299 link
Inception-ResNet-V2 SE ImageNet + iNat2017 299 link
ResNet-V2 50 ImageNet + iNat2017 299 link
ResNet-V2 101 ImageNet + iNat2017 299 link
ResNet-V2 152 ImageNet + iNat2017 299 link

1 This model was trained with 299 input size on train + 90% val and then fine-tuned with 560 input size on 90% val.

2 This model was trained with 448 input size on train + 90% val and then fine-tuned with 560 input size on 90% val.

TensorFlow Hub also provides a pre-trained Inception-V3 299 on iNat2017 original training set here.

Featrue Extraction (optional):

Run the following Python script to extract feature:

python feature_extraction.py

To run this script, you need to download the checkpoint of Inception-V3 299 trained on iNat2017. The dataset and pre-trained model can be modified in the script.

We provide a download link that includes features used in the domos of this repo.

Demos

  1. Linear logistic regression on extracted features:

This demo shows the importance of pre-training data on transfer learning. Based on features extracted from an Inception-V3 pre-trained on iNat2017, we are able to achieve 89.9% classification accuracy on CUB-200-2011 with the simple logistic regression, outperforming most state-of-the-art methods.

LinearClassifierDemo.ipynb
  1. Calculating domain similarity by Earth Mover's Distance (EMD): This demo gives an example to calculate the domain similarity proposed in the paper. Results correspond to part of the Fig. 5 in the original paper.
DomainSimilarityDemo.ipynb

Training and Evaluation

  • Convert dataset into '.tfrecord':
python convert_dataset.py --dataset_name=cub_200 --num_shards=10
  • Train (fine-tune) the model on 1 GPU:
CUDA_VISIBLE_DEVICES=0 ./train.sh
  • Evaluate the model on another GPU simultaneously:
CUDA_VISIBLE_DEVICES=1 ./eval.sh
  • Run Tensorboard for visualization:
tensorboard --logdir=./checkpoints/cub_200/ --port=6006

Citation

If you find our work helpful in your research, please cite it as:

@inproceedings{Cui2018iNatTransfer,
  title = {Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning},
  author = {Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie},
  booktitle={CVPR},
  year={2018}
}
Owner
Yin Cui
Research Scientist at Google
Yin Cui
GuideDog is an AI/ML-based mobile app designed to assist the lives of the visually impaired, 100% voice-controlled

Guidedog Authors: Kyuhee Jo, Steven Gunarso, Jacky Wang, Raghav Sharma GuideDog is an AI/ML-based mobile app designed to assist the lives of the visua

Kyuhee Jo 5 Nov 24, 2021
From Canonical Correlation Analysis to Self-supervised Graph Neural Networks

Code for CCA-SSG model proposed in the NeurIPS 2021 paper From Canonical Correlation Analysis to Self-supervised Graph Neural Networks.

Hengrui Zhang 44 Nov 27, 2022
Pull sensitive data from users on windows including discord tokens and chrome data.

⭐ For a 🍪 Pegasus Pull sensitive data from users on windows including discord tokens and chrome data. Features 🟩 Discord tokens 🟩 Geolocation data

Addi 44 Dec 31, 2022
The code repository for "RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection" (ACM MM'21)

RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection (ACM MM'21) By Zhuofan Zong, Qianggang Cao, Biao Leng Introduction F

TempleX 9 Jul 30, 2022
Tensors and Dynamic neural networks in Python with strong GPU acceleration

PyTorch is a Python package that provides two high-level features: Tensor computation (like NumPy) with strong GPU acceleration Deep neural networks b

61.4k Jan 04, 2023
Experiments for Operating Systems Lab (ETCS-352)

Operating Systems Lab (ETCS-352) Experiments for Operating Systems Lab (ETCS-352) performed by me in 2021 at uni. All codes are written by me except t

Deekshant Wadhwa 0 Sep 06, 2022
Repository for the paper "From global to local MDI variable importances for random forests and when they are Shapley values"

From global to local MDI variable importances for random forests and when they are Shapley values Antonio Sutera ( Antonio Sutera 3 Feb 23, 2022

Improving Object Detection by Label Assignment Distillation

Improving Object Detection by Label Assignment Distillation This is the official implementation of the WACV 2022 paper Improving Object Detection by L

Cybercore Co. Ltd 51 Dec 08, 2022
Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution

Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution Figure: Example visualization of the method and baseline as a

Oliver Hahn 16 Dec 23, 2022
Compositional and Parameter-Efficient Representations for Large Knowledge Graphs

NodePiece - Compositional and Parameter-Efficient Representations for Large Knowledge Graphs NodePiece is a "tokenizer" for reducing entity vocabulary

Michael Galkin 107 Jan 04, 2023
Udacity Suse Cloud Native Foundations Scholarship Course Walkthrough

SUSE Cloud Native Foundations Scholarship Udacity is collaborating with SUSE, a global leader in true open source solutions, to empower developers and

Shivansh Srivastava 34 Oct 18, 2022
Unbiased Learning To Rank Algorithms (ULTRA)

This is an Unbiased Learning To Rank Algorithms (ULTRA) toolbox, which provides a codebase for experiments and research on learning to rank with human annotated or noisy labels.

71 Dec 01, 2022
Training Structured Neural Networks Through Manifold Identification and Variance Reduction

Training Structured Neural Networks Through Manifold Identification and Variance Reduction This repository is a pytorch implementation of the Regulari

0 Dec 23, 2021
Official codebase for Pretrained Transformers as Universal Computation Engines.

universal-computation Overview Official codebase for Pretrained Transformers as Universal Computation Engines. Contains demo notebook and scripts to r

Kevin Lu 210 Dec 28, 2022
Code for Ditto: Building Digital Twins of Articulated Objects from Interaction

Ditto: Building Digital Twins of Articulated Objects from Interaction Zhenyu Jiang, Cheng-Chun Hsu, Yuke Zhu CVPR 2022, Oral Project | arxiv News 2022

UT Robot Perception and Learning Lab 78 Dec 22, 2022
Adaptive Prototype Learning and Allocation for Few-Shot Segmentation (CVPR 2021)

ASGNet The code is for the paper "Adaptive Prototype Learning and Allocation for Few-Shot Segmentation" (accepted to CVPR 2021) [arxiv] Overview data/

Gen Li 91 Dec 23, 2022
Transport Mode detection - can detect the mode of transport with the help of features such as acceeration,jerk etc

title emoji colorFrom colorTo sdk app_file pinned Transport_Mode_Detector 🚀 purple yellow gradio app.py false Configuration title: string Display tit

Nishant Rajadhyaksha 3 Jan 16, 2022
BLEURT is a metric for Natural Language Generation based on transfer learning.

BLEURT: a Transfer Learning-Based Metric for Natural Language Generation BLEURT is an evaluation metric for Natural Language Generation. It takes a pa

Google Research 492 Jan 05, 2023
Multiple-Object Tracking with Transformer

TransTrack: Multiple-Object Tracking with Transformer Introduction TransTrack: Multiple-Object Tracking with Transformer Models Training data Training

Peize Sun 537 Jan 04, 2023
A Real-ESRGAN equipped Colab notebook for CLIP Guided Diffusion

#360Diffusion automatically upscales your CLIP Guided Diffusion outputs using Real-ESRGAN. Latest Update: Alpha 1.61 [Main Branch] - 01/11/22 Layout a

78 Nov 02, 2022