Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation

Introduction

WAKD is a PyTorch implementation for our ICPR-2022 paper "Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation".

Installation

We tested this repo with Python 3.8, PyTorch 1.9.0, and CUDA 10.2, but it should also run with other recent PyTorch versions (PyTorch >= 1.0.0).

python setup.py develop

Preparation

Datasets

We test our models on three geo-localization benchmarks: the Pittsburgh, Tokyo 24/7, and Tokyo Time Machine datasets. The three datasets can be downloaded here.

The dataset directory is organized as follows:

datasets/data
├── pitts
│   ├── raw
│   │   ├── pitts250k_test.mat
│   │   ├── pitts250k_train.mat
│   │   ├── pitts250k_val.mat
│   │   ├── pitts30k_test.mat
│   │   ├── pitts30k_train.mat
│   │   ├── pitts30k_val.mat
│   └── └── Pittsburgh
│           ├──images/
│           └──queries/
└── tokyo
    ├── raw
    │   ├── tokyo247
    │   │   ├──images/
    │   │   └──query/
    │   ├── tokyo247.mat
    │   ├── tokyoTM/images/
    │   ├── tokyoTM_train.mat
    └── └── tokyoTM_val.mat
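
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and it assumes that `Pittsburgh` sits under `pitts/raw`, as the indentation of the tree suggests.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
# Assumption: "Pittsburgh" is a subdirectory of pitts/raw.
EXPECTED = [
    "pitts/raw/pitts250k_test.mat",
    "pitts/raw/pitts250k_train.mat",
    "pitts/raw/pitts250k_val.mat",
    "pitts/raw/pitts30k_test.mat",
    "pitts/raw/pitts30k_train.mat",
    "pitts/raw/pitts30k_val.mat",
    "pitts/raw/Pittsburgh/images",
    "pitts/raw/Pittsburgh/queries",
    "tokyo/raw/tokyo247/images",
    "tokyo/raw/tokyo247/query",
    "tokyo/raw/tokyo247.mat",
    "tokyo/raw/tokyoTM/images",
    "tokyo/raw/tokyoTM_train.mat",
    "tokyo/raw/tokyoTM_val.mat",
]


def missing_entries(root="datasets/data"):
    """Return the expected files and directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_entries()` from the repository root returns an empty list when every expected file and directory is present.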

Pre-trained Weights

The pre-trained weights are stored in the following file tree:

logs
├── vgg16_pretrained.pth.tar # refer to (1)
├── mbv3_large.pth.tar
├── vgg16_pitts_64_desc_cen.hdf5 # refer to (2)
└── mobilenetv3_large_pitts_64_desc_cen.hdf5

(1) ImageNet-pretrained weights for the CNN backbone

These are either the ImageNet-pretrained weights for the CNN backbone or the pretrained weights for the whole model.

(2) Initial cluster centers for the VLAD layer

Note that the VLAD layer cannot work with random initialization. Use either the original cluster centers provided by NetVLAD or cluster centers you compute yourself by running scripts/cluster.sh:

./scripts/cluster.sh mobilenetv3_large
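
To illustrate what "cluster centers" means here, the sketch below runs plain k-means (Lloyd's algorithm) on toy 2-D points. It is an illustration only and assumes the centers are obtained by k-means over local descriptors, as in the original NetVLAD; it is not the code that scripts/cluster.sh runs, and the function name `kmeans` and the toy data are hypothetical.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length float tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data itself
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[dists.index(min(dists))].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = []
        for c, g in zip(centers, groups):
            if g:
                new_centers.append(tuple(sum(col) / len(g) for col in zip(*g)))
            else:
                new_centers.append(c)  # an empty cluster keeps its old center
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers


# Toy "descriptors": two well-separated groups of 2-D points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers = sorted(kmeans(toy_points, k=2))
```

In a real setup the points would be high-dimensional local descriptors extracted by the backbone, and the number of clusters would match the VLAD layer (the file names above suggest 64).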

Training

Train by running the script in a terminal. Script location: scripts/train_wakd_st.sh

Format:

bash scripts/train_wakd_st.sh arch archT

where arch is the backbone name, such as mobilenetv3_large, and archT is the teacher backbone name, such as vgg16.

For example:

bash scripts/train_wakd_st.sh mobilenetv3_large vgg16

In train_wakd_st.sh, to speed up training, increase GPUS to use more GPUs, or increase --tuple-size to process more tuples on each GPU. If your GPU does not have enough memory, reduce --pos-num or --neg-num to use fewer positives or negatives in each tuple.
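
The memory trade-off between these options can be estimated with simple arithmetic. The sketch below assumes that each tuple consists of one query image plus --pos-num positives and --neg-num negatives, and that --tuple-size counts tuples per GPU; the function name and the example values are hypothetical, not the defaults of train_wakd_st.sh.

```python
def images_per_iteration(tuple_size, pos_num, neg_num, gpus=1):
    """Estimate how many images are processed in one training iteration.

    Assumes one tuple = 1 query + pos_num positives + neg_num negatives,
    and that tuple_size is the number of tuples on each GPU.
    Returns (images per GPU, images across all GPUs).
    """
    per_tuple = 1 + pos_num + neg_num
    per_gpu = tuple_size * per_tuple
    return per_gpu, per_gpu * gpus


# Hypothetical settings: 4 tuples per GPU, 1 positive, 10 negatives, 4 GPUs.
per_gpu, total = images_per_iteration(tuple_size=4, pos_num=1, neg_num=10, gpus=4)
```

Under these assumptions, halving --neg-num from 10 to 5 reduces the per-GPU image count from 48 to 28, which is why lowering it relieves memory pressure.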

Testing

Test by running the script in a terminal. Script location: scripts/test.sh

Format:

bash scripts/test.sh resume arch dataset scale

where resume is the path to the trained model, arch is the backbone name (such as vgg16, mobilenetv3_large, or resnet152), and dataset and scale select the benchmark, such as pitts 30k or pitts 250k.

For example:

  1. Test mobilenetv3_large on pitts 250k:
bash scripts/test.sh logs/netVLAD/pitts30k-mobilenetv3_large/model_best.pth.tar mobilenetv3_large pitts 250k
  2. Test vgg16 on tokyo:
bash scripts/test.sh logs/netVLAD/pitts30k-vgg16/model_best.pth.tar vgg16 tokyo

In test.sh, to speed up testing, increase GPUS to use more GPUs, or increase --test-batch-size to use a larger batch on each GPU. If your GPU does not have enough memory, reduce --test-batch-size.
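
When several checkpoints need to be evaluated, the test command can be assembled programmatically. This is a minimal sketch that follows the argument order given in the format above (resume, arch, dataset, scale); the helper `build_test_command` is hypothetical and not part of the repository, and scale is treated as optional because the tokyo example omits it.

```python
import shlex


def build_test_command(resume, arch, dataset, scale=None, script="scripts/test.sh"):
    """Build the argument list for scripts/test.sh in the documented order."""
    args = ["bash", script, resume, arch, dataset]
    if scale is not None:
        args.append(scale)  # e.g. "30k" or "250k" for pitts
    return args


cmd = build_test_command(
    "logs/netVLAD/pitts30k-mobilenetv3_large/model_best.pth.tar",
    "mobilenetv3_large",
    "pitts",
    "250k",
)
command_line = shlex.join(cmd)  # shell-safe string form of the command
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the evaluation.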

Acknowledgements

We are truly thankful for the following two prior works. In particular, part of the code is inspired by [pytorch-NetVlad].

  • NetVLAD: CNN architecture for weakly supervised place recognition (CVPR'16) [paper] [pytorch-NetVlad]
  • SARE: Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization (ICCV'19) [paper] [deepIBL]