PyTorch implementation of SIFT descriptor

Last update: Dec 24, 2022

Overview

This is a differentiable PyTorch implementation of the SIFT patch descriptor. It is very slow when describing a single patch, but quite fast on batches. It can be used for descriptor-based learning of the shape of affine features.

UPD 08/2019: pytorch-sift has been added to kornia and is available as kornia.features.SIFTDescriptor

There are different implementations of SIFT on the web. I tried to match Michal Perdoch's implementation, which gives high-quality features for image retrieval (CVPR 2009). However, on planar datasets it is inferior to the vlfeat implementation. The main difference is in the parameters of the Gaussian weighting window, so I have made a vlfeat-like version too. The MP version weights the patch center much more (see image below, left) and additionally crops everything outside the circular region. The vlfeat version is shown on the right.
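To make the difference between the two weighting windows concrete, here is a minimal standard-library sketch of a plain Gaussian mask and a circularly cropped one. It is an illustration only, not the code used in this repository: the function name `gaussian_mask` and both sigma values are hypothetical, chosen just to contrast a narrow window with a wide one.

```python
import math

def gaussian_mask(patch_size, sigma, circular=False):
    """Build a patch_size x patch_size grid of Gaussian weights centered on the patch.

    If circular is True, weights outside the inscribed circle are set to zero.
    """
    center = (patch_size - 1) / 2.0
    radius_sq = (patch_size / 2.0) ** 2
    mask = []
    for y in range(patch_size):
        row = []
        for x in range(patch_size):
            dist_sq = (x - center) ** 2 + (y - center) ** 2
            weight = math.exp(-dist_sq / (2.0 * sigma ** 2))
            if circular and dist_sq >= radius_sq:
                weight = 0.0  # crop everything outside the circular region
            row.append(weight)
        mask.append(row)
    return mask

# Hypothetical sigmas (NOT the values used by SIFTNet): a narrow, cropped
# window in the spirit of the MP mode and a wide, uncropped one in the
# spirit of the vlfeat mode.
narrow_circular = gaussian_mask(65, sigma=10.0, circular=True)
wide_plain = gaussian_mask(65, sigma=30.0, circular=False)
```

With these assumed values the narrow mask drops to zero in the patch corners and gives border pixels far less weight than the wide mask does, which is the qualitative behaviour described above.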

descriptor_mp_mode = SIFTNet(patch_size=65,
                             sigma_type='hesamp',
                             masktype='CircularGauss')

descriptor_vlfeat_mode = SIFTNet(patch_size=65,
                                 sigma_type='vlfeat',
                                 masktype='Gauss')

Results:

mAP by implementation:

Implementation              Easy      Hard      Tough      mean
----------------------  --------  --------  ---------  --------
OPENCV-SIFT             0.47788   0.20997   0.0967711  0.26154
VLFeat-SIFT             0.466584  0.203966  0.0935743  0.254708
PYTORCH-SIFT-VLFEAT-65  0.472563  0.202458  0.0910371  0.255353
NUMPY-SIFT-VLFEAT-65    0.449431  0.197918  0.0905395  0.245963
PYTORCH-SIFT-MP-65      0.430887  0.184834  0.0832707  0.232997
NUMPY-SIFT-MP-65        0.417296  0.18114   0.0820582  0.226832

Speed:

0.00246 s per 65x65 patch - numpy SIFT
0.00028 s per 65x65 patch - C++ SIFT
0.00074 s per 65x65 patch - CPU, 256 patches per batch
0.00038 s per 65x65 patch - GPU (GM940, mobile), 256 patches per batch
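The per-patch timings are easier to compare as throughput. The short sketch below only does arithmetic on the numbers listed above; the dictionary keys are shortened labels introduced here for readability.

```python
# Seconds per 65x65 patch, copied from the timings listed above.
timings = {
    "numpy SIFT": 0.00246,
    "C++ SIFT": 0.00028,
    "PyTorch CPU, batch of 256": 0.00074,
    "PyTorch GPU (GM940), batch of 256": 0.00038,
}

baseline = timings["numpy SIFT"]

# Patches per second and speedup relative to the numpy implementation.
throughput = {name: 1.0 / sec for name, sec in timings.items()}
speedup = {name: baseline / sec for name, sec in timings.items()}

for name in timings:
    print(f"{name:35s} {throughput[name]:8.0f} patches/s  {speedup[name]:5.1f}x vs numpy")
```

By these figures, the batched GPU path is roughly 6.5 times faster per patch than the numpy version, while the C++ implementation remains the fastest.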

If you use this code for academic purposes, please cite the following paper:

@InProceedings{AffNet2018,
    title = {Repeatability Is Not Enough: Learning Affine Regions via Discriminability},
    author = {Dmytro Mishkin and Filip Radenovic and Jiri Matas},
    booktitle = {Proceedings of ECCV},
    year = 2018,
    month = sep
}

Owner

Dmytro Mishkin
