Alignment Attention Fusion framework for Few-Shot Object Detection

Overview

AAF framework

Framework generalities

This repository contains the code of the AAF framework proposed in this paper. The main idea behind this work is to provide a flexible framework for implementing various attention mechanisms for Few-Shot Object Detection. The framework is composed of three modules, Spatial Alignment, Global Attention and Fusion Layer, which are applied successively to combine features from query and support images.
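The successive application of the modules can be pictured with the small sketch below. It is not the repository's code: `run_aaf`, `make_stage` and `identity_fusion` are hypothetical names introduced for illustration, and features are kept as opaque Python objects instead of tensors. The `align_first` flag mirrors the ALIGN_FIRST option of the config file.

```python
from typing import Any, Callable, Tuple

# Hypothetical stage signature: takes and returns (query, support) features.
Stage = Callable[[Any, Any], Tuple[Any, Any]]


def run_aaf(query: Any, support: Any, alignment: Stage, attention: Stage,
            fusion: Callable[[Any, Any], Any], align_first: bool = True) -> Any:
    """Apply alignment and attention in the configured order, then fuse."""
    stages = [alignment, attention] if align_first else [attention, alignment]
    for stage in stages:
        query, support = stage(query, support)
    return fusion(query, support)


def make_stage(name: str, log: list) -> Stage:
    """Build a pass-through stage that records when it is called."""
    def stage(query: Any, support: Any) -> Tuple[Any, Any]:
        log.append(name)
        return query, support
    return stage


def identity_fusion(query: Any, support: Any) -> Any:
    """Keep only the (adapted) query features."""
    return query


log = []
run_aaf("q", "s", make_stage("alignment", log), make_stage("attention", log),
        identity_fusion, align_first=False)
print(log)  # attention runs before alignment when align_first is False
```

Swapping the two first stages is the only effect of the flag in this sketch; the fusion step always comes last.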

The inputs of the framework are:

  • query_features List[Tensor(B, C, H, W)]: Query features at different levels. For each level, the features are of shape Batch x Channels x Height x Width.
  • support_features List[Tensor(N, C, H', W')]: Support features at different levels. The first dimension corresponds to the number of support images, grouped by class: N = N_WAY * K_SHOT.
  • support_targets List[BoxList]: Bounding boxes of the objects in each support image.
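The shape conventions above can be checked with a few lines of plain Python. This is only a sketch working on shape tuples: `check_aaf_inputs` and `support_class_index` are hypothetical helpers, and two points are assumptions not stated in the list, namely that query and support have the same number of levels and that support images of one class are stored contiguously.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int, int]


def check_aaf_inputs(query_shapes: List[Shape], support_shapes: List[Shape],
                     n_way: int, k_shot: int) -> None:
    """Raise ValueError if the shapes break the conventions listed above."""
    # Assumption: one support tensor per query feature level.
    if len(query_shapes) != len(support_shapes):
        raise ValueError("query and support must have the same number of levels")
    for level, (q, s) in enumerate(zip(query_shapes, support_shapes)):
        _, c_query, _, _ = q          # (B, C, H, W)
        n, c_support, _, _ = s        # (N, C, H', W')
        if n != n_way * k_shot:
            raise ValueError(f"level {level}: N={n} but N_WAY * K_SHOT={n_way * k_shot}")
        if c_query != c_support:
            raise ValueError(f"level {level}: channel mismatch {c_query} != {c_support}")


def support_class_index(image_index: int, k_shot: int) -> int:
    """Class of a support image, assuming contiguous grouping by class."""
    return image_index // k_shot


# Two feature levels, batch of 2 queries, 5-way 3-shot support set.
check_aaf_inputs([(2, 256, 64, 64), (2, 256, 32, 32)],
                 [(15, 256, 16, 16), (15, 256, 8, 8)], n_way=5, k_shot=3)
```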

The framework can be configured using a separate config file. Examples of such files are available under /config_files/aaf_framework/. The structure of these files is simple:

ALIGN_FIRST: #True/False Run Alignment before Attention when True
OUT_CH: # Number of features output by the fusion layer
ALIGNMENT:
    MODE: # Name of the alignment module selected
ATTENTION:
    MODE: # Name of the attention module selected
FUSION:
    MODE: # Name of the fusion module selected

| File name | Method | Alignment | Attention | Fusion |
|---|---|---|---|---|
| identity.yaml | Identity | IDENTITY | IDENTITY | IDENTITY |
| feature_reweighting.yaml | FSOD via feature reweighting | IDENTITY | REWEIGHTING_BATCH | IDENTITY |
| meta_faster_rcnn.yaml | Meta Faster-RCNN | SIMILARITY_ALIGN | META_FASTER | META_FASTER |
| self_adapt.yaml | Self-adaptive attention for FSOD | IDENTITY_NO_REPEAT | GRU | IDENTITY |
| dynamic.yaml | Dynamic relevance learning | IDENTITY | INTERPOLATE | DYNAMIC_R |
| dana.yaml | Dual Awareness Attention for FSOD | CISA | BGA | HADAMARD |

The path to the AAF config file should be specified inside the master config file (i.e. for the whole network) under FEWSHOT.AAF.CFG.
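As an illustration of the config structure, the sketch below checks a config dictionary against the module names listed in the tables of the following sections. `validate_aaf_config` is a hypothetical helper, not part of the repository, and the ALIGN_FIRST and OUT_CH values in the example are placeholders, since the README does not give the values used by dana.yaml.

```python
# Mode names copied from the Spatial Alignment, Global Attention and
# Fusion Layer tables of this README.
AVAILABLE_MODES = {
    "ALIGNMENT": {"IDENTITY", "IDENTITY_NO_REPEAT", "SIMILARITY_ALIGN", "CISA"},
    "ATTENTION": {"IDENTITY", "REWEIGHTING", "REWEIGHTING_BATCH", "SELF_ATTENTION",
                  "BGA", "META_FASTER", "POOLING", "INTERPOLATE", "GRU"},
    "FUSION": {"IDENTITY", "ADD", "HADAMARD", "SUBSTRACT", "CONCAT",
               "META_FASTER", "DYNAMIC_R"},
}


def validate_aaf_config(cfg: dict) -> list:
    """Return a list of problems found in cfg; empty if it looks valid."""
    problems = []
    if not isinstance(cfg.get("ALIGN_FIRST"), bool):
        problems.append("ALIGN_FIRST must be True or False")
    out_ch = cfg.get("OUT_CH")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(out_ch, int) or isinstance(out_ch, bool) or out_ch <= 0:
        problems.append("OUT_CH must be a positive integer")
    for section, modes in AVAILABLE_MODES.items():
        section_cfg = cfg.get(section)
        mode = section_cfg.get("MODE") if isinstance(section_cfg, dict) else None
        if mode not in modes:
            problems.append(f"{section}.MODE={mode!r} is not one of {sorted(modes)}")
    return problems


# Module names follow the dana.yaml row; ALIGN_FIRST and OUT_CH are placeholders.
dana_like = {
    "ALIGN_FIRST": True,
    "OUT_CH": 256,
    "ALIGNMENT": {"MODE": "CISA"},
    "ATTENTION": {"MODE": "BGA"},
    "FUSION": {"MODE": "HADAMARD"},
}
print(validate_aaf_config(dana_like))
```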

For each module, classes implementing the available choices are regrouped under a single file: /modelling/aaf/alignment.py, /modelling/aaf/attention.py and /modelling/aaf/fusion.py.

Spatial Alignment

Spatial Alignment spatially reorganizes the features of one feature map to match those of another. The idea is to align similar features in both maps so that they are easier to compare.

| Name | Description |
|---|---|
| IDENTITY | Repeats the feature to match BNCHW and NBCHW dimensions |
| IDENTITY_NO_REPEAT | Identity without repetition |
| SIMILARITY_ALIGN | Compute similarity matrix between support and query and align support to query accordingly. |
| CISA | CISA block from this method |
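To make the idea behind SIMILARITY_ALIGN concrete, here is a minimal sketch in plain Python. It assumes a dot-product similarity followed by a softmax over support positions, which is one common choice; the module in /modelling/aaf/alignment.py may use a different similarity or normalization. Feature maps are flattened to lists of position vectors, and `similarity_align` is a hypothetical name.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def similarity_align(query: List[Vector], support: List[Vector]) -> List[Vector]:
    """Align support features to the query layout.

    query:   H*W position vectors of length C
    support: H'*W' position vectors of length C
    returns: H*W vectors, each a similarity-weighted mix of support vectors
    """
    aligned = []
    for q in query:
        # One row of the similarity matrix: this query position vs. all support positions.
        scores = [sum(a * b for a, b in zip(q, s)) for s in support]
        weights = softmax(scores)
        aligned.append([sum(w * s[c] for w, s in zip(weights, support))
                        for c in range(len(q))])
    return aligned


query = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]   # 3 query positions, C = 2
support = [[10.0, 0.0], [0.0, 10.0]]           # 2 support positions, C = 2
aligned = similarity_align(query, support)
print(len(aligned), len(aligned[0]))  # the aligned support now has the query's layout
```

After alignment the support map has as many positions as the query map, which is what makes the point-wise fusion operations applicable.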

Global Attention

Global Attention highlights some features of a map according to an attention vector computed globally on another one. The idea is to leverage global, and hopefully semantic, information.

| Name | Description |
|---|---|
| IDENTITY | Simply pass features to next modules. |
| REWEIGHTING | Reweights query features using globally pooled vectors from support. |
| REWEIGHTING_BATCH | Same as above but support examples are the same for the whole batch. |
| SELF_ATTENTION | Same as above but attention vectors are computed from the alignment matrix between query and support. |
| BGA | BGA blocks from this method |
| META_FASTER | Attention block from this method |
| POOLING | Pools query and support features to the same size. |
| INTERPOLATE | Upsamples support features to match query size. |
| GRU | Computes attention vectors through a graph representation using a GRU. |
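The REWEIGHTING idea can be sketched as follows, under simplifying assumptions: the support map is reduced to one value per channel by plain global average pooling, and each query channel is scaled by that value. The repository's module may add learned layers or a different pooling; `global_average_pool` and `reweight` are hypothetical names, and feature maps are nested lists indexed as [C][H][W].

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [C][H][W]


def global_average_pool(feature_map: FeatureMap) -> List[float]:
    """Reduce each channel to its spatial mean."""
    return [sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
            for channel in feature_map]


def reweight(query_map: FeatureMap, support_map: FeatureMap) -> FeatureMap:
    """Scale every query channel by the pooled value of the same support channel."""
    weights = global_average_pool(support_map)
    return [[[value * w for value in row] for row in channel]
            for channel, w in zip(query_map, weights)]


support = [[[1.0, 3.0], [1.0, 3.0]],   # channel 0, mean 2.0
           [[0.0, 0.0], [0.0, 0.0]]]   # channel 1, mean 0.0
query = [[[1.0, 2.0]],                 # spatial size may differ from the support
         [[5.0, 6.0]]]
print(reweight(query, support))
```

Because the support is pooled to a vector, query and support maps do not need the same spatial size here, unlike the point-wise fusion operations.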

Fusion Layer

The Fusion Layer directly combines the features from support and query. These maps must have the same dimensions for point-wise operations, hence fusion is often used together with alignment.

| Name | Description |
|---|---|
| IDENTITY | Returns only adapted query features. |
| ADD | Point-wise sum between query and support features. |
| HADAMARD | Point-wise multiplication between query and support features. |
| SUBSTRACT | Point-wise subtraction between query and support features. |
| CONCAT | Channel concatenation of query and support features. |
| META_FASTER | Fusion layer from this method |
| DYNAMIC_R | Fusion layer from this method |
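The simple fusion modes of the table can be sketched in a few lines of plain Python. This is an illustration, not the code of /modelling/aaf/fusion.py: `fuse` is a hypothetical function, feature maps are nested lists indexed as [C][H][W], the operand order of SUBSTRACT (query minus support) is an assumption, and META_FASTER and DYNAMIC_R are left out because their definitions are not given here.

```python
import operator
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # indexed as [C][H][W]

POINTWISE_OPS = {
    "ADD": operator.add,
    "HADAMARD": operator.mul,
    "SUBSTRACT": operator.sub,  # assumption: query - support
}


def shape(feature_map: FeatureMap) -> Tuple[int, int, int]:
    return len(feature_map), len(feature_map[0]), len(feature_map[0][0])


def fuse(query_map: FeatureMap, support_map: FeatureMap, mode: str) -> FeatureMap:
    """Combine query and support maps according to the selected mode."""
    if mode == "IDENTITY":
        return query_map
    if mode == "CONCAT":
        # Channels are stacked, so only the spatial sizes have to agree.
        if shape(query_map)[1:] != shape(support_map)[1:]:
            raise ValueError("CONCAT needs identical spatial sizes")
        return query_map + support_map
    if mode not in POINTWISE_OPS:
        raise ValueError(f"mode {mode!r} is not covered by this sketch")
    if shape(query_map) != shape(support_map):
        raise ValueError("point-wise fusion needs identical shapes")
    op = POINTWISE_OPS[mode]
    return [[[op(q, s) for q, s in zip(q_row, s_row)]
             for q_row, s_row in zip(q_channel, s_channel)]
            for q_channel, s_channel in zip(query_map, support_map)]


query = [[[1.0, 2.0]]]
support = [[[3.0, 4.0]]]
for mode in ("IDENTITY", "ADD", "HADAMARD", "SUBSTRACT", "CONCAT"):
    print(mode, fuse(query, support, mode))
```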

Training and evaluation

Training and evaluation scripts are available.

TODO: Give code snippet to run training with a specified config file (modify main) Basically create 2 scripts train.py and eval.py with arg config file.

DataHandler

Explain DataHandler class a bit.

Installation

The dependencies used for this project can be installed with conda create --name <env> --file requirements.txt. Please note that not all of these requirements are necessary; the list will be updated soon.

FCOS must be installed from source. Depending on the versions of the Python packages you use, some issues may arise after installation:

  • cpu/vision.h file not found: replace all occurrences in the FCOS source by vision.h (see this issue).
  • Error related to AT_CHECK with PyTorch > 1.5: replace all occurrences by TORCH_CHECK (see this issue).
  • Error related to torch._six.PY36: replace all occurrences of PY36 by PY37.

Results

Results on Pascal VOC, COCO and DOTA.

Owner
Pierre Le Jeune
PhD Student in Few-shot object detection.