[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

Last update: Oct 26, 2022

Related tags

Deep Learning SSVC

Overview

SSVC

The source code for paper [Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning]

samples of the generated motion-preserved video with threshold $\alpha=0.5$.

Requirements

python3
torch1.1+
PIL
FrEIA==0.2 (Flow-based model)
lintel==1.0 (Decode mp4 videos on the fly)

Structure

backbone
data
- lists: train/val lists (.txt)
- augmentation.py: train/val data augmentation during ssl pre-training
- vDataLoader.py: custom your path to data list
model
- advflow: flow-based model
- classifier.py: linear classifier for down-stream tasks
- infonce.py: combine S$^2$VC with MoCo
flow
- pre-trained flow-based model weights
utils
main_pretrain.py: the main function for self-supervised pretrain
main_eval.py: the main function for supervised fine-tune

Self-supervised Pretrain

DDP

python -m torch.distributed.launch --nproc_per_node=1 --master_port 1234 main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0,1 --dataset XX

Single GPU

python main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0 --dataset XX

Evaluation

NN-Retrieval

python main_eval.py --retrieval --test SSL_Pt_Model_PTH --dataset XX --gpu X

Finetune

# fine-tune overall model
python main_eval.py --train_what ft --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

# freeze backbone, finetune last layer
python main_eval.py --train_what last --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

Test

python main_eval.py --train_what XX --ten_crop --test Sup_Ft_Model_PTH --gpu X \
--dataset XX --net r3d18 --img_dim 224 --seq_len 32

[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

Related tags

Overview

SSVC

Requirements

Structure

Self-supervised Pretrain

DDP

Single GPU

Evaluation

NN-Retrieval

Finetune

Test

Owner

[IROS2021] NYU-VPR: Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymization Influences

Semi-Supervised Learning for Fine-Grained Classification

A list of all named GANs!

Object Detection Projekt in GKI WS2021/22

BTC-Generator - BTC Generator With Python

Self-supervised Label Augmentation via Input Transformations (ICML 2020)

Implementation of the paper "Language-agnostic representation learning of source code from structure and context".

Model serving at scale

Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis

ADGAN - The Implementation of paper Controllable Person Image Synthesis with Attribute-Decomposed GAN

Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis

Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm under Mixed Illumination

Boosted neural network for tabular data

A python-image-classification web application project, written in Python and served through the Flask Microframework. This Project implements the VGG16 covolutional neural network, through Keras and Tensorflow wrappers, to make predictions on uploaded images.

Another pytorch implementation of FCN (Fully Convolutional Networks)

A Sign Language detection project using Mediapipe landmark detection and Tensorflow LSTM's

[NeurIPS 2021] Galerkin Transformer: a linear attention without softmax

Domain Generalization for Mammography Detection via Multi-style and Multi-view Contrastive Learning

Official code for "InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization" (ICLR 2020, spotlight)

A fast Evolution Strategy implementation in Python