This is the official implementation of Elaborative Rehearsal for Zero-shot Action Recognition (ICCV2021)

Last update: Sep 24, 2022

Elaborative Rehearsal for Zero-shot Action Recognition

This is an official implementation of:

Shizhe Chen and Dong Huang, Elaborative Rehearsal for Zero-shot Action Recognition, ICCV, 2021. Arxiv Version

By elaborating on each new concept and relating it to known concepts, our zero-shot action recognition models start to become comparable to supervised models trained on a few samples.

New state-of-the-art results are also achieved on the standard ZSAR benchmarks (Olympics, HMDB51, UCF101), as well as on the first large-scale ZSAR benchmark, which we propose on the Kinetics database.

Installation

git clone https://github.com/DeLightCMU/ElaborativeRehearsal.git
cd ElaborativeRehearsal
export PYTHONPATH=$(pwd):${PYTHONPATH}

pip install -r requirements.txt

# download pretrained models
bash scripts/download_premodels.sh
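
The `export PYTHONPATH=$(pwd):${PYTHONPATH}` line above leaves a trailing colon when `PYTHONPATH` is unset, and adds a duplicate entry each time it is re-run. A minimal sketch of an idempotent alternative follows; `prepend_pythonpath` is a hypothetical helper and is not part of this repository.

```shell
# prepend_pythonpath: hypothetical helper, not shipped with the repository.
# Prepends a directory to PYTHONPATH only if it is not already listed.
prepend_pythonpath() {
  dir="$1"
  case ":${PYTHONPATH:-}:" in
    *":${dir}:"*)
      ;;  # already present, nothing to do
    *)
      # ${PYTHONPATH:+...} expands to nothing when PYTHONPATH is empty,
      # which avoids a trailing colon.
      PYTHONPATH="${dir}${PYTHONPATH:+:${PYTHONPATH}}"
      ;;
  esac
  export PYTHONPATH
}

# Run from the repository root, as in the installation steps above.
prepend_pythonpath "$(pwd)"
```

Calling the helper twice with the same directory leaves `PYTHONPATH` unchanged, so it should be safe to place in a shell startup file.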

Zero-shot Action Recognition (ZSAR)

Extract Features in Video

spatial-temporal features

bash scripts/extract_tsm_features.sh '0,1,2'

object features

bash scripts/extract_object_features.sh '0,1,2'
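
The quoted argument passed to both extraction scripts looks like a comma-separated list of GPU indices. This is an assumption based on its form; the scripts themselves are the authority. Under that assumption, a small sketch that checks the list format before launching the extraction commands could look like this. `is_gpu_list` is a hypothetical helper, and the block only prints the commands rather than running them.

```shell
# is_gpu_list: hypothetical check, not part of the repository.
# Succeeds only for comma-separated non-negative integers such as "0,1,2".
is_gpu_list() {
  printf '%s' "$1" | grep -Eq '^[0-9]+(,[0-9]+)*$'
}

gpus='0,1,2'
if is_gpu_list "${gpus}"; then
  # Print the two extraction commands from this section (dry run).
  echo "bash scripts/extract_tsm_features.sh '${gpus}'"
  echo "bash scripts/extract_object_features.sh '${gpus}'"
else
  echo "unexpected GPU list: ${gpus}" >&2
fi
```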

ZSAR Training and Inference

Baselines: DEVISE, ALE, SJE, DEM, ESZSL and GCN.

# mtype: devise, ale, sje, dem, eszsl
mtype=devise
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_baselines.py zeroshot/configs/zsl_baseline_${mtype}_config.yaml ${mtype} --is_train
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_baselines.py zeroshot/configs/zsl_baseline_${mtype}_config.yaml ${mtype} --eval_set tst
# evaluate other splits
ksplit=1
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_baselines_eval_splits.py zeroshot/configs/zsl_baseline_${mtype}_config.yaml ${mtype} ${ksplit}

# gcn
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_kgraphs.py zeroshot/configs/zsl_baseline_kgraph_config.yaml --is_train
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_kgraphs.py zeroshot/configs/zsl_baseline_kgraph_config.yaml --eval_set tst
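
The five embedding baselines above share one driver and differ only in `mtype`, so they can be run in a loop. The following is a sketch rather than a script shipped with the repository: `run_baseline` and the `DRY_RUN` variable are hypothetical names, and the commands it builds are the train and test commands listed above. By default it only prints the commands.

```shell
# run_baseline: hypothetical wrapper around the baseline commands above.
# Usage: run_baseline <mtype> [gpu_id]
# With DRY_RUN=1 (the default here) the commands are printed, not executed.
run_baseline() {
  mtype="$1"
  gpu="${2:-0}"
  cfg="zeroshot/configs/zsl_baseline_${mtype}_config.yaml"
  drv="zeroshot/driver/zsl_baselines.py"
  # Train first, then evaluate on the test set.
  for flags in "--is_train" "--eval_set tst"; do
    cmd="CUDA_VISIBLE_DEVICES=${gpu} python ${drv} ${cfg} ${mtype} ${flags}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "${cmd}"
    else
      eval "${cmd}" || return 1
    fi
  done
}

# Stop at the first baseline whose training or evaluation fails.
for mtype in devise ale sje dem eszsl; do
  run_baseline "${mtype}" 0 || break
done
```

Setting `DRY_RUN=0` before the loop would execute the commands instead of printing them.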

ER-ZSAR and ablations:

# TSM + ED class representation + AttnPool (2nd row in Table 4(b))
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_vse.py zeroshot/configs/zsl_vse_wordembed_config.yaml --is_train --resume_file datasets/Kinetics/zsl220/word.glove42b.th

# TSM + ED class representation + BERT (last row in Table 4(a) and Table 4(b))
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_vse.py zeroshot/configs/zsl_vse_config.yaml --is_train

# Obj + ED class representation + BERT + ER Loss (last row in Table 4(c))
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_cptembed.py zeroshot/configs/zsl_cpt_config.yaml --is_train

# ER-ZSAR Full Model
CUDA_VISIBLE_DEVICES=0 python zeroshot/driver/zsl_ervse.py zeroshot/configs/zsl_ervse_config.yaml --is_train
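
Each ER-ZSAR variant above pairs a driver script with a config file. As a convenience, the pairs can be collected in one lookup function. This is only a sketch: `erzsar_cmd` and the short variant names (`wordembed`, `bert`, `obj`, `full`) are labels introduced here for illustration, while the driver and config paths are the ones listed above.

```shell
# erzsar_cmd: hypothetical lookup from a short variant label to the
# training command listed above. The labels are illustrative only.
erzsar_cmd() {
  case "$1" in
    wordembed)
      echo "python zeroshot/driver/zsl_vse.py zeroshot/configs/zsl_vse_wordembed_config.yaml --is_train --resume_file datasets/Kinetics/zsl220/word.glove42b.th" ;;
    bert)
      echo "python zeroshot/driver/zsl_vse.py zeroshot/configs/zsl_vse_config.yaml --is_train" ;;
    obj)
      echo "python zeroshot/driver/zsl_cptembed.py zeroshot/configs/zsl_cpt_config.yaml --is_train" ;;
    full)
      echo "python zeroshot/driver/zsl_ervse.py zeroshot/configs/zsl_ervse_config.yaml --is_train" ;;
    *)
      echo "unknown variant: $1" >&2
      return 1 ;;
  esac
}

# Print the training command for every variant (dry run).
for v in wordembed bert obj full; do
  echo "CUDA_VISIBLE_DEVICES=0 $(erzsar_cmd "$v")"
done
```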

Citation

If you find this repository useful, please cite our paper:

@inproceedings{ChenHuang2021ER,
  title={Elaborative Rehearsal for Zero-shot Action Recognition},
  author={Shizhe Chen and Dong Huang},
  booktitle = {ICCV},
  year={2021}
}
