LSTC: Boosting Atomic Action Detection with Long-Short-Term Context

Last update: Oct 11, 2022

This repository contains the code for the AVA experiments of our ACM MM 2021 paper, "LSTC: Boosting Atomic Action Detection with Long-Short-Term Context".

Installation

See INSTALL.md for details on installing the codebase, including requirements and environment settings.

Data

For data preparation and setup, LSTC strictly follows the processing used by PySlowFast. See DATASET.md for details on preparing the data.

Run the code

We take SlowFast-ResNet50 as an example.

Train the model

python3 tools/run_net.py --cfg config/AVA/SLOWFAST_32x12_R50_LFB.yaml \
    AVA.FEATURE_BANK_PATH 'path/to/feature/bank/folder' \
    TRAIN.CHECKPOINT_FILE_PATH 'path/to/pretrained/backbone' \
    OUTPUT_DIR 'path/to/output/folder'

Test the model

python3 tools/run_net.py --cfg config/AVA/SLOWFAST_32x12_R50_LFB.yaml \
    AVA.FEATURE_BANK_PATH 'path/to/feature/bank/folder' \
    OUTPUT_DIR 'path/to/output/folder' \
    TRAIN.ENABLE False \
    TEST.ENABLE True
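
The trailing KEY VALUE pairs in the commands above override entries of the YAML config, a convention inherited from PySlowFast. The following is a minimal sketch of how dotted keys can map onto a nested config. The apply_overrides helper and the small cfg dictionary are illustrative assumptions, not code from this repository, and the real config contains many more entries.

```python
import ast


def apply_overrides(cfg, opts):
    """Apply a flat [KEY1, VALUE1, KEY2, VALUE2, ...] list to a nested dict.

    Hypothetical helper for illustration only; the codebase relies on its
    own config system rather than this function.
    """
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = cfg
        for name in parents:
            node = node[name]  # raises KeyError for an unknown section
        if leaf not in node:
            raise KeyError(f"unknown config key: {key}")
        try:
            # Turn "False", "8", "0.1" into Python values.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal (e.g. a path) stays a string.
            value = raw
        node[leaf] = value
    return cfg


# Assumed subset of the config, using only keys that appear in the commands.
cfg = {
    "TRAIN": {"ENABLE": True, "CHECKPOINT_FILE_PATH": ""},
    "TEST": {"ENABLE": False},
    "AVA": {"FEATURE_BANK_PATH": ""},
    "OUTPUT_DIR": ".",
}

apply_overrides(
    cfg,
    [
        "AVA.FEATURE_BANK_PATH", "path/to/feature/bank/folder",
        "OUTPUT_DIR", "path/to/output/folder",
        "TRAIN.ENABLE", "False",
        "TEST.ENABLE", "True",
    ],
)
```

With the overrides from the test command, the sketch leaves TRAIN.ENABLE switched off and TEST.ENABLE switched on, while the path values stay plain strings.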

To start DDP training from the command line with torch.distributed.launch, set start_method='cmd' in tools/run_net.py.

Resource

The codebase provides the following resources for fast training and validation.

Pretrained backbone on Kinetics

backbone	dataset	model type	link
ResNet50	Kinetics400	Caffe2	Google Drive/Baidu Disk (Code: y1wl)
ResNet101	Kinetics600	Caffe2	Google Drive/Baidu Disk (Code: slde)

Extracted long-term feature bank

backbone	feature bank (LMDB)	dimension
ResNet50	Google Drive	1280
ResNet101	Google Drive	2304

Checkpoint file

backbone	checkpoint	model type
ResNet50	Google Drive/Baidu Disk (Code: fi0s)	PyTorch
ResNet101	Google Drive/Baidu Disk (Code: g63o)	PyTorch

Acknowledgement

This codebase is built upon PySlowFast.

Citation

If you find this repository helpful for your research, please cite the following paper:

@InProceedings{Yuxi_2021_ACM,
  author = {Li, Yuxi and Zhang, Boshen and Li, Jian and Wang, Yabiao and Wang, Chengjie and Li, Jilin and Huang, Feiyue and Lin, Weiyao},
  title = {LSTC: Boosting Atomic Action Detection with Long-Short-Term Context},
  booktitle = {ACM Conference on Multimedia},
  month = {October},
  year = {2021}
}

Owner

Tencent YouTu Research
