Yolov5+SlowFast: Realtime Action Detection Based on PytorchVideo

Last update: Dec 30, 2022

Related tags

Deep Learning yolo_slowfast

Overview

Yolov5+SlowFast: Realtime Action Detection

A realtime action detection frame work based on PytorchVideo.

Here are some details about our modification:

we choose yolov5 as an object detector instead of detectron2, it is faster and more convenient
we use a tracker(deepsort) to allocate action labels to all objects(with same ids) in different frames
our processing speed reached 24.2 FPS at 30 inference barch size (on a single RTX 2080Ti GPU)

Relevant infomation: FAIR/PytorchVideo; Ultralytics/Yolov5

Demo comparison betwween original(<-left) and ours(->right).

Installation

create a new python environment:
```
conda create -n env_name python=3.7.11
```
install requiments:
```
pip install -r requirements.txt
```
download weights file(ckpt.t7) from [deepsort] to this folder:
```
./deep_sort/deep_sort/deep/checkpoint/
```
test on your video:
```
python yolo_slowfast.py --input {path to your video}
```
The first time to execute this command may take some times to download the yolov5 code and it's weights file from torch.hub, keep your network connected.

References

Thanks for these great works:

[1] Ultralytics/Yolov5

[2] ZQPei/deepsort

[3] FAIR/PytorchVideo

[2] AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions. paper

[3] SlowFast Networks for Video Recognition. paper

Citation

If you find our work useful, please cite as follow:

{   yolo_slowfast,
    author = {Wu Fan},
    title = { A realtime action detection frame work based on PytorchVideo},
    year = {2021},
    url = {\url{https://github.com/wufan-tb/gmm_dae}}
}

Yolov5+SlowFast: Realtime Action Detection Based on PytorchVideo

Related tags

Overview

Yolov5+SlowFast: Realtime Action Detection

A realtime action detection frame work based on PytorchVideo.

Here are some details about our modification:

Demo comparison betwween original(<-left) and ours(->right).

Installation

References

Citation

Owner

WuFan

This is a deep learning-based method to segment deep brain structures and a brain mask from T1 weighted MRI.

A PyTorch implementation of the continual learning experiments with deep neural networks

EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

Command-line tool for downloading and extending the RedCaps dataset.

A curated list and survey of awesome Vision Transformers.

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.

MetaBalance: High-Performance Neural Networks for Class-Imbalanced Data

Mining-the-Social-Web-3rd-Edition - The official online compendium for Mining the Social Web, 3rd Edition (O'Reilly, 2018)

Video-Music Transformer

Unifying Global-Local Representations in Salient Object Detection with Transformer

Using NumPy to solve the equations of fluid mechanics together with Finite Differences, explicit time stepping and Chorin's Projection methods

Massively parallel Monte Carlo diffusion MR simulator written in Python.

[NeurIPS-2021] Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

Implementation of U-Net and SegNet for building segmentation

[Link]deep_portfolo - Use Reforcemet earg ad Supervsed learg to Optmze portfolo allocato []

[WACV 2022] Contextual Gradient Scaling for Few-Shot Learning

torchlm is aims to build a high level pipeline for face landmarks detection, it supports training, evaluating, exporting, inference(Python/C++) and 100+ data augmentations

A PyTorch implementation of the paper Mixup: Beyond Empirical Risk Minimization in PyTorch

使用深度学习框架提取视频硬字幕；docker容器免安装深度学习库，使用本地api接口使得界面和后端识别分离；