training script for space time memory network

Last update: Dec 20, 2022

Overview

Training Script for Space Time Memory Network

This codebase implements training code for the Space Time Memory Network (STM), extended with the cyclic mechanism described in the second work listed under Reference.

Requirements

Python packages

torch
python-opencv
pillow
yaml
imgaug
yacs
progress
nvidia-dali (optional)

GPU support

GPU Memory >= 12GB
CUDA >= 10.0

Data

See DATASET.md for details on how our prepared dataset is organized.

Release

We provide pre-trained models with different backbones. Results are validated on DAVIS17-val with gradient correction.

model	backbone	data backend	J	F	J & F	link	FPS
STM-Cycle	Resnet18	DALI	65.3	70.8	68.1	Google Drive	14.8
STM-Cycle	Resnet50	PIL	70.5	76.3	73.4	Google Drive	9.3
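
For readers unfamiliar with the DAVIS metrics in the table: J is region similarity (the Jaccard index, i.e. intersection over union of the predicted and ground-truth masks), F is contour accuracy, and J & F is the mean of the two. The sketch below is illustrative only and is not this codebase's evaluation code. The helper names `jaccard` and `j_and_f` are hypothetical, and masks are plain nested lists to keep the example free of dependencies.

```python
def jaccard(pred, gt):
    """Region similarity J: intersection over union of two binary masks."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j_mean, f_mean):
    """J & F is the arithmetic mean of the J and F scores."""
    return (j_mean + f_mean) / 2.0


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are in the union.
pred = [[1, 1, 0],
        [0, 1, 0]]
gt = [[1, 0, 0],
      [0, 1, 1]]
print(jaccard(pred, gt))
print(j_and_f(70.5, 76.3))  # J and F from the Resnet50 row of the table
```

Averaging the J and F columns this way reproduces the J & F column up to rounding of the reported values.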

Running

Append the root folder to the Python interpreter's search path:

export PYTHONPATH=${PYTHONPATH}:./
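
When the code is driven from a notebook or another script rather than a shell, a similar effect can be obtained from inside Python. This is a minimal sketch, assuming the current working directory is the repository root; `add_to_search_path` is a hypothetical helper and not part of this codebase.

```python
import os
import sys


def add_to_search_path(root="."):
    """Append `root` (as an absolute path) to sys.path unless already present."""
    root = os.path.abspath(root)
    if root not in sys.path:
        sys.path.append(root)
    return root


# Assumption: this is executed from the repository root.
repo_root = add_to_search_path(".")
print(repo_root in sys.path)
```

Unlike the `export` line, this only affects the current interpreter process, not child processes started from it.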

To train the STM network, run the following command:

python3 train.py --cfg config.yaml OPTION_KEY OPTION_VAL
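
The trailing `OPTION_KEY OPTION_VAL` arguments look like key/value overrides for entries in the config file. Since `yacs` is among the requirements, the scripts presumably follow the common pattern of collecting everything after `--cfg` into a list of overrides, but that is an assumption and not confirmed by this README. The sketch below shows the pattern with the standard library only; `parse_cli` and the file name `model.pth` are hypothetical, while the keys `initial` and `output_dir` are taken from the commands in this section.

```python
import argparse


def parse_cli(argv):
    """Split `--cfg FILE KEY VAL ...` into a config path and an override dict."""
    parser = argparse.ArgumentParser(description="hypothetical train.py-style CLI")
    parser.add_argument("--cfg", required=True, help="path to the YAML config")
    # Everything after the known flags is collected as KEY VAL pairs.
    parser.add_argument("opts", nargs=argparse.REMAINDER)
    args = parser.parse_args(argv)
    if len(args.opts) % 2 != 0:
        raise ValueError("overrides must be given as KEY VAL pairs")
    overrides = dict(zip(args.opts[0::2], args.opts[1::2]))
    return args.cfg, overrides


cfg_path, overrides = parse_cli(
    ["--cfg", "config.yaml", "initial", "model.pth", "output_dir", "results"]
)
print(cfg_path, overrides)
```

In this sketch the override values stay as strings; a config library such as `yacs` would typically convert them to the type of the entry being overridden.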

To test the STM network, run the following command:

python3 test.py --cfg config.yaml initial ${PATH_TO_MODEL} OPTION_KEY OPTION_VAL

The test results will be saved as indexed PNG files at ${ROOT}/${output_dir}/${valset}.

To run a segmentation demo, run the following command:

python3 demo/demo.py --cfg demo/demo.yaml OPTION_KEY OPTION_VAL

The segmentation results will be saved at ${output_dir}.

Acknowledgement

This codebase borrows code and structure from the official STM repository.

Reference

The codebase builds on the following works:

@InProceedings{Oh_2019_ICCV,
author = {Oh, Seoung Wug and Lee, Joon-Young and Xu, Ning and Kim, Seon Joo},
title = {Video Object Segmentation Using Space-Time Memory Networks},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}

@InProceedings{Li_2020_NeurIPS,
author = {Li, Yuxi and Xu, Ning and Peng, Jinlong and See, John and Lin, Weiyao},
title = {Delving into the Cyclic Mechanism in Semi-supervised Video Object Segmentation},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
year = {2020}
}

Owner

Yuxi Li
