Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation.

Last update: Jan 01, 2023

Training Script for Reuse-VOS

This is the code implementation of the CVPR 2021 paper: Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation.

[Figure: hard case, (Ours) vs. (FRTM)]

[Figure: easy case, (Ours) vs. (FRTM)]

Requirement

Python packages

torch
opencv-python (imported as cv2)
scikit-image (imported as skimage)
easydict

GPU support

GPU memory >= 11GB (ResNet18)
CUDA >= 10.0
PyTorch >= 1.4.0
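
A quick environment check along the lines of the requirements above could look like the sketch below. It assumes the import names torch, cv2, skimage and easydict; the helpers `missing_packages` and `version_at_least` are illustrative names and are not part of the repository.

```python
import importlib.util

# Import names assumed for the packages listed above
# (the OpenCV package is imported as cv2).
REQUIRED_MODULES = ["torch", "cv2", "skimage", "easydict"]


def missing_packages(modules=REQUIRED_MODULES):
    """Return the modules from `modules` that cannot be found."""
    return [m for m in modules if importlib.util.find_spec(m) is None]


def version_at_least(version, minimum):
    """Compare dotted versions numerically, ignoring suffixes such as '+cu113'."""
    def parse(v):
        parts = []
        for piece in v.split("+")[0].split(".")[:3]:
            digits = "".join(ch for ch in piece if ch.isdigit())
            parts.append(int(digits) if digits else 0)
        while len(parts) < 3:  # pad so that "1.4" compares equal to "1.4.0"
            parts.append(0)
        return tuple(parts)
    return parse(version) >= parse(minimum)


if __name__ == "__main__":
    missing = missing_packages()
    print("missing packages:", missing or "none")
    if "torch" not in missing:
        import torch
        print("PyTorch >= 1.4.0:", version_at_least(torch.__version__, "1.4.0"))
        print("CUDA available:", torch.cuda.is_available())
```

The check only reports what is installed; it does not verify the CUDA toolkit version or the amount of GPU memory.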

Datasets

DAVIS

To test on the DAVIS validation split, download and unzip the DAVIS 2017 480p trainval images and annotations, and place them in this directory structure:

/path/DAVIS
|-- Annotations/
|-- ImageSets/
`-- JPEGImages/

YouTubeVOS

To test our validation split and the YouTubeVOS challenge 'valid' split, download YouTubeVOS 2018 and place it in this directory structure:

/path/ytvos2018
|-- train/
|-- train_all_frames/
|-- valid/
`-- valid_all_frames/
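
Before training or evaluating, the DAVIS and YouTubeVOS layouts shown above can be checked with a few lines of Python. This is a minimal sketch: the sub-directory names come from the two trees above, while `EXPECTED_LAYOUT` and `missing_subdirs` are hypothetical names that do not exist in the repository.

```python
from pathlib import Path

# Sub-directories taken from the two layouts shown above.
EXPECTED_LAYOUT = {
    "DAVIS": ["Annotations", "ImageSets", "JPEGImages"],
    "ytvos2018": ["train", "train_all_frames", "valid", "valid_all_frames"],
}


def missing_subdirs(root, dataset):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_LAYOUT[dataset] if not (root / name).is_dir()]
```

For example, `missing_subdirs("/path/DAVIS", "DAVIS")` returns an empty list when all three directories are present.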

Release

DAVIS

Model	Backbone	Training set	J&F (DAVIS 2017)	J&F (DAVIS 2016)	Link
G-FRTM (t=1)	ResNet18	Youtube-VOS + DAVIS	71.7	80.9	Google Drive
G-FRTM (t=0.7)	ResNet18	Youtube-VOS + DAVIS	69.9	80.5	same .pth
G-FRTM (t=1)	ResNet101	Youtube-VOS + DAVIS	76.4	84.3	Google Drive
G-FRTM (t=0.7)	ResNet101	Youtube-VOS + DAVIS	74.3	82.3	same .pth

Youtube-VOS

Model	Backbone	Training set	G	J-S	J-Us	F-S	F-Us	Link
G-FRTM (t=1)	ResNet18	Youtube-VOS	63.8	68.3	55.2	70.6	61.0	Google Drive
G-FRTM (t=0.8)	ResNet18	Youtube-VOS	63.4	67.6	55.8	69.3	60.9	same .pth
G-FRTM (t=0.7)	ResNet18	Youtube-VOS	62.7	67.1	55.2	68.2	60.1	same .pth

For the Youtube-VOS benchmark, we initialize the original FRTM layers from the weights in the official FRTM repository. S = Seen, Us = Unseen.
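
The G column can be related to the other four columns with a short calculation. The sketch below assumes the usual Youtube-VOS convention that G is the plain mean of J-S, J-Us, F-S and F-Us; the README does not state this, but the rows of the table agree with it up to rounding. The function name `overall_score` is illustrative.

```python
# Rows copied from the Youtube-VOS table: (G, J-S, J-Us, F-S, F-Us).
ROWS = {
    "t=1": (63.8, 68.3, 55.2, 70.6, 61.0),
    "t=0.8": (63.4, 67.6, 55.8, 69.3, 60.9),
    "t=0.7": (62.7, 67.1, 55.2, 68.2, 60.1),
}


def overall_score(j_seen, j_unseen, f_seen, f_unseen):
    """Mean of the four per-category scores (assumed definition of G)."""
    return (j_seen + j_unseen + f_seen + f_unseen) / 4.0


for name, (g, *parts) in ROWS.items():
    print(f"{name}: reported G = {g}, mean of four scores = {overall_score(*parts):.3f}")
```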

Target model cache

Here is the target model cache file we used for ResNet18.

Run

Train

Open train.py and adjust the paths dict to your dataset locations, checkpoint and tensorboard output directories and the place to cache target model weights.

To train a network, run the following command.

python train.py --name <session-name> --ftext resnet18 --dset all --dev cuda:0

--name is the name of the save directory for the current training session.
--ftext is the name of the feature extractor, either resnet18 or resnet101.
--dset is one of dv2017, ytvos2018 or all ("all" really means "both").
--dev is the name of the device to train on.
--m1 is margin1 for training the reuse gate; we use 1.0 for the DAVIS benchmark and 0.5 for the Youtube-VOS benchmark.
--m2 is margin2 for training the reuse gate; we use 0.

Replace "session-name" with whatever you like. Subdirectories with this name will be created under your checkpoint and tensorboard paths.
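
The training flags described above can be summarized as an argparse sketch. This is not the parser from train.py: the flag names and allowed values follow the description above, while every default marked "assumed" and the function name `build_train_parser` are illustrative guesses.

```python
import argparse


def build_train_parser():
    """Illustrative parser; defaults marked 'assumed' are not from the repository."""
    p = argparse.ArgumentParser(description="Reuse-VOS training (sketch)")
    p.add_argument("--name", required=True, help="session name (save directory)")
    p.add_argument("--ftext", choices=["resnet18", "resnet101"], default="resnet18",
                   help="feature extractor (default assumed)")
    p.add_argument("--dset", choices=["dv2017", "ytvos2018", "all"], default="all",
                   help="training dataset(s) (default assumed)")
    p.add_argument("--dev", default="cuda:0", help="training device (default assumed)")
    p.add_argument("--m1", type=float, default=1.0,
                   help="margin1: 1.0 for DAVIS, 0.5 for Youtube-VOS (default assumed)")
    p.add_argument("--m2", type=float, default=0.0,
                   help="margin2: the README uses 0 (default assumed)")
    return p


# Example: a Youtube-VOS session with the margin1 value quoted above.
args = build_train_parser().parse_args(
    ["--name", "my-session", "--ftext", "resnet18", "--dset", "ytvos2018", "--m1", "0.5"]
)
print(args)
```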

Eval

Open evaluate.py and adjust the paths dict to your dataset locations, checkpoint and tensorboard output directories, and the place to cache target model weights.

To evaluate a network, run the following command.

python evaluate.py --ftext resnet18 --dset dv2017val --dev cuda:0

--ftext is the name of the feature extractor, either resnet18 or resnet101.
--dset is one of dv2016val, dv2017val, yt2018jjval, yt2018val or yt2018valAll.
--dev is the name of the device to evaluate on.
--TH is the threshold tau (default: 0.7).
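
To compare several thresholds, the evaluation command can be assembled programmatically. The sketch below only builds argument lists from the flags described above; it assumes that --TH accepts a float on the command line, and `eval_command` is a hypothetical helper, not part of the repository.

```python
import sys

# Dataset names accepted by --dset, as listed above.
EVAL_DSETS = ["dv2016val", "dv2017val", "yt2018jjval", "yt2018val", "yt2018valAll"]


def eval_command(ftext, dset, dev="cuda:0", th=0.7):
    """Build the argument list for one evaluation run."""
    if ftext not in ("resnet18", "resnet101"):
        raise ValueError(f"unknown feature extractor: {ftext}")
    if dset not in EVAL_DSETS:
        raise ValueError(f"unknown dataset: {dset}")
    return [sys.executable, "evaluate.py", "--ftext", ftext,
            "--dset", dset, "--dev", dev, "--TH", str(th)]


# One command per threshold; each list can be passed to subprocess.run(cmd, check=True).
for th in (1.0, 0.8, 0.7):
    print(" ".join(eval_command("resnet18", "dv2017val", th=th)))
```

Building a list rather than a shell string avoids quoting problems if a path or device name ever contains spaces.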

The inference results are saved at ${ROOT}/${result}. It is worth evaluating several .pth checkpoints, since accuracy can differ between them.

Acknowledgement

This codebase borrows code and structure from the official FRTM repository. We are grateful to Facebook Inc. for valuable discussions.

Reference

The codebase is built on the following work:

@misc{park2020learning,
      title={Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation}, 
      author={Hyojin Park and Jayeon Yoo and Seohyeong Jeong and Ganesh Venkatesh and Nojun Kwak},
      year={2020},
      eprint={2012.11655},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Owner

HYOJINPARK
