Official implementation of the ICCV 2021 paper "Joint Inductive and Transductive Learning for Video Object Segmentation"

Last update: Oct 16, 2022

Overview

JOINT

This is the official implementation of Joint Inductive and Transductive Learning for Video Object Segmentation, published at ICCV 2021. If you use this code, please cite:

@inproceedings{joint_iccv_2021,
  title={Joint Inductive and Transductive Learning for Video Object Segmentation},
  author={Mao, Yunyao and Wang, Ning and Zhou, Wengang and Li, Houqiang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month={October},
  year={2021}
}

Installation

Clone this repository

git clone https://github.com/maoyunyao/JOINT.git

Install dependencies

Please check the detailed installation instructions.

Training

The whole network is trained on 8 NVIDIA GTX 1080Ti GPUs, in two stages:

conda activate pytracking
cd ltr
python run_training.py joint joint_stage1  # stage 1
python run_training.py joint joint_stage2  # stage 2

Note: We initialize the ResNet backbone with pre-trained Mask-RCNN weights, as in LWL. These weights can be obtained from here. Before training, download them and save them in the directory given by env_settings().pretrained_networks.
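The weight check described in the note can be automated before launching a long training run. The following is a minimal sketch, not part of the repository: the function name `find_missing_weights`, the directory `pretrained_networks`, and the file name `mask_rcnn_weights.pkl` are hypothetical placeholders. Substitute the value of env_settings().pretrained_networks and the actual name of the downloaded weight file.

```python
from pathlib import Path


def find_missing_weights(pretrained_dir, filenames):
    """Return the names in `filenames` that are not regular files in `pretrained_dir`."""
    root = Path(pretrained_dir)
    return [name for name in filenames if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical values: replace with env_settings().pretrained_networks
    # and the real file name of the downloaded Mask-RCNN weights.
    missing = find_missing_weights("pretrained_networks", ["mask_rcnn_weights.pkl"])
    if missing:
        print("Missing pre-trained weights:", ", ".join(missing))
    else:
        print("All pre-trained weights found.")
```

Running this from the repository root before stage 1 helps catch a misplaced download early instead of partway through initialization.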

Evaluation

conda activate pytracking
cd pytracking
python run_tracker.py joint joint_davis --dataset_name dv2017_val        # DAVIS 2017 Val
python run_tracker.py joint joint_ytvos --dataset_name yt2018_valid_all  # YouTube-VOS 2018 Val
python run_tracker.py joint joint_ytvos --dataset_name yt2019_valid_all  # YouTube-VOS 2019 Val

Note: Before evaluation, the pretrained networks (see model zoo) should be downloaded and saved into the directory set by "network_path" in "pytracking/evaluation/local.py". By default, it is set to pytracking/networks.
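To run all three evaluations in one go, the commands above can be generated from a small table. This is only a sketch: the helper `build_eval_commands` and the constant `EVAL_RUNS` are hypothetical names introduced here, while the tracker name, parameter names, and dataset names are copied from the commands above. The script prints the commands rather than executing them.

```python
# (tracker name, parameter file, dataset name), copied from the commands above.
EVAL_RUNS = [
    ("joint", "joint_davis", "dv2017_val"),        # DAVIS 2017 Val
    ("joint", "joint_ytvos", "yt2018_valid_all"),  # YouTube-VOS 2018 Val
    ("joint", "joint_ytvos", "yt2019_valid_all"),  # YouTube-VOS 2019 Val
]


def build_eval_commands(runs=EVAL_RUNS, python="python"):
    """Build one run_tracker.py argument list per evaluation run."""
    return [
        [python, "run_tracker.py", tracker, params, "--dataset_name", dataset]
        for tracker, params, dataset in runs
    ]


if __name__ == "__main__":
    # Print the commands; to execute them, pass each list to
    # subprocess.run(cmd, check=True) from inside the pytracking directory.
    for cmd in build_eval_commands():
        print(" ".join(cmd))
```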

Model Zoo

Models

Model	YouTube-VOS 2018 (Overall Score)	YouTube-VOS 2019 (Overall Score)	DAVIS 2017 val (J&F score)	Links	Raw Results
JOINT_ytvos	83.1	82.8	--	model	results
JOINT_davis	--	--	83.5	model	results
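For readers unfamiliar with the two headline metrics in the table: the DAVIS J&F score is the mean of region similarity (J) and contour accuracy (F), and the YouTube-VOS overall score is the mean of J and F over seen and unseen categories. The sketch below illustrates the arithmetic only. The function names are hypothetical, and the input numbers are made up for illustration; they are not results from the paper.

```python
def davis_j_and_f(j_mean, f_mean):
    """DAVIS J&F score: average of mean region similarity J and mean contour accuracy F."""
    return (j_mean + f_mean) / 2.0


def ytvos_overall(j_seen, j_unseen, f_seen, f_unseen):
    """YouTube-VOS overall score: average of J and F on seen and unseen categories."""
    return (j_seen + j_unseen + f_seen + f_unseen) / 4.0


if __name__ == "__main__":
    # Illustrative inputs only, not taken from the paper or the raw results.
    print(davis_j_and_f(80.0, 86.0))
    print(ytvos_overall(80.0, 76.0, 84.0, 84.0))
```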

Acknowledgments

Our JOINT segmentation tracker is built on pytracking. We sincerely thank the authors, Martin Danelljan and Goutam Bhat, for providing such a great framework.
We adopt the few-shot learner proposed in LWL as the induction branch.

Owner

Yunyao