CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing

Last update: Oct 27, 2022

CapsuleVOS

This is the code for the ICCV 2019 paper CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing.

Arxiv Link: https://arxiv.org/abs/1910.00132

The network is implemented using TensorFlow 1.4.1.

Python packages used: numpy, scipy, scikit-video
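
A quick way to confirm the environment before training is to check that each dependency can be found on the import path. The sketch below is not part of the repository: `missing_packages` is a hypothetical helper, and it assumes scikit-video is imported under the name `skvideo`.

```python
import importlib.util

# Import names for the packages listed above (scikit-video is imported as `skvideo`).
REQUIRED = ["tensorflow", "numpy", "scipy", "skvideo"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only checks that the packages are present; it does not verify that the installed TensorFlow version matches 1.4.1.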

Files and their use

caps_layers_cod.py: Contains the functions required to construct the capsule layers (primary, convolutional, and fully-connected) and conditional capsule routing.
caps_network_train.py: Contains the CapsuleVOS model for training.
caps_network_test.py: Contains the CapsuleVOS model for testing.
caps_main.py: Contains the main function, which is called to train the network.
config.py: Contains several different hyperparameters used for the network, training, or inference.
inference.py: Contains the inference code.
load_youtube_data_multi.py: Contains the training data generator for the YoutubeVOS 2018 dataset.
load_youtubevalid_data.py: Contains the validation data generator for the YoutubeVOS 2018 dataset.

Data Used

We have supplied the code for training and inference of the model on the YoutubeVOS-2018 dataset. The files load_youtube_data_multi.py and load_youtubevalid_data.py create two DataLoaders: one for training and one for validation. The data_loc variable at the top of each file should be set to the base directory that contains the frames and annotations.

To run this code, you need to do the following:

Download the YoutubeVOS dataset
Perform interpolation for the training frames following the paper's instructions
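
Before training, it can help to verify that the directory assigned to data_loc holds both frames and annotations for every video. The following is a minimal sketch, not code from the repository. It assumes the folder names `JPEGImages` and `Annotations` used by the standard YoutubeVOS release; the loaders in this repository may expect a different layout, and `videos_missing_masks` is a hypothetical helper.

```python
import os

# Assumed layout: the standard YoutubeVOS release keeps frames and masks in
# these two sub-folders. The repository's loaders may expect something else.
FRAME_DIR = "JPEGImages"
MASK_DIR = "Annotations"


def videos_missing_masks(data_loc):
    """Return video folder names that have frames but no annotation folder."""
    frame_root = os.path.join(data_loc, FRAME_DIR)
    mask_root = os.path.join(data_loc, MASK_DIR)
    if not os.path.isdir(frame_root) or not os.path.isdir(mask_root):
        raise FileNotFoundError(
            "Expected %s and %s under %s" % (FRAME_DIR, MASK_DIR, data_loc)
        )
    # Every sub-folder of the frame directory is treated as one video.
    videos = sorted(
        d for d in os.listdir(frame_root)
        if os.path.isdir(os.path.join(frame_root, d))
    )
    return [v for v in videos if not os.path.isdir(os.path.join(mask_root, v))]
```

An empty list means every video folder with frames also has an annotation folder.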

Training the Model

Once the data is set up, you can train (and test) the network by calling python3 caps_main.py.

The config.py file contains several hyper-parameters which are useful for training the network.

Output File

During training, validation, and testing, the losses and accuracies are printed to stdout and also written to an output*.txt file.
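
The pattern of writing each message to both the terminal and a file can be sketched as follows. This is an illustration only, not the repository's logging code; `log_metrics` and the file name in the usage note are hypothetical.

```python
import sys


def log_metrics(message, log_path, stream=None):
    """Print `message` to a stream (stdout by default) and append it to `log_path`."""
    stream = sys.stdout if stream is None else stream
    print(message, file=stream)
    # Append mode keeps earlier messages from the same run.
    with open(log_path, "a") as f:
        f.write(message + "\n")
```

For example, `log_metrics("validation finished", "output_example.txt")` prints the line and appends it to `output_example.txt`.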

Saved Weights

Pretrained weights for the network are available here. To use them for inference, place them in the network_saves_best folder.
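
To confirm the weights are in place before running inference, a check along these lines may be useful. It is a sketch under an assumption the source does not state: that the weights are stored as TensorFlow checkpoints, which include a `*.index` file per checkpoint. `checkpoint_prefixes` is a hypothetical helper.

```python
import glob
import os


def checkpoint_prefixes(save_dir="network_saves_best"):
    """List checkpoint prefixes in `save_dir`, inferred from `*.index` files."""
    pattern = os.path.join(save_dir, "*.index")
    # Strip the ".index" suffix to recover the prefix a saver would restore from.
    return sorted(path[: -len(".index")] for path in glob.glob(pattern))
```

An empty result suggests the weights were not copied into the folder, or that they use a different file format than assumed here.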

Inference

If you just want to test the trained model with the weights above, run the inference code by calling python3 inference.py. This code will read in an .mp4 file and a reference segmentation mask, and output the segmented frames of the video to the Output folder.

An example video is available in the Example folder.
