CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing

Overview

CapsuleVOS

This is the code for the ICCV 2019 paper CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing.

Arxiv Link: https://arxiv.org/abs/1910.00132

The network is implemented using TensorFlow 1.4.1.

Python packages used: numpy, scipy, scikit-video

Files and their use

  1. caps_layers_cod.py: Contains the functions required to construct capsule layers - (primary, convolutional, and fully-connected, and conditional capsule routing).
  2. caps_network_train.py: Contains the CapsuleVOS model for training.
  3. caps_network_test.py: Contains the CapsuleVOS model for testing.
  4. caps_main.py: Contains the main function, which is called to train the network.
  5. config.py: Contains several different hyperparameters used for the network, training, or inference.
  6. inference.py: Contains the inference code.
  7. load_youtube_data_multi.py: Contains the training data-generator for YoutubeVOS 2018 dataset.
  8. load_youtubevalid_data.py: Contains the validation data-generator for YoutubeVOS 2018 dataset.

Data Used

We have supplied the code for training and inference of the model on the YoutubeVOS-2018 dataset. The file load_youtube_data_multi.py and load_youtubevalid_data.py creates two DataLoaders - one for training and one for validation. The data_loc variable at the top of each file should be set to the base directory which contains the frames and annotations.

To run this code, you need to do the following:

  1. Download the YoutubeVOS dataset
  2. Perform interpolation for the training frames following the papers' instructions

Training the Model

Once the data is set up you can train (and test) the network by calling python3 caps_main.py.

The config.py file contains several hyper-parameters which are useful for training the network.

Output File

During training and testing, metrics are printed to stdout as well as an output*.txt file. During training/validation, the losses and accuracies are printed out to the terminal and to an output file.

Saved Weights

Pretrained weights for the network are available here. To use them for inference, place them in the network_saves_best folder.

Inference

If you just want to test the trained model with the weights above, run the inference code by calling python3 inference.py. This code will read in an .mp4 file and a reference segmentation mask, and output the segmented frames of the video to the Output folder.

An example video is available in the Example folder.

Owner
PhD student at the Center for Research in Computer Vision
UFPR-ADMR-v2 Dataset

UFPR-ADMR-v2 Dataset The UFPR-ADMRv2 dataset contains 5,000 dial meter images obtained on-site by employees of the Energy Company of Paraná (Copel), w

Gabriel Salomon 8 Sep 29, 2022
An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020

UnpairedSR An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020 turn RCAN(modified) -- xmodel(xilinx

JiaKui Hu 10 Oct 28, 2022
Learning Open-World Object Proposals without Learning to Classify

Learning Open-World Object Proposals without Learning to Classify Pytorch implementation for "Learning Open-World Object Proposals without Learning to

Dahun Kim 149 Dec 22, 2022
Official PyTorch implementation for paper "Efficient Two-Stage Detection of Human–Object Interactions with a Novel Unary–Pairwise Transformer"

UPT: Unary–Pairwise Transformers This repository contains the official PyTorch implementation for the paper Frederic Z. Zhang, Dylan Campbell and Step

Frederic Zhang 109 Dec 20, 2022
yolov5 deepsort 行人 车辆 跟踪 检测 计数

yolov5 deepsort 行人 车辆 跟踪 检测 计数 实现了 出/入 分别计数。 默认是 南/北 方向检测,若要检测不同位置和方向,可在 main.py 文件第13行和21行,修改2个polygon的点。 默认检测类别:行人、自行车、小汽车、摩托车、公交车、卡车。 检测类别可在 detect

554 Dec 30, 2022
Cleaned up code for DSTC 10: SIMMC 2.0 track: subtask 2: multimodal coreference resolution

UNITER-Based Situated Coreference Resolution with Rich Multimodal Input: arXiv MMCoref_cleaned Code for the MMCoref task of the SIMMC 2.0 dataset. Pre

Yichen (William) Huang 2 Dec 05, 2022
Intro-to-dl - Resources for "Introduction to Deep Learning" course.

Introduction to Deep Learning course resources https://www.coursera.org/learn/intro-to-deep-learning Running on Google Colab (tested for all weeks) Go

Advanced Machine Learning specialisation by HSE 761 Dec 24, 2022
A tool to visualise the results of AlphaFold2 and inspect the quality of structural predictions

AlphaFold Analyser This program produces high quality visualisations of predicted structures produced by AlphaFold. These visualisations allow the use

Oliver Powell 3 Nov 13, 2022
Unet network with mean teacher for altrasound image segmentation

Unet network with mean teacher for altrasound image segmentation

5 Nov 21, 2022
A collection of resources on GAN Inversion.

This repo is a collection of resources on GAN inversion, as a supplement for our survey

Adversarial Adaptation with Distillation for BERT Unsupervised Domain Adaptation

Knowledge Distillation for BERT Unsupervised Domain Adaptation Official PyTorch implementation | Paper Abstract A pre-trained language model, BERT, ha

Minho Ryu 29 Nov 30, 2022
Automatically Build Multiple ML Models with a Single Line of Code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.

Auto-ViML Automatically Build Variant Interpretable ML models fast! Auto_ViML is pronounced "auto vimal" (autovimal logo created by Sanket Ghanmare) N

AutoViz and Auto_ViML 397 Dec 30, 2022
This repo is to be freely used by ML devs to check the GAN performances without coding from scratch.

GANs for Fun Created because I can! GOAL The goal of this repo is to be freely used by ML devs to check the GAN performances without coding from scrat

Sagnik Roy 13 Jan 26, 2022
A distributed, plug-n-play algorithm for multi-robot applications with a priori non-computable objective functions

A distributed, plug-n-play algorithm for multi-robot applications with a priori non-computable objective functions Kapoutsis, A.C., Chatzichristofis,

Athanasios Ch. Kapoutsis 5 Oct 15, 2022
Mmdet benchmark with python

mmdet_benchmark 本项目是为了研究 mmdet 推断性能瓶颈,并且对其进行优化。 配置与环境 机器配置 CPU:Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz GPU:NVIDIA GeForce RTX 3080 10GB 内存:64G 硬盘:1T

杨培文 (Yang Peiwen) 24 May 21, 2022
Deep Learning Models for Causal Inference

Extensive tutorials for learning how to build deep learning models for causal inference using selection on observables in Tensorflow 2.

Bernard J Koch 151 Dec 31, 2022
HackBMU-5.0-Team-Ctrl-Alt-Elite - HackBMU 5.0 Team Ctrl Alt Elite

HackBMU-5.0-Team-Ctrl-Alt-Elite The search is over. We present to you ‘Health-A-

3 Feb 19, 2022
MoCoPnet - Deformable 3D Convolution for Video Super-Resolution

MoCoPnet: Exploring Local Motion and Contrast Priors for Infrared Small Target Super-Resolution Pytorch implementation of local motion and contrast pr

Xinyi Ying 28 Dec 15, 2022
Run Effective Large Batch Contrastive Learning on Limited Memory GPU

Gradient Cache Gradient Cache is a simple technique for unlimitedly scaling contrastive learning batch far beyond GPU memory constraint. This means tr

Luyu Gao 198 Dec 29, 2022
Repository for code and dataset for our EMNLP 2021 paper - “So You Think You’re Funny?”: Rating the Humour Quotient in Standup Comedy.

AI-OpenMic Dataset The dataset is available for download via the follwing link. Repository for code and dataset for our EMNLP 2021 paper - “So You Thi

6 Oct 26, 2022