Code release for Hu et al. Segmentation from Natural Language Expressions. in ECCV, 2016

Last update: May 24, 2022

Segmentation from Natural Language Expressions

This repository contains the code for the following paper:

R. Hu, M. Rohrbach, T. Darrell, Segmentation from Natural Language Expressions. in ECCV, 2016. (PDF)

@article{hu2016segmentation,
  title={Segmentation from Natural Language Expressions},
  author={Hu, Ronghang and Rohrbach, Marcus and Darrell, Trevor},
  journal={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2016}
}

Project Page: http://ronghanghu.com/text_objseg

Installation

Install Google TensorFlow (v1.0.0 or higher) following the official TensorFlow installation instructions.
Download this repository or clone with Git, and then cd into the root directory of the repository.
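Before moving on, it can help to confirm that the installed TensorFlow meets the v1.0.0 minimum stated above. The following is a minimal sketch and not part of this repository; the helper names `parse_version` and `meets_minimum` are hypothetical, and the only TensorFlow attribute used is `tf.__version__`.

```python
import re

# Minimum version stated in the installation step above.
MIN_TF_VERSION = (1, 0, 0)


def parse_version(version):
    """Return the leading numeric part of a version string as (major, minor, patch)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    # A missing patch component (e.g. "1.15") is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(version, minimum=MIN_TF_VERSION):
    """True if the version string is at least the given minimum tuple."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed")
    else:
        status = "ok" if meets_minimum(tf.__version__) else "too old"
        print(tf.__version__, status)
```

The check only compares version numbers; it does not verify that a given TensorFlow release is otherwise compatible with this code.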

Demo

Download the trained models:
exp-referit/tfmodel/download_trained_models.sh.
Run the language-based segmentation model demo in ./demo/text_objseg_demo.ipynb with Jupyter Notebook (IPython Notebook).

Training and evaluation on ReferIt Dataset

Download dataset and VGG network

Download ReferIt dataset:
exp-referit/referit-dataset/download_referit_dataset.sh.
Download VGG-16 network parameters trained on ImageNet 1000 classes:
models/convert_caffemodel/params/download_vgg_params.sh.

Training

You may need to add the repository root directory to Python's module path: export PYTHONPATH=.:$PYTHONPATH.
Build training batches for bounding boxes:
python exp-referit/build_training_batches_det.py.
Build training batches for segmentation:
python exp-referit/build_training_batches_seg.py.
Select the GPU you want to use during training:
export GPU_ID=<gpu id>. Use 0 for <gpu id> if you only have one GPU on your machine.
Train the language-based bounding box localization model:
python exp-referit/exp_train_referit_det.py $GPU_ID.
Train the low resolution language-based segmentation model (from the previous bounding box localization model):
python exp-referit/init_referit_seg_lowres_from_det.py && python exp-referit/exp_train_referit_seg_lowres.py $GPU_ID.
Train the high resolution language-based segmentation model (from the previous low resolution segmentation model):
python exp-referit/init_referit_seg_highres_from_lowres.py && python exp-referit/exp_train_referit_seg_highres.py $GPU_ID.
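The training steps above must run in order, since each model is initialized from the previous one. As a rough sketch (not part of this repository), the sequence could be driven from one Python script. The function names `build_commands` and `run_pipeline` are hypothetical; the script paths are the ones listed above. The sketch assumes it is started from the repository root with `PYTHONPATH` already set as described.

```python
import subprocess

# (script, takes_gpu_id) in the order given in the training steps above.
TRAINING_STEPS = [
    ("exp-referit/build_training_batches_det.py", False),
    ("exp-referit/build_training_batches_seg.py", False),
    ("exp-referit/exp_train_referit_det.py", True),
    ("exp-referit/init_referit_seg_lowres_from_det.py", False),
    ("exp-referit/exp_train_referit_seg_lowres.py", True),
    ("exp-referit/init_referit_seg_highres_from_lowres.py", False),
    ("exp-referit/exp_train_referit_seg_highres.py", True),
]


def build_commands(gpu_id, python="python"):
    """Return the training commands as argument lists, in execution order."""
    commands = []
    for script, takes_gpu_id in TRAINING_STEPS:
        command = [python, script]
        if takes_gpu_id:
            command.append(str(gpu_id))
        commands.append(command)
    return commands


def run_pipeline(gpu_id, repo_root="."):
    """Run every step in order, stopping at the first failure (like `&&`)."""
    for command in build_commands(gpu_id):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, cwd=repo_root, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in build_commands(0):
        print(" ".join(command))
```

Running the file as-is only prints the seven commands; calling `run_pipeline(0)` would execute them on GPU 0.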

Alternatively, you may skip the training procedure and download the trained models directly:
exp-referit/tfmodel/download_trained_models.sh.

Evaluation

Select the GPU you want to use during testing: export GPU_ID=<gpu id>. Use 0 for <gpu id> if you only have one GPU on your machine. Also, you may need to add the repository root directory to Python's module path: export PYTHONPATH=.:$PYTHONPATH.
Run evaluation for the high resolution language-based segmentation model:
python exp-referit/exp_test_referit_seg.py $GPU_ID
This should reproduce the results in the paper.
You may also evaluate the language-based bounding box localization model:
python exp-referit/exp_test_referit_det.py $GPU_ID
The results can be compared to this paper.
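The environment setup and the two evaluation commands above can also be assembled programmatically. This is a hedged sketch rather than code shipped with the repository: `with_repo_on_pythonpath` and `evaluation_commands` are hypothetical helper names, and the script paths are the ones given above.

```python
import os


def with_repo_on_pythonpath(env, repo_root="."):
    """Return a copy of env with repo_root prepended to PYTHONPATH."""
    new_env = dict(env)
    existing = new_env.get("PYTHONPATH", "")
    if existing:
        new_env["PYTHONPATH"] = repo_root + os.pathsep + existing
    else:
        # Avoid a trailing separator when PYTHONPATH was previously unset.
        new_env["PYTHONPATH"] = repo_root
    return new_env


def evaluation_commands(gpu_id, python="python"):
    """Return the segmentation and bounding box evaluation commands."""
    return [
        [python, "exp-referit/exp_test_referit_seg.py", str(gpu_id)],
        [python, "exp-referit/exp_test_referit_det.py", str(gpu_id)],
    ]


if __name__ == "__main__":
    env = with_repo_on_pythonpath(os.environ)
    print("PYTHONPATH=" + env["PYTHONPATH"])
    for command in evaluation_commands(0):
        print(" ".join(command))
```

The returned environment and commands could be passed to `subprocess.run(command, env=env, check=True)` from the repository root; the sketch itself only prints them.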

Owner

Ronghang Hu
