PyTorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision.

Last update: Sep 14, 2022


@misc{CV2018,
  author =       {Donny You ([email protected])},
  howpublished = {\url{https://github.com/donnyyou/PyTorchCV}},
  year =         {2018}
}

This repository provides source code for several deep-learning-based computer vision tasks. We do our best to keep it up to date. If you find a problem with the repository, please open an issue and we will fix it immediately.

Implemented Papers

Image Classification
- VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
- ResNet: Deep Residual Learning for Image Recognition
- DenseNet: Densely Connected Convolutional Networks
- ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
- ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Semantic Segmentation
- DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation
- PSPNet: Pyramid Scene Parsing Network
- DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes
Object Detection
- SSD: Single Shot MultiBox Detector
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- YOLOv3: An Incremental Improvement
- FPN: Feature Pyramid Networks for Object Detection
Pose Estimation
- CPM: Convolutional Pose Machines
- OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Instance Segmentation
- Mask R-CNN

Performances with PyTorchCV

Image Classification

ResNet: Deep Residual Learning for Image Recognition

Semantic Segmentation

PSPNet: Pyramid Scene Parsing Network

Model	Backbone	Training data	Testing data	mIoU (%)	Pixel Acc (%)	Setting
PSPNet Origin	3x3-ResNet101	ADE20K train	ADE20K val	41.96	80.64	-
PSPNet Ours	7x7-ResNet101	ADE20K train	ADE20K val	44.18	80.91	PSPNet
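
For readers unfamiliar with the two segmentation metrics in the table above, the following is a minimal sketch of how mean IoU and pixel accuracy are commonly computed from a confusion matrix. It is not taken from this repository's evaluation code; the function names are hypothetical, and details such as ignored labels or multi-scale testing are left out. The table reports both metrics as percentages, so the example scales the results by 100.

```python
def confusion_matrix(pred, target, num_classes):
    """Count (target, pred) label pairs over two flat label sequences."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1  # row = ground truth, column = prediction
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted label equals the ground truth."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Average per-class intersection-over-union, skipping absent classes."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        gt = sum(matrix[c])                              # pixels labelled c
        predicted = sum(matrix[r][c] for r in range(n))  # pixels predicted c
        union = gt + predicted - tp
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes (made-up labels, not real results).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
m = confusion_matrix(pred, target, num_classes=3)
print(f"pixel acc: {100 * pixel_accuracy(m):.2f}")  # prints "pixel acc: 66.67"
print(f"mIoU: {100 * mean_iou(m):.2f}")             # prints "mIoU: 50.00"
```

In the toy example, four of six pixels are correct, and the per-class IoUs are 1/3, 2/3 and 1/2, which average to 0.5.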

Object Detection

SSD: Single Shot MultiBox Detector

Model	Backbone	Training data	Testing data	mAP	FPS	Setting
SSD-300 Origin	VGG16	VOC07+12 trainval	VOC07 test	0.772	-	-
SSD-300 Ours	VGG16	VOC07+12 trainval	VOC07 test	0.786	-	SSD300
SSD-512 Origin	VGG16	VOC07+12 trainval	VOC07 test	0.798	-	-
SSD-512 Ours	VGG16	VOC07+12 trainval	VOC07 test	0.808	-	SSD512

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Model	Backbone	Training data	Testing data	mAP	FPS	Setting
Faster R-CNN Origin	VGG16	VOC07 trainval	VOC07 test	0.699	-	-
Faster R-CNN Ours	VGG16	VOC07 trainval	VOC07 test	0.706	-	Faster R-CNN

YOLOv3: An Incremental Improvement

Pose Estimation

OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

Instance Segmentation

Mask R-CNN

Commands with PyTorchCV

Take PSPNet as an example. ("tag" can be any string, including an empty one.)

Training

cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag

Resume Training

cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag

Validate

cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh val tag

Testing

cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh test tag
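
If you prefer to launch these commands from Python (for example, to loop over several tags), a small wrapper along the following lines may help. This is a sketch under stated assumptions: the script directory, script name, and the `train`/`val`/`test` modes are copied from the commands above, while `build_command` and `run` are hypothetical helper names that are not part of PyTorchCV.

```python
import subprocess
from pathlib import Path

# Copied from the commands above; everything else here is illustrative.
SCRIPT_DIR = Path("scripts/seg/cityscapes")
SCRIPT_NAME = "run_fs_pspnet_cityscapes_seg.sh"
MODES = ("train", "val", "test")


def build_command(mode, tag=""):
    """Return the argv list for one of the documented script modes."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")
    # An empty tag is passed through as an empty argument.
    return ["bash", SCRIPT_NAME, mode, tag]


def run(mode, tag="", repo_root="."):
    """Run the script from its own directory, mirroring the `cd` step."""
    workdir = Path(repo_root) / SCRIPT_DIR
    # check=True raises CalledProcessError if the script exits non-zero.
    return subprocess.run(build_command(mode, tag), cwd=workdir, check=True)
```

With this sketch, `run("val", "tag")` would correspond to the Validate command, assuming it is called from the repository root.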

Examples with PyTorchCV

Example output of VGG19-OpenPose


Owner

Donny You
