YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks

Last update: Jan 01, 2023

Related tags

Deep Learning yoltv5

Overview

YOLTv5

YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.
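
One common way to handle imagery far larger than a detector's input size is to slice it into overlapping windows and run the detector on each window. The sketch below illustrates that idea only; the function name `tile_windows`, the 640-pixel tile size, and the 128-pixel overlap are assumptions chosen for illustration, not values taken from this repository.

```python
from typing import Iterator, Tuple

Window = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixel coordinates


def tile_windows(width: int, height: int,
                 tile: int = 640, overlap_px: int = 128) -> Iterator[Window]:
    """Yield overlapping windows that together cover a width x height image.

    Hypothetical helper: tile size and overlap are illustrative defaults.
    """
    if not 0 <= overlap_px < tile:
        raise ValueError("overlap_px must satisfy 0 <= overlap_px < tile")
    stride = tile - overlap_px

    def starts(extent: int) -> list:
        # An image smaller than one tile needs a single window at the origin.
        if extent <= tile:
            return [0]
        points = list(range(0, extent - tile, stride))
        # Add a final window flush with the far edge so nothing is missed.
        points.append(extent - tile)
        return points

    for y0 in starts(height):
        for x0 in starts(width):
            yield x0, y0, min(x0 + tile, width), min(y0 + tile, height)


if __name__ == "__main__":
    for window in tile_windows(1000, 700):
        print(window)
```

The overlap means an object cut by one window's edge can still appear whole in a neighboring window, at the cost of some duplicate detections that must be merged afterwards.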

YOLTv5 builds upon YOLT and SIMRDWN, updating these frameworks to use the YOLOv5 version of the YOLO object detection family. Its performance is generally similar to that of the Darknet-based YOLTv4 repository; YOLTv5 is provided for users who prefer a PyTorch backend.

Below, we provide examples of how to use this repository with the open-source SpaceNet dataset.

Running YOLTv5

0. Installation (Preliminary)

YOLTv5 is built to execute on a GPU-enabled machine.

cd yoltv5/yolov5
pip install -r requirements.txt

# update with geo packages
conda install -c conda-forge gdal
conda install -c conda-forge osmnx=0.12
conda install -c conda-forge scikit-image
conda install -c conda-forge statsmodels
pip install torchsummary
pip install utm
pip install numba
pip install jinja2==2.10
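
After installation, it can help to confirm that the packages are importable and that PyTorch can see a GPU. The following is a minimal sketch, not part of the repository; the helper names `missing_modules` and `gpu_available` are hypothetical, and the module list assumes the usual import names (for example, `gdal` is imported as `osgeo` and `scikit-image` as `skimage`).

```python
import importlib.util

# Import names (not package names) for the dependencies installed above.
# Assumption: torch is pulled in by yolov5/requirements.txt.
REQUIRED_MODULES = [
    "torch", "osgeo", "osmnx", "skimage", "statsmodels",
    "torchsummary", "utm", "numba", "jinja2",
]


def missing_modules(names):
    """Return the module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def gpu_available():
    """Return True/False from torch.cuda.is_available(), or None without torch."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    return torch.cuda.is_available()


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
    print("GPU available:", gpu_available())
```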

1. Train

Training preparation is accomplished via prep_train.py. To train a model, run:

cd /yoltv5
python yolov5/train.py --img 640 --batch 16 --epochs 100 --data yoltv5_train_vehicles_8cat.yaml --weights yolov5l.pt
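
The `--data` argument points to a YOLOv5 dataset file. The contents of `yoltv5_train_vehicles_8cat.yaml` are not shown here, so the sketch below only illustrates the general shape of such a file using the common `train`, `val`, `nc`, and `names` keys. The paths and class names are placeholders, the helper `dataset_yaml` is hypothetical, and the class count of 8 is inferred from the "8cat" in the filename.

```python
def dataset_yaml(train_dir, val_dir, class_names):
    """Build the text of a minimal YOLOv5-style dataset YAML file."""
    quoted = ", ".join(f"'{name}'" for name in class_names)
    lines = [
        f"train: {train_dir}",       # directory of training images
        f"val: {val_dir}",           # directory of validation images
        f"nc: {len(class_names)}",   # number of classes
        f"names: [{quoted}]",        # class names, indexed by class id
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values: substitute the real image locations and categories.
    placeholder_names = [f"class_{i}" for i in range(8)]
    print(dataset_yaml("/path/to/train/images",
                       "/path/to/val/images",
                       placeholder_names))
```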

2. Test

Simply edit yoltv5_test_vehicles_8cat.yaml to point to the appropriate locations, then run the test.sh script:

cd yoltv5
./test.sh ../configs/yoltv5_test_vehicles_8cat.yaml
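
When a large image is processed window by window, detections are typically shifted back into full-image coordinates and then de-duplicated where windows overlap. The sketch below shows one generic way to do that with class-agnostic non-maximum suppression. It is an assumption about the general approach, not a description of what `test.sh` does internally, and the names `to_global`, `iou`, and `nms` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # (x0, y0, x1, y1, score)


def to_global(box: Box, tile_x0: float, tile_y0: float) -> Box:
    """Shift a box from window coordinates to full-image coordinates."""
    x0, y0, x1, y1, score = box
    return (x0 + tile_x0, y0 + tile_y0, x1 + tile_x0, y1 + tile_y0, score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_thresh: float = 0.5) -> List[Box]:
    """Keep the highest-scoring box from each group of overlapping boxes."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, other) <= iou_thresh for other in kept):
            kept.append(box)
    return kept


if __name__ == "__main__":
    # The same object seen in two overlapping windows, plus a second object.
    detections = [
        to_global((500, 100, 540, 140, 0.9), 0, 0),
        to_global((141, 101, 181, 141, 0.8), 360, 0),
        to_global((300, 300, 340, 340, 0.7), 360, 0),
    ]
    for box in nms(detections):
        print(box)
```

A per-class variant would apply the same suppression separately to each category so that nearby objects of different classes do not suppress each other.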

Outputs will look something like the figure below:

Owner

Adam Van Etten
