Pytorch-YOLOv4

A minimal PyTorch implementation of YOLOv4.

Paper (YOLOv4): https://arxiv.org/abs/2004.10934
Source code: https://github.com/AlexeyAB/darknet
More details: http://pjreddie.com/darknet/yolo/

- Inference
- Train
    - Mosaic

├── README.md
├── dataset.py            dataset
├── demo.py               demo to run pytorch --> tool/darknet2pytorch
├── demo_darknet2onnx.py  tool to convert into onnx --> tool/darknet2pytorch
├── demo_pytorch2onnx.py  tool to convert into onnx
├── models.py             model for pytorch
├── train.py              train models.py
├── cfg.py                cfg.py for train
├── cfg                   cfg --> darknet2pytorch
├── data            
├── weight                --> darknet2pytorch
├── tool
│   ├── camera.py           a demo camera
│   ├── coco_annotation.py       coco dataset generator
│   ├── config.py
│   ├── darknet2pytorch.py
│   ├── region_loss.py
│   ├── utils.py
│   └── yolo_layer.py

0. Weights Download

0.1 darknet

- baidu: https://pan.baidu.com/s/1dAGEW8cm-dqK14TbhhVetA (extraction code: dm5b)
- google: https://drive.google.com/open?id=1cewMfusmPjYWbrnuJRuKhPMwRe_b9PaT

0.2 pytorch

You can use darknet2pytorch to convert the weights yourself, or download my converted model.

- baidu
    - yolov4.pth: https://pan.baidu.com/s/1ZroDvoGScDgtE1ja_QqJVw (extraction code: xrq9)
    - yolov4.conv.137.pth: https://pan.baidu.com/s/1ovBie4YyVQQoUrC3AY0joA (extraction code: kcel)
- google
    - yolov4.pth: https://drive.google.com/open?id=1wv_LiFeCRYwtpkqREPeI13-gPELBDwuJ
    - yolov4.conv.137.pth: https://drive.google.com/open?id=1fcbR0bWzYfIEdLJPzOsn4R5mlvR6IQyA

1. Train

Use YOLOv4 to train on your own data.

Download the weights

Transform the data

For the COCO dataset, you can use tool/coco_annotation.py. The resulting annotation file lists one image per line, followed by its boxes:

# train.txt
image_path1 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ...
image_path2 x1,y1,x2,y2,id x1,y1,x2,y2,id x1,y1,x2,y2,id ...
...
...
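
A minimal sketch of how one line in this format could be parsed, assuming that image paths contain no spaces and that coordinates are pixel values. The function name `parse_annotation_line` and the example path are hypothetical and not part of this repository:

```python
from typing import List, Tuple

# One box: x1, y1, x2, y2, class id (same order as in train.txt).
Box = Tuple[int, int, int, int, int]


def parse_annotation_line(line: str) -> Tuple[str, List[Box]]:
    """Split one train.txt line into the image path and its boxes."""
    parts = line.strip().split()
    if not parts:
        raise ValueError("empty annotation line")
    image_path, boxes = parts[0], []
    for field in parts[1:]:
        values = field.split(",")
        if len(values) != 5:
            raise ValueError(f"expected x1,y1,x2,y2,id but got {field!r}")
        # Assumption: coordinates are pixel values; int(float(...)) also
        # accepts values written as "48.0".
        x1, y1, x2, y2, class_id = (int(float(v)) for v in values)
        boxes.append((x1, y1, x2, y2, class_id))
    return image_path, boxes


# Hypothetical example line with two boxes.
example = "images/0001.jpg 48,240,195,371,11 8,12,352,498,14"
path, boxes = parse_annotation_line(example)
print(path, boxes)
```

Running such a check over every line before training can catch malformed entries early.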

Train

You can set the training parameters in cfg.py.

    python train.py -g [GPU_ID] -dir [dataset directory] ...

2. Inference

2.1 Performance on MS COCO dataset (using pretrained DarknetWeights from https://github.com/AlexeyAB/darknet)

The ONNX and TensorRT models are converted from Pytorch (TianXiaomo): Pytorch->ONNX->TensorRT. See the following sections for details on the conversions.

val2017 dataset (input size: 416x416)

| Model type | AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|
| DarkNet (YOLOv4 paper) | 0.471 | 0.710 | 0.510 | 0.278 | 0.525 | 0.636 |
| Pytorch (TianXiaomo) | 0.466 | 0.704 | 0.505 | 0.267 | 0.524 | 0.629 |
| TensorRT FP32 + BatchedNMSPlugin | 0.472 | 0.708 | 0.511 | 0.273 | 0.530 | 0.637 |
| TensorRT FP16 + BatchedNMSPlugin | 0.472 | 0.708 | 0.511 | 0.273 | 0.530 | 0.636 |

testdev2017 dataset (input size: 416x416)

| Model type | AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|
| DarkNet (YOLOv4 paper) | 0.412 | 0.628 | 0.443 | 0.204 | 0.444 | 0.560 |
| Pytorch (TianXiaomo) | 0.404 | 0.615 | 0.436 | 0.196 | 0.438 | 0.552 |
| TensorRT FP32 + BatchedNMSPlugin | 0.412 | 0.625 | 0.445 | 0.200 | 0.446 | 0.564 |
| TensorRT FP16 + BatchedNMSPlugin | 0.412 | 0.625 | 0.445 | 0.200 | 0.446 | 0.563 |

2.2 Image input size for inference

The image input size is NOT restricted to 320 * 320, 416 * 416, 512 * 512 and 608 * 608. You can also use a different aspect ratio, for example 320 * 608. A larger input size can help detect smaller targets, but inference may be slower and use more GPU memory. Height and width must each satisfy:

height = 320 + 96 * n, n in {0, 1, 2, 3, ...}
width  = 320 + 96 * m, m in {0, 1, 2, 3, ...}
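
The rule above can be checked with a few lines of Python. This is only an illustrative sketch; the helper names `is_valid_size` and `nearest_valid_size` are hypothetical and do not exist in this repository:

```python
BASE = 320  # smallest size allowed by the formula above
STEP = 96   # increment between allowed sizes


def is_valid_size(value: int) -> bool:
    """Return True if value equals 320 + 96 * n for some n >= 0."""
    return value >= BASE and (value - BASE) % STEP == 0


def nearest_valid_size(value: int) -> int:
    """Snap an arbitrary size to the closest allowed size (never below 320)."""
    n = max(0, round((value - BASE) / STEP))
    return BASE + STEP * n


# The four commonly used sizes all satisfy the formula.
print([is_valid_size(s) for s in (320, 416, 512, 608)])
# A size that does not fit can be snapped to an allowed one.
print(nearest_valid_size(500))
```

Height and width are checked independently, so a combination such as 320 * 608 passes as long as each side is valid on its own.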

2.3 Different inference options

Load the pretrained darknet model and darknet weights to do the inference (the image size is already configured in the cfg file):
```
python demo.py -cfgfile <cfgFile> -weightfile <weightFile> -imgfile <imgFile>
```

Load PyTorch weights (.pth file) to do the inference:

    python models.py <num_classes> <weightfile> <imgfile> <IN_IMAGE_H> <IN_IMAGE_W> <namefile(optional)>

- Load a converted ONNX file to do the inference (see sections 3 and 4)
- Load a converted TensorRT engine file to do the inference (see section 5)

2.4 Inference output

There are 2 inference outputs.

- The first holds the bounding box locations. Its shape is [batch, num_boxes, 1, 4], where the last dimension is x1, y1, x2, y2 of each bounding box.
- The second holds the bounding box scores. Its shape is [batch, num_boxes, num_classes], giving the score of every class for each bounding box.

For now, a small amount of post-processing, including NMS, is still required. We are trying to minimize the time and complexity of post-processing.
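
To illustrate the remaining post-processing, here is a minimal pure-Python sketch of confidence filtering followed by greedy per-class NMS over the two outputs described above. It is not the repository's implementation (the demos use the NMS code in tool/utils.py); the function names, the nested-list inputs and the threshold values 0.4 and 0.6 are assumptions chosen for illustration:

```python
from typing import List, Tuple

# x1, y1, x2, y2, score, class id
Detection = Tuple[float, float, float, float, float, int]


def iou(a, b) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def post_process(boxes, scores, conf_thresh=0.4, nms_thresh=0.6) -> List[List[Detection]]:
    """Filter by confidence, then apply greedy per-class NMS.

    boxes:  nested list shaped [batch, num_boxes, 1, 4]
    scores: nested list shaped [batch, num_boxes, num_classes]
    """
    results = []
    for img_boxes, img_scores in zip(boxes, scores):
        candidates = []
        for box, cls_scores in zip(img_boxes, img_scores):
            # Keep only the best class of each box (a simplifying assumption).
            class_id = max(range(len(cls_scores)), key=cls_scores.__getitem__)
            score = cls_scores[class_id]
            if score >= conf_thresh:
                candidates.append((*box[0], score, class_id))
        # Highest score first, then drop same-class boxes that overlap too much.
        candidates.sort(key=lambda d: d[4], reverse=True)
        kept = []
        for det in candidates:
            if all(det[5] != k[5] or iou(det, k) <= nms_thresh for k in kept):
                kept.append(det)
        results.append(kept)
    return results


# Toy batch of one image: three boxes, two classes (values are made up).
toy_boxes = [[[[0.10, 0.10, 0.50, 0.50]],
              [[0.12, 0.11, 0.52, 0.50]],
              [[0.60, 0.60, 0.90, 0.90]]]]
toy_scores = [[[0.90, 0.05], [0.80, 0.10], [0.10, 0.70]]]
detections = post_process(toy_boxes, toy_scores)
print(detections)
```

In the toy input the first two boxes overlap heavily and share a class, so only the higher-scoring one survives, together with the third box of the other class.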

3. Darknet2ONNX

This script converts the official pretrained darknet model into ONNX.

Recommended PyTorch versions:
- PyTorch 1.4.0 for TensorRT 7.0 and higher
- PyTorch 1.5.0 and 1.6.0 for TensorRT 7.1.2 and higher

Install onnxruntime:
```
pip install onnxruntime
```

Run the Python script to generate the ONNX model and run the demo:

python demo_darknet2onnx.py <cfgFile> <weightFile> <imageFile> <batchSize>

3.1 Dynamic or static batch size

A positive batch size generates an ONNX model with a static batch size; otherwise, the batch size is dynamic.
- A dynamic batch size generates only one ONNX model
- A static batch size generates 2 ONNX models, one of which is for running the demo (batch_size=1)
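
One way to picture this rule: with `torch.onnx.export`, a dynamic batch dimension is typically declared through the `dynamic_axes` argument. The helper below is a hypothetical sketch of that decision, not the script's actual code. The output names `boxes` and `confs` are assumptions, while `input` matches the tensor name used in the trtexec commands in section 5:

```python
from typing import Dict, Optional, Sequence


def dynamic_axes_for(batch_size: int,
                     input_name: str = "input",
                     output_names: Sequence[str] = ("boxes", "confs"),
                     ) -> Optional[Dict[str, Dict[int, str]]]:
    """Return a dynamic_axes mapping for a non-positive batch size, else None.

    A positive batch size means a static export, so no axis is marked dynamic.
    Otherwise axis 0 (the batch axis) of every tensor is marked dynamic.
    """
    if batch_size > 0:
        return None
    names = (input_name, *output_names)
    return {name: {0: "batch_size"} for name in names}


print(dynamic_axes_for(8))    # static export
print(dynamic_axes_for(-1))   # dynamic export
```

The returned value could be passed as `dynamic_axes=` to `torch.onnx.export`, together with matching `input_names` and `output_names`.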

4. Pytorch2ONNX

You can convert your trained PyTorch model into ONNX using this script.

Recommended PyTorch versions:
- PyTorch 1.4.0 for TensorRT 7.0 and higher
- PyTorch 1.5.0 and 1.6.0 for TensorRT 7.1.2 and higher

Install onnxruntime:
```
pip install onnxruntime
```

Run the Python script to generate the ONNX model and run the demo:

python demo_pytorch2onnx.py <weight_file> <image_path> <batch_size> <n_classes> <IN_IMAGE_H> <IN_IMAGE_W>

For example:

python demo_pytorch2onnx.py yolov4.pth dog.jpg 8 80 416 416

4.1 Dynamic or static batch size

A positive batch size generates an ONNX model with a static batch size; otherwise, the batch size is dynamic.
- A dynamic batch size generates only one ONNX model
- A static batch size generates 2 ONNX models, one of which is for running the demo (batch_size=1)

5. ONNX2TensorRT

Recommended TensorRT versions: 7.0, 7.1

5.1 Convert from ONNX of static Batch size

Run the following command to convert the YOLOv4 ONNX model into a TensorRT engine:
```
trtexec --onnx=<onnx_file> --explicitBatch --saveEngine=<tensorRT_engine_file> --workspace=<size_in_megabytes> --fp16
```
- Note: If you want to use int8 mode in conversion, extra int8 calibration is needed.

5.2 Convert from ONNX of dynamic Batch size

Run the following command to convert the YOLOv4 ONNX model into a TensorRT engine:

trtexec --onnx=<onnx_file> \
--minShapes=input:<shape_of_min_batch> --optShapes=input:<shape_of_opt_batch> --maxShapes=input:<shape_of_max_batch> \
--workspace=<size_in_megabytes> --saveEngine=<engine_file> --fp16

For example:

trtexec --onnx=yolov4_-1_3_320_512_dynamic.onnx \
--minShapes=input:1x3x320x512 --optShapes=input:4x3x320x512 --maxShapes=input:8x3x320x512 \
--workspace=2048 --saveEngine=yolov4_-1_3_320_512_dynamic.engine --fp16
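
When several engines have to be built, the command can be assembled programmatically. The sketch below only builds the argument list from the flags shown above; the function names and default values are hypothetical, and it assumes the ONNX input tensor is called `input`, as in the example:

```python
from typing import List


def shape(batch: int, height: int, width: int, channels: int = 3) -> str:
    """Format a shape the way the command above writes it, e.g. 1x3x320x512."""
    return f"{batch}x{channels}x{height}x{width}"


def trtexec_dynamic_command(onnx_file: str, engine_file: str,
                            height: int, width: int,
                            min_batch: int, opt_batch: int, max_batch: int,
                            workspace_mb: int = 2048,
                            input_name: str = "input",
                            fp16: bool = True) -> List[str]:
    """Build the trtexec argument list for an ONNX model with dynamic batch size."""
    if not 1 <= min_batch <= opt_batch <= max_batch:
        raise ValueError("expected 1 <= min_batch <= opt_batch <= max_batch")
    cmd = [
        "trtexec",
        f"--onnx={onnx_file}",
        f"--minShapes={input_name}:{shape(min_batch, height, width)}",
        f"--optShapes={input_name}:{shape(opt_batch, height, width)}",
        f"--maxShapes={input_name}:{shape(max_batch, height, width)}",
        f"--workspace={workspace_mb}",
        f"--saveEngine={engine_file}",
    ]
    if fp16:
        cmd.append("--fp16")
    return cmd


# Reproduces the example command above as a single line.
cmd = trtexec_dynamic_command(
    "yolov4_-1_3_320_512_dynamic.onnx",
    "yolov4_-1_3_320_512_dynamic.engine",
    height=320, width=512, min_batch=1, opt_batch=4, max_batch=8,
)
print(" ".join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` on a machine where trtexec is installed.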

5.3 Run the demo

python demo_trt.py <tensorRT_engine_file> <input_image> <input_H> <input_W>

This demo only works when the batch size is dynamic (1 must be within the dynamic range) or when batchSize=1, but you can adapt it with small changes for other dynamic or static batch sizes.

- Note 1: input_H and input_W must agree with the input size in the original ONNX file.
- Note 2: Extra NMS operations are needed for the TensorRT output. This demo uses the Python NMS code from tool/utils.py.

6. ONNX2Tensorflow

First, convert the model to ONNX (see sections 3 and 4).

Requires tensorflow >= 2.0.

1: Thanks to onnx-tensorflow: https://github.com/onnx/onnx-tensorflow

2: Run "git clone https://github.com/onnx/onnx-tensorflow.git && cd onnx-tensorflow", then run "pip install -e ."

Note: Installing with "pip install onnx-tf" produced errors, at least for me, so installing from source is recommended.

7. ONNX2TensorRT and DeepStream Inference

Compile the DeepStream Nvinfer Plugin

    cd DeepStream
    make

Build a TRT engine.

For single batch:

    trtexec --onnx=<onnx_file> --explicitBatch --saveEngine=<tensorRT_engine_file> --workspace=<size_in_megabytes> --fp16

For multi-batch:

    trtexec --onnx=<onnx_file> --explicitBatch --shapes=input:Xx3xHxW --optShapes=input:Xx3xHxW --maxShapes=input:Xx3xHxW --minShapes=input:1x3xHxW --saveEngine=<tensorRT_engine_file> --fp16

Note: maxShapes cannot be larger than the model's original shape.

Write the DeepStream config file for the TRT engine.

Reference:

@article{yolov4,
  title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
  author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
  journal = {arXiv},
  year={2020}
}

