Example repository for custom C++/CUDA operators for TorchScript

Overview

Custom TorchScript Operators Example

This repository contains examples for writing, compiling and using custom TorchScript operators. See here for the accompanying tutorial.

Contents

There a few monuments in this repository you can visit. They are described in context in the tutorial, which you are encouraged to read. These monuments are:

  • example_app/warp_perspective/op.cpp: The custom operator implementation,
  • example_app/main.cpp: An example application that loads and executes a serialized TorchScript model, which uses the custom operator, in C++,
  • script.py: Example of using the custom operator in a scripted model,
  • trace.py: Example of using the custom operator in a traced model,
  • eager.py: Example of using the custom operator in vanilla eager PyTorch,
  • load.py: Example of using torch.utils.cpp_extension.load to build the custom operator,
  • load.py: Example of using torch.utils.cpp_extension.load_inline to build the custom operator,
  • setup.py: Example of using setuptools to build the custom operator,
  • test_setup.py: Example of using the custom operator built using setup.py.

To execute the C++ application, first run script.py to serialize a TorchScript model to a file called example.pt, then pass that file to the example_app/build/example_app binary.

Setup

For the smoothest experience when trying out these examples, we recommend building a docker container from this repository's Dockerfile. This will give you a clean, isolated Ubuntu Linux environment in which we guarantee everything to work perfectly. These steps should get you started:

$ git clone https://github.com/pytorch/extension-script

$ cd extension-script

$ docker build -t extension-script .

$ docker run -v $PWD:/home -it extension-script

$ [email protected]:/home# source /activate # Activate the Conda environment

$ cd example_app && mkdir build && cd build

$ cmake -DCMAKE_PREFIX_PATH=/libtorch ..
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found torch: /libtorch/lib/libtorch.so
-- Configuring done
-- Generating done
-- Build files have been written to: /home/example_app/build

$ make -j
Scanning dependencies of target warp_perspective
[ 25%] Building CXX object warp_perspective/CMakeFiles/warp_perspective.dir/op.cpp.o
[ 50%] Linking CXX shared library libwarp_perspective.so
[ 50%] Built target warp_perspective
Scanning dependencies of target example_app
[ 75%] Building CXX object CMakeFiles/example_app.dir/main.cpp.o
[100%] Linking CXX executable example_app
[100%] Built target example_app

This will create a shared library under /home/example_app/build/warp_perspective/libwarp_perspective.so containing the custom operator defined in example_app/warp_perspective/op.cpp. Then, you can run the examples, e.g.:

(base) [email protected]:/home# python script.py
graph(%x.1 : Dynamic
      %y : Dynamic) {
  %20 : int = prim::Constant[value=1]()
  %16 : int[] = prim::Constant[value=[0, -1]]()
  %14 : int = prim::Constant[value=6]()
  %2 : int = prim::Constant[value=0]()
  %7 : int = prim::Constant[value=42]()
  %z.1 : int = prim::Constant[value=5]()
  %z.2 : int = prim::Constant[value=10]()
  %13 : int = prim::Constant[value=3]()
  %4 : Dynamic = aten::select(%x.1, %2, %2)
  %6 : Dynamic = aten::select(%4, %2, %2)
  %8 : Dynamic = aten::eq(%6, %7)
  %9 : bool = prim::TensorToBool(%8)
  %z : int = prim::If(%9)
    block0() {
      -> (%z.1)
    }
    block1() {
      -> (%z.2)
    }
  %17 : Dynamic = aten::eye(%13, %14, %2, %16)
  %x : Dynamic = my_ops::warp_perspective(%x.1, %17)
  %19 : Dynamic = aten::matmul(%x, %y)
  %21 : Dynamic = aten::add(%19, %z, %20)
  return (%21);
}

tensor([[11.6196, 12.0056, 11.6122, 12.9298,  7.0649],
        [ 8.5063,  9.0621,  9.9925,  6.3741,  8.9668],
        [12.5898,  6.5872,  8.1511, 10.0806, 11.9829],
        [ 4.9142, 11.6614, 15.7161, 17.0538, 11.7243],
        [10.0000, 10.0000, 10.0000, 10.0000, 10.0000],
        [10.0000, 10.0000, 10.0000, 10.0000, 10.0000],
        [10.0000, 10.0000, 10.0000, 10.0000, 10.0000],
        [10.0000, 10.0000, 10.0000, 10.0000, 10.0000]])
A simple, fast, and efficient object detector without FPN

You Only Look One-level Feature (YOLOF), CVPR2021 A simple, fast, and efficient object detector without FPN. This repo provides an implementation for

789 Jan 09, 2023
PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

PyTorch implementation of Conformer: Convolution-augmented Transformer for Speech Recognition. Transformer models are good at capturing content-based

Soohwan Kim 565 Jan 04, 2023
Python PID Tuner - Makes a model of the System from a Process Reaction Curve and calculates PID Gains

PythonPID_Tuner_SOPDT Step 1: Takes a Process Reaction Curve in csv format - assumes data at 100ms interval (column names CV and PV) Step 2: Makes a r

1 Jan 18, 2022
Pneumonia Detection using machine learning - with PyTorch

Pneumonia Detection Pneumonia Detection using machine learning. Training was done in colab: DEMO: Result (Confusion Matrix): Data I uploaded my datase

Wilhelm Berghammer 12 Jul 07, 2022
Object detection on multiple datasets with an automatically learned unified label space.

Simple multi-dataset detection An object detector trained on multiple large-scale datasets with a unified label space; Winning solution of E

Xingyi Zhou 407 Dec 30, 2022
The Python ensemble sampling toolkit for affine-invariant MCMC

emcee The Python ensemble sampling toolkit for affine-invariant MCMC emcee is a stable, well tested Python implementation of the affine-invariant ense

Dan Foreman-Mackey 1.3k Dec 31, 2022
This repository lets you interact with Lean through a REPL.

lean-gym This repository lets you interact with Lean through a REPL. See Formal Mathematics Statement Curriculum Learning for a presentation of lean-g

OpenAI 87 Dec 28, 2022
A simple code to convert image format and channel as well as resizing and renaming multiple images.

Rename-Resize-and-convert-multiple-images A simple code to convert image format and channel as well as resizing and renaming multiple images. This cod

Happy N. Monday 3 Feb 15, 2022
A fast model to compute optical flow between two input images.

DCVNet: Dilated Cost Volumes for Fast Optical Flow This repository contains our implementation of the paper: @InProceedings{jiang2021dcvnet, title={

Huaizu Jiang 8 Sep 27, 2021
The CLRS Algorithmic Reasoning Benchmark

Learning representations of algorithms is an emerging area of machine learning, seeking to bridge concepts from neural networks with classical algorithms.

DeepMind 251 Jan 05, 2023
A pytorch-based deep learning framework for multi-modal 2D/3D medical image segmentation

A 3D multi-modal medical image segmentation library in PyTorch We strongly believe in open and reproducible deep learning research. Our goal is to imp

Adaloglou Nikolas 1.2k Dec 27, 2022
A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities

MPT A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. Implementation for our AAAI 2022 paper: Multi-

yidiLi 4 May 08, 2022
Implementation of SegNet: A Deep Convolutional Encoder-Decoder Architecture for Semantic Pixel-Wise Labelling

Caffe SegNet This is a modified version of Caffe which supports the SegNet architecture As described in SegNet: A Deep Convolutional Encoder-Decoder A

Alex Kendall 1.1k Jan 02, 2023
The missing CMake project initializer

cmake-init - The missing CMake project initializer Opinionated CMake project initializer to generate CMake projects that are FetchContent ready, separ

1k Jan 01, 2023
The official PyTorch implementation for NCSNv2 (NeurIPS 2020)

Improved Techniques for Training Score-Based Generative Models This repo contains the official implementation for the paper Improved Techniques for Tr

174 Dec 26, 2022
Implementation of self-attention mechanisms for general purpose. Focused on computer vision modules. Ongoing repository.

Self-attention building blocks for computer vision applications in PyTorch Implementation of self attention mechanisms for computer vision in PyTorch

AI Summer 962 Dec 23, 2022
Advanced Signal Processing Notebooks and Tutorials

Advanced Digital Signal Processing Notebooks and Tutorials Prof. Dr. -Ing. Gerald Schuller Jupyter Notebooks and Videos: Renato Profeta Applied Media

Guitars.AI 115 Dec 13, 2022
This library provides an abstraction to perform Model Versioning using Weight & Biases.

Description This library provides an abstraction to perform Model Versioning using Weight & Biases. Features Version a new trained model Promote a mod

Hector Lopez Almazan 2 Jan 28, 2022
An open-source, low-cost, image-based weed detection device for fallow scenarios.

Welcome to the OpenWeedLocator (OWL) project, an opensource hardware and software green-on-brown weed detector that uses entirely off-the-shelf compon

Guy Coleman 145 Jan 05, 2023
Training DALL-E with volunteers from all over the Internet using hivemind and dalle-pytorch (NeurIPS 2021 demo)

Training DALL-E with volunteers from all over the Internet This repository is a part of the NeurIPS 2021 demonstration "Training Transformers Together

<a href=[email protected]"> 19 Dec 13, 2022