[CVPR 2021] "Multimodal Motion Prediction with Stacked Transformers": official code implementation and project page.

Overview

mmTransformer

Introduction

  • This repo is official implementation for mmTransformer in pytorch. Currently, the core code of mmTransformer is implemented in the commercial project, we provide inference code of model with six trajectory propopals for your reference.

  • For other information, please refer to our paper Multimodal Motion Prediction with Stacked Transformers. (CVPR 2021) [Paper] [Webpage]

img

Set up your virtual environment

  • Initialize virtual environment:

    conda create -n mmTrans python=3.7
    
  • Install agoverse api. Please refer to this page.

  • Install the pytorch. The latest codes are tested on Ubuntu 16.04, CUDA11.1, PyTorch 1.8 and Python 3.7: (Note that we require the version of torch >= 1.5.0 for testing with pretrained model)

    pip install torch==1.8.0+cu111\
          torchvision==0.9.0+cu111\
          torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
    
  • For other requirement, please install with following command:

    pip install -r requirement.txt
    

Preparation

Download the code, model and data

  1. Clone this repo from the GitHub.

     git clone https://github.com/decisionforce/mmTransformer.git
    
  2. Download the pretrained model and data [here] (map.pkl for Python 3.7 is available [here]) and save it to ./models and ./interm_data.

     cd mmTransformer
     mkdir models
     mkdir interm_data
    
  3. Finally, your directory structure should look something like this:

     mmTransformer
     └── models
         └── demo.pt
     └── interm_data
         └── argoverse_info_val.pkl
         └── map.pkl
    

Preprocess the dataset

Alternatively, you can process the data from scratch using following commands.

  1. Download Argoverse dataset and create a symbolic link to ./data folder or use following commands.

     cd path/to/mmtransformer/root
     mkdir data
     cd data
     wget https://s3.amazonaws.com/argoai-argoverse/forecasting_val_v1.1.tar.gz 
     tar -zxvf  forecasting_val_v1.1.tar.gz
    
  2. Then extract the agent and map information from raw data via Argoverse API:

     python -m lib.dataset.argoverse_convertor ./config/demo.py
    
  3. Finally, your directory structure should look something like above illustrated.

Format of processed data in ‘argoverse_info_val.pkl’:

img

Format of map information in ‘map.pkl’:

img

Run the mmTransformer

For testing:

python Evaluation.py ./config/demo.py --model-name demo

Results

Here we showcase the expected results on validation set:

Model Expected results Results in paper
minADE 0.709 0.713
minFDE 1.081 1.153
MR (K=6) 10.2 10.6

TODO

  • We are going to open source our visualization tools and a demo result. (TBD)

Contact us

If you have any issues with the code, please contact to this email: [email protected]

Citation

If you find our work useful for your research, please consider citing the paper

@article{liu2021multimodal,
  title={Multimodal Motion Prediction with Stacked Transformers},
  author={Liu, Yicheng and Zhang, Jinghuai and Fang, Liangji and Jiang, Qinhong and Zhou, Bolei},
  journal={Computer Vision and Pattern Recognition},
  year={2021}
}
Owner
DeciForce: Crossroads of Machine Perception and Autonomy
Research on Unifying Machine Perception and Autonomy in Zhou Group
DeciForce: Crossroads of Machine Perception and Autonomy
Jupyter notebooks showing best practices for using cx_Oracle, the Python DB API for Oracle Database

Python cx_Oracle Notebooks, 2022 The repository contains Jupyter notebooks showing best practices for using cx_Oracle, the Python DB API for Oracle Da

Christopher Jones 13 Dec 15, 2022
Use tensorflow to implement a Deep Neural Network for real time lane detection

LaneNet-Lane-Detection Use tensorflow to implement a Deep Neural Network for real time lane detection mainly based on the IEEE IV conference paper "To

MaybeShewill-CV 1.9k Jan 08, 2023
Official code for "Stereo Waterdrop Removal with Row-wise Dilated Attention (IROS2021)"

Stereo-Waterdrop-Removal-with-Row-wise-Dilated-Attention This repository includes official codes for "Stereo Waterdrop Removal with Row-wise Dilated A

29 Oct 01, 2022
Occlusion robust 3D face reconstruction model in CFR-GAN (WACV 2022)

Occlusion Robust 3D face Reconstruction Yeong-Joon Ju, Gun-Hee Lee, Jung-Ho Hong, and Seong-Whan Lee Code for Occlusion Robust 3D Face Reconstruction

Yeongjoon 31 Dec 19, 2022
Functional TensorFlow Implementation of Singular Value Decomposition for paper Fast Graph Learning

tf-fsvd TensorFlow Implementation of Functional Singular Value Decomposition for paper Fast Graph Learning with Unique Optimal Solutions Cite If you f

Sami Abu-El-Haija 14 Nov 25, 2021
SpiroMask: Measuring Lung Function Using Consumer-Grade Masks

SpiroMask: Measuring Lung Function Using Consumer-Grade Masks Anonymised repository for paper submitted for peer review at ACM HEALTH (October 2021).

0 May 10, 2022
Fine-tuning StyleGAN2 for Cartoon Face Generation

Cartoon-StyleGAN 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation Abstract Recent studies have shown remarkable success in the unsupervised imag

Jihye Back 520 Jan 04, 2023
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR)

Ilya Kostrikov 3k Dec 31, 2022
Facial recognition project

Facial recognition project documentation Project introduction This project is developed by linuxu. It is a face model recognition project developed ba

Jefferson 2 Dec 04, 2022
Pansharpening by convolutional neural networks in the full resolution framework

Z-PNN: Zoom Pansharpening Neural Network Pansharpening by convolutional neural networks in the full resolution framework is a deep learning method for

20 Nov 24, 2022
Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation'

OD-Rec Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation' Paper, saved teacher models and Andro

Xin Xia 11 Nov 22, 2022
Social Network Ads Prediction

Social network advertising, also social media targeting, is a group of terms that are used to describe forms of online advertising that focus on social networking services.

Khazar 2 Jan 28, 2022
Keras Implementation of The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation by (Simon Jégou, Michal Drozdzal, David Vazquez, Adriana Romero, Yoshua Bengio)

The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation: Work In Progress, Results can't be replicated yet with the m

Yad Konrad 196 Aug 30, 2022
Code for the paper "Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds" (ICCV 2021)

Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds This is the official code implementation for the paper "Spatio-temporal Se

Hesper 63 Jan 05, 2023
BASH - Biomechanical Animated Skinned Human

We developed a method animating a statistical 3D human model for biomechanical analysis to increase accessibility for non-experts, like patients, athletes, or designers.

Machine Learning and Data Analytics Lab FAU 66 Nov 19, 2022
Codebase for BMVC 2021 paper "Text Based Person Search with Limited Data"

Text Based Person Search with Limited Data This is the codebase for our BMVC 2021 paper. Please bear with me refactoring this codebase after CVPR dead

Xiao Han 33 Nov 24, 2022
A Python package for time series augmentation

tsaug tsaug is a Python package for time series augmentation. It offers a set of augmentation methods for time series, as well as a simple API to conn

Arundo Analytics 278 Jan 01, 2023
An Straight Dilated Network with Wavelet for image Deblurring

SDWNet: A Straight Dilated Network with Wavelet Transformation for Image Deblurring(offical) 1. Introduction This repo is not only used for our paper(

FlyEgle 41 Jan 04, 2023
The code from the paper Character Transformations for Non-Autoregressive GEC Tagging

Character Transformations for Non-Autoregressive GEC Tagging Milan Straka, Jakub Náplava, Jana Straková Charles University Faculty of Mathematics and

ÚFAL 5 Dec 10, 2022
CTRL-C: Camera calibration TRansformer with Line-Classification

CTRL-C: Camera calibration TRansformer with Line-Classification This repository contains the official code and pretrained models for CTRL-C (Camera ca

57 Nov 14, 2022