DeepMoCap: Deep Optical Motion Capture using multiple Depth Sensors and Retro-reflectors

Overview

DeepMoCap: Deep Optical Motion Capture using multiple Depth Sensors and Retro-reflectors

By Anargyros Chatzitofis, Dimitris Zarpalas, Stefanos Kollias, Petros Daras.

Introduction

DeepMoCap constitutes a low-cost, marker-based optical motion capture method that consumes multiple spatio-temporally aligned infrared-depth sensor streams using retro-reflective straps and patches (reflectors).

DeepMoCap explores motion capture by automatically localizing and labeling reflectors on depth images and, subsequently, on 3D space. Introducing a non-parametric representation to encode the temporal correlation among pairs of colorized depthmaps and 3D optical flow frames, a multi-stage Fully Convolutional Network (FCN) architecture is proposed to jointly learn reflector locations and their temporal dependency among sequential frames. The extracted reflector 2D locations are spatially mapped in 3D space, resulting in robust optical data extraction. To this end, the subject's motion is efficiently captured by applying a template-based fitting technique.

Teaser?

Teaser?

This project is licensed under the terms of the license.

Contents

  1. Testing
  2. Datasets
  3. Citation

Testing

For testing the FCN model, please visit "testing/" enabling the 3D optical data extraction from colorized depth and 3D optical flow input. The data should be appropriately formed and the DeepMoCap FCN model should be placed to "testing/model/keras".

The proposed FCN is evaluated on the DMC2.5D dataset measuring mean Average Precision (mAP) for the entire set, based on Percentage of Correct Keypoints (PCK) thresholds (a = 0.05). The proposed method outperforms the competitive methods as shown in the table below.

Method Total Total (without end-reflectors)
CPM 92.16% 95.27%
CPM+PAFs 92.79% 95.61%
CPM+PAFs + 3D OF 92.84% 95.67%
Proposed 93.73% 96.77%

Logo

Supplementaty material (video)

Teaser?

Datasets

Two datasets have been created and made publicly available for evaluation purposes; one comprising multi-view depth and 3D optical flow annotated images (DMC2.5D), and a second, consisting of spatio-temporally aligned multi-view depth images along with skeleton, inertial and ground truth MoCap data (DMC3D).

DMC2.5D

The DMC2.5D Dataset was captured in order to train and test the DeepMoCap FCN. It comprises pairs per view of:

The samples were randomly selected from 8 subjects. More specifically, 25K single-view pair samples were annotated with over 300K total keypoints (i.e., reflector 2D locations of current and previous frames on the image), trying to cover a variety of poses and movements in the scene. 20K, 3K and 2K samples were used for training, validation and testing the FCN model, respectively. The annotation was semi-automatically realized by applying image processing and 3D vision techniques, while the dataset was manually refined using the 2D-reflectorset-annotator.

Teaser?

To get the DMC2.5D dataset, please contact the owner of the repository via github or email ([email protected]).

DMC3D

Teaser?

The DMC3D dataset consists of multi-view depth and skeleton data as well as inertial and ground truth motion capture data. Specifically, 3 Kinect for Xbox One sensors were used to capture the IR-D and Kinect skeleton data along with 9 XSens MT inertial measurement units (IMU) to enable the comparison between the proposed method and inertial MoCap approaches. Further, a PhaseSpace Impulse X2 solution was used to capture ground truth MoCap data. The preparation of the DMC3D dataset required the spatio-temporal alignment of the modalities (Kinect, PhaseSpace, XSens MTs). The setup used for the Kinect recordings provides spatio-temporally aligned IR-D and skeleton frames.

Exercise # of repetitions # of frames Type
Walking on the spot 10-20 200-300 Free
Single arm raise 10-20 300-500 Bilateral
Elbow flexion 10-20 300-500 Bilateral
Knee flexion 10-20 300-500 Bilateral
Closing arms above head 6-12 200-300 Free
Side steps 6-12 300-500 Bilateral
Jumping jack 6-12 200-300 Free
Butt kicks left-right 6-12 300-500 Bilateral
Forward lunge left-right 4-10 300-500 Bilateral
Classic squat 6-12 200-300 Free
Side step + knee-elbow 6-12 300-500 Bilateral
Side reaches 6-12 300-500 Bilateral
Side jumps 6-12 300-500 Bilateral
Alternate side reaches 6-12 300-500 Bilateral
Kick-box kicking 2-6 200-300 Free

The annotation tool for the spatio-temporally alignment of the 3D data will be publicly available soon.

To get the DMC3D dataset, please contact the owner of the repository via github or email ([email protected]).

Citation

This paper has been published in MDPI Sensors, Depth Sensors and 3D Vision Special Issue [PDF]

Please cite the paper in your publications if it helps your research:


@article{chatzitofis2019deepmocap,
  title={DeepMoCap: Deep Optical Motion Capture Using Multiple Depth Sensors and Retro-Reflectors},
  author={Chatzitofis, Anargyros and Zarpalas, Dimitrios and Kollias, Stefanos and Daras, Petros},
  journal={Sensors},
  volume={19},
  number={2},
  pages={282},
  year={2019},
  publisher={Multidisciplinary Digital Publishing Institute}
}
Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU

Cross-modal Retrieval using Transformer Encoder Reasoning Networks This project reimplements the idea from "Transformer Reasoning Network for Image-Te

Minh-Khoi Pham 5 Nov 05, 2022
Indices Matter: Learning to Index for Deep Image Matting

IndexNet Matting This repository includes the official implementation of IndexNet Matting for deep image matting, presented in our paper: Indices Matt

Hao Lu 357 Nov 26, 2022
Unsupervised captioning - Code for Unsupervised Image Captioning

Unsupervised Image Captioning by Yang Feng, Lin Ma, Wei Liu, and Jiebo Luo Introduction Most image captioning models are trained using paired image-se

Yang Feng 207 Dec 24, 2022
Add-on for importing and auto setup of character creator 3 character exports.

CC3 Blender Tools An add-on for importing and automatically setting up materials for Character Creator 3 character exports. Using Blender in the Chara

260 Jan 05, 2023
External Attention Network

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks paper : https://arxiv.org/abs/2105.02358 EAMLP will come soon Jitto

MenghaoGuo 357 Dec 11, 2022
Kaggle | 9th place single model solution for TGS Salt Identification Challenge

UNet for segmenting salt deposits from seismic images with PyTorch. General We, tugstugi and xuyuan, have participated in the Kaggle competition TGS S

Erdene-Ochir Tuguldur 276 Dec 20, 2022
🎁 3,000,000+ Unsplash images made available for research and machine learning

The Unsplash Dataset The Unsplash Dataset is made up of over 250,000+ contributing global photographers and data sourced from hundreds of millions of

Unsplash 2k Jan 03, 2023
Interpolation-based reduced-order models

Interpolation-reduced-order-models Interpolation-based reduced-order models High-fidelity computational fluid dynamics (CFD) solutions are time consum

Donovan Blais 1 Jan 10, 2022
Lightweight, Python library for fast and reproducible experimentation :microscope:

Steppy What is Steppy? Steppy is a lightweight, open-source, Python 3 library for fast and reproducible experimentation. Steppy lets data scientist fo

minerva.ml 134 Jul 10, 2022
ObjectDetNet is an easy, flexible, open-source object detection framework

Getting started with the ObjectDetNet ObjectDetNet is an easy, flexible, open-source object detection framework which allows you to easily train, resu

5 Aug 25, 2020
Official code release for 3DV 2021 paper Human Performance Capture from Monocular Video in the Wild.

Official code release for 3DV 2021 paper Human Performance Capture from Monocular Video in the Wild.

Chen Guo 58 Dec 24, 2022
Learning from Synthetic Shadows for Shadow Detection and Removal [Inoue+, IEEE TCSVT 2020].

Learning from Synthetic Shadows for Shadow Detection and Removal (IEEE TCSVT 2020) Overview This repo is for the paper "Learning from Synthetic Shadow

Naoto Inoue 67 Dec 28, 2022
This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

Non-autoregressive Deep Learning-Based TTS Template This is a template for the Non-autoregressive TTS model. It contains Data Preprocessing Pipeline D

Keon Lee 13 Dec 05, 2022
Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Adaptive Class Suppression Loss for Long-Tail Object Detection This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppressio

CASIA-IVA-Lab 67 Dec 04, 2022
Vector AI — A platform for building vector based applications. Encode, query and analyse data using vectors.

Vector AI is a framework designed to make the process of building production grade vector based applications as quickly and easily as possible. Create

Vector AI 267 Dec 23, 2022
SemEval2022 Patronizing and Condescending Language (PCL) Detection

SemEval2022 Patronizing and Condescending Language (PCL) Detection This task is from SemEval 2022. What is Patronizing and Condescending Language (PCL

Daniel Saeedi 0 Aug 05, 2022
Some tentative models that incorporate label propagation to graph neural networks for graph representation learning in nodes, links or graphs.

Some tentative models that incorporate label propagation to graph neural networks for graph representation learning in nodes, links or graphs.

zshicode 1 Nov 18, 2021
PyTorch common framework to accelerate network implementation, training and validation

pytorch-framework PyTorch common framework to accelerate network implementation, training and validation. This framework is inspired by works from MML

Dongliang Cao 3 Dec 19, 2022
Our implementation used for the MICCAI 2021 FLARE Challenge titled 'Efficient Multi-Organ Segmentation Using SpatialConfiguartion-Net with Low GPU Memory Requirements'.

Efficient Multi-Organ Segmentation Using SpatialConfiguartion-Net with Low GPU Memory Requirements Our implementation used for the MICCAI 2021 FLARE C

Franz Thaler 3 Sep 27, 2022
Title: Graduate-Admissions-Predictor

The purpose of this project is create a predictive model capable of identifying the probability of a person securing an admit based on their personal profile parameters. Simplified visualisations hav

Akarsh Singh 1 Jan 26, 2022