PyTorch reimplementation of minimal-hand (CVPR2020)

Overview

Minimal Hand Pytorch

Unofficial PyTorch reimplementation of minimal-hand (CVPR2020).

demo demo

you can also find in youtube or bilibili

This project reimplement following components :

  1. Training (DetNet) and Evaluation Code
  2. Shape Estimation
  3. Pose Estimation: Instead of IKNet in original paper, an analytical inverse kinematics method is used.

Offical project link: [minimal-hand]

Update

  • 2021/03/09 update about utils/LM.py, time cost drop from 12s/item to 1.57s/item

  • 2021/03/12 update about utils/LM.py, time cost drop from 1.57s/item to 0.27s/item

  • 2021/03/17 realtime perfomance is achieved when using PSO to estimate shape, coming soon

  • 2021/03/20 Add PSO to estimate shape. AUC is decreased by about 0.01 on STB and RHD datasets, and increased a little on EO and do datasets. Modifiy utlis/vis.py to improve realtime perfomance

  • 2021/03/24 Fixed some errors in calculating AUC. Update the 3D PCK AUC Diffenence.

Usage

  • Retrieve the code
git clone https://github.com/MengHao666/Minimal-Hand-pytorch
cd Minimal-Hand-pytorch
  • Create and activate the virtual environment with python dependencies
conda env create --file=environment.yml
conda activate minimal-hand-torch

Prepare MANO hand model

  1. Download MANO model from here and unzip it.

  2. Create an account by clicking Sign Up and provide your information

  3. Download Models and Code (the downloaded file should have the format mano_v*_*.zip). Note that all code and data from this download falls under the MANO license.

  4. unzip and copy the content of the models folder into the mano folder

  5. Your structure should look like this:

Minimal-Hand-pytorch/
   mano/
      models/
      webuser/

Download and Prepare datasets

Training dataset

Evaluation dataset

Processing

  • Create a data directory, extract all above datasets or additional materials in it

Now your data folder structure should like this:

data/

    CMU/
        hand143_panopticdb/
            datasets/
            ...
        hand_labels/
            datasets/
            ...

    RHD/
        RHD_published_v2/
            evaluation/
            training/
            view_sample.py
            ...

    GANeratedHands_Release/
        data/
        ...

    STB/
        images/
            B1Counting/
                SK_color_0.png
                SK_depth_0.png
                SK_depth_seg_0.png  <-- merged from STB_supp
                ...
            ...
        labels/
            B1Counting_BB.mat
            ...

    dexter+object/
        calibration/
        bbox_dexter+object.csv
        DO_pred_2d.npy
        data/
            Grasp1/
                annotations/
                    Grasp13D.txt
                    my_Grasp13D.txt
                    ...
                ...
            Grasp2/
                annotations/
                    Grasp23D.txt
                    my_Grasp23D.txt
                    ...
                ...
            Occlusion/
                annotations/
                    Occlusion3D.txt
                    my_Occlusion3D.txt
                    ...
                ...
            Pinch/
                annotations/
                    Pinch3D.txt
                    my_Pinch3D.txt
                    ...
                ...
            Rigid/
                annotations/
                    Rigid3D.txt
                    my_Rigid3D.txt
                    ...
                ...
            Rotate/
                                annotations/
                    Rotate3D.txt
                    my_Rotate3D.txt
                    ...
                ...
        

    EgoDexter/
        preview/
        data/
            Desk/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
            Fruits/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
            Kitchen/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
            Rotunda/
                annotation.txt_3D.txt
                my_annotation.txt_3D.txt
                ...
        

Note

  • All code and data from these download falls under their own licenses.
  • DO represents "dexter+object" dataset; EO represents "EgoDexter" dataset
  • DO_supp and EO_supp are modified from original ones.
  • DO_pred_2d.npy are 2D predictions from 2D part of DetNet.
  • some labels of DO and EO is obviously wrong (u could find some examples with original labels from dexter_object.py or egodexter.py), when projected into image plane, thus should be omitted. Here come my_{}3D.txt and my_annotation.txt_3D.txt.

Download my Results

realtime demo

python demo.py

DetNet Training and Evaluation

Run the training code

python train_detnet.py --data_root data/

Run the evaluation code

python train_detnet.py --data_root data/  --datasets_test testset_name_to_test   --evaluate  --evaluate_id checkpoints_id_to_load 

or use my results

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "rhd" --evaluate  --evaluate_id 106

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "stb" --evaluate  --evaluate_id 71

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "do" --evaluate  --evaluate_id 68

python train_detnet.py --checkpoint my_results/checkpoints  --datasets_test "eo" --evaluate  --evaluate_id 101

Shape Estimation

Run the shape optimization code. This can be very time consuming when the weight parameter is quite small.

python optimize_shape.py --weight 1e-5

or use my results

python optimize_shape.py --path my_results/out_testset/

Pose Estimation

Run the following code which uses a analytical inverse kinematics method.

python aik_pose.py

or use my results

python aik_pose.py --path my_results/out_testset/

Detnet training and evaluation curve

Run the following code to see my results

python plot.py --path my_results/out_loss_auc

(AUC means 3D PCK, and ACC_HM means 2D PCK) teaser

3D PCK AUC Diffenence

* means this project

Dataset DetNet(paper) DetNet(*) DetNet+IKNet(paper) DetNet+LM+AIK(*) DetNet+PSO+AIK(*)
RHD - 0.9339 0.856 0.9301 0.9310
STB 0.891 0.8744 0.898 0.8647 0.8671
DO 0.923 0.9378 0.948 0.9392 0.9342
EO 0.804 0.9270 0.811 0.9288 0.9277

Note

  • Adjusting training parameters carefully, longer training time, more complicated network or Biomechanical Constraint Losses could further boost accuracy.
  • As there is no official open source of original paper, above comparison is a little rough.

Citation

This is the unofficial pytorch reimplementation of the paper "Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data" (CVPR 2020).

If you find the project helpful, please star this project and cite them:

@inproceedings{zhou2020monocular,
  title={Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data},
  author={Zhou, Yuxiao and Habermann, Marc and Xu, Weipeng and Habibie, Ikhsanul and Theobalt, Christian and Xu, Feng},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={0--0},
  year={2020}
}

Acknowledgement

  • Code of Mano Pytorch Layer was adapted from manopth.

  • Code for evaluating the hand PCK and AUC in utils/eval/zimeval.py was adapted from hand3d.

  • Part code of data augmentation, dataset parsing and utils were adapted from bihand and 3D-Hand-Pose-Estimation.

  • Code of network model was adapted from Minimal-Hand.

  • @Mrsirovo for the starter code of the utils/LM.py, @maitetsu update it later.

  • @maitetsu for the starter code of the utils/AIK.py

Owner
Hao Meng
Master student at Beihang University , mainly interested in hand pose estimation. (LOOKING FOR RESEARCH INTERNSHIP NOW.)
Hao Meng
A Decentralized Omnidirectional Visual-Inertial-UWB State Estimation System for Aerial Swar.

Omni-swarm A Decentralized Omnidirectional Visual-Inertial-UWB State Estimation System for Aerial Swarm Introduction Omni-swarm is a decentralized omn

HKUST Aerial Robotics Group 99 Dec 23, 2022
Parametric Contrastive Learning (ICCV2021)

Parametric-Contrastive-Learning This repository contains the implementation code for ICCV2021 paper: Parametric Contrastive Learning (https://arxiv.or

DV Lab 156 Dec 21, 2022
Code for the paper "A Study of Face Obfuscation in ImageNet"

A Study of Face Obfuscation in ImageNet Code for the paper: A Study of Face Obfuscation in ImageNet Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng,

35 Oct 04, 2022
Fake videos detection by tracing the source using video hashing retrieval.

Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos 🎉️ 📜 Directory Introduction VTL Trace Samples and Acc of Hash

56 Dec 22, 2022
Multiple style transfer via variational autoencoder

ST-VAE Multiple style transfer via variational autoencoder By Zhi-Song Liu, Vicky Kalogeiton and Marie-Paule Cani This repo only provides simple testi

13 Oct 29, 2022
Minimal fastai code needed for working with pytorch

fastai_minima A mimal version of fastai with the barebones needed to work with Pytorch #all_slow Install pip install fastai_minima How to use This lib

Zachary Mueller 14 Oct 21, 2022
PG2Net: Personalized and Group PreferenceGuided Network for Next Place Prediction

PG2Net PG2Net:Personalized and Group Preference Guided Network for Next Place Prediction Datasets Experiment results on two Foursquare check-in datase

Urban Mobility 5 Dec 20, 2022
HyperCube: Implicit Field Representations of Voxelized 3D Models

HyperCube: Implicit Field Representations of Voxelized 3D Models Authors: Magdalena Proszewska, Marcin Mazur, Tomasz Trzcinski, Przemysław Spurek [Pap

Magdalena Proszewska 3 Mar 09, 2022
Implementation of U-Net and SegNet for building segmentation

Specialized project Created by Katrine Nguyen and Martin Wangen-Eriksen as a part of our specialized project at Norwegian University of Science and Te

Martin.w-e 3 Dec 07, 2022
A tool for making map images from OpenTTD save games

OpenTTD Surveyor A tool for making map images from OpenTTD save games. This is not part of the main OpenTTD codebase, nor is it ever intended to be pa

Aidan Randle-Conde 9 Feb 15, 2022
Code for "Infinitely Deep Bayesian Neural Networks with Stochastic Differential Equations"

Infinitely Deep Bayesian Neural Networks with SDEs This library contains JAX and Pytorch implementations of neural ODEs and Bayesian layers for stocha

Winnie Xu 95 Nov 26, 2021
MogFace: Towards a Deeper Appreciation on Face Detection

MogFace: Towards a Deeper Appreciation on Face Detection Introduction In this repo, we propose a promising face detector, termed as MogFace. Our MogFa

48 Dec 20, 2022
This repo implements a 3D segmentation task for an airport baggage dataset.

3D CT Scan Segmentation With Occupancy Network This repo implements a 3D superresolution segmentation task for an airport baggage dataset. Our final p

Christoph Reich 2 Mar 28, 2022
Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition

Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition Introduction Run attack: SGADV.py Objective function: foolbox/attacks/gradi

1 Jul 18, 2022
Exemplo de implementação do padrão circuit breaker em python

fast-circuit-breaker Circuit breakers existem para permitir que uma parte do seu sistema falhe sem destruir todo seu ecossistema de serviços. Michael

James G Silva 17 Nov 10, 2022
Cweqgen - The CW Equation Generator

The CW Equation Generator The cweqgen (pronouced like "Queck-Jen") package provi

2 Jan 15, 2022
A Pytorch Implementation of Domain adaptation of object detector using scissor-like networks

A Pytorch Implementation of Domain adaptation of object detector using scissor-like networks Please follow Faster R-CNN and DAF to complete the enviro

2 Oct 07, 2022
An Implicit Function Theorem (IFT) optimizer for bi-level optimizations

iftopt An Implicit Function Theorem (IFT) optimizer for bi-level optimizations. Requirements Python 3.7+ PyTorch 1.x Installation $ pip install git+ht

The Money Shredder Lab 2 Dec 02, 2021
AIR^2 for Interaction Prediction

This is the repository for AIR^2 for Interaction Prediction. Explanation of the solution: Video: link License AIR is released under the Apache 2.0 lic

21 Sep 27, 2022
A Repository of Community-Driven Natural Instructions

A Repository of Community-Driven Natural Instructions TLDR; this repository maintains a community effort to create a large collection of tasks and the

AI2 244 Jan 04, 2023