Robust 3D Hand Pose Estimation in Single Depth Images:
from Single-View CNN to Multi-View CNNs

[Figure: framework overview]

Authors
    Liuhao Ge
    Hui Liang
    Junsong Yuan
    Daniel Thalmann

Abstract
Articulated hand pose estimation plays an important role in human-computer interaction. Despite recent progress, the accuracy of existing methods is still not satisfactory, partially due to the difficulty of the embedded high-dimensional and non-linear regression problem. Different from existing discriminative methods that regress for the hand pose from a single depth image, we propose to first project the query depth image onto three orthogonal planes and utilize these multi-view projections to regress for 2D heat-maps that estimate the joint positions on each plane. These multi-view heat-maps are then fused to produce the final 3D hand pose estimation with learned pose priors. Experiments show that the proposed method largely outperforms state-of-the-art methods on a challenging dataset. Moreover, a cross-dataset experiment also demonstrates the good generalization ability of the proposed method.
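
The sketch below is a minimal NumPy illustration of the pipeline described in the abstract: projecting the 3D points of a depth image onto three orthogonal planes (x-y, y-z, z-x), and recovering a joint's 3D position from per-view 2D heat-maps. It is not the authors' implementation: the CNN that regresses the heat-maps and the learned pose priors are omitted, the heat-maps are synthetic Gaussians, and the resolution and function names (project_points, gaussian_heatmap, fuse_views) are illustrative assumptions.

    # Minimal sketch of multi-view projection and heat-map fusion (assumptions noted above).
    import numpy as np

    RES = 64  # per-view projection resolution (assumed)

    def project_points(points, axes):
        """Rasterize 3D points (normalized to [0, 1]^3) onto the plane spanned by
        the two given axes, keeping the nearest value along the dropped axis."""
        dropped = ({0, 1, 2} - set(axes)).pop()
        img = np.full((RES, RES), np.inf)
        uv = np.clip((points[:, list(axes)] * (RES - 1)).astype(int), 0, RES - 1)
        for (u, v), d in zip(uv, points[:, dropped]):
            img[v, u] = min(img[v, u], d)
        img[np.isinf(img)] = 1.0  # empty pixels get background depth
        return img

    def gaussian_heatmap(center_uv, sigma=2.0):
        """Synthetic 2D heat-map peaked at center_uv, standing in for a CNN output."""
        ys, xs = np.mgrid[0:RES, 0:RES]
        return np.exp(-((xs - center_uv[0]) ** 2 + (ys - center_uv[1]) ** 2) / (2 * sigma ** 2))

    def fuse_views(heatmaps):
        """Fuse three per-view heat-maps (x-y, y-z, z-x) into one 3D estimate by
        averaging the two views that observe each coordinate (simple averaging is
        used here in place of the paper's learned-prior fusion)."""
        hxy, hyz, hzx = heatmaps
        ux, vy = np.unravel_index(hxy.argmax(), hxy.shape)[::-1]  # x-y view -> (x, y)
        uy, vz = np.unravel_index(hyz.argmax(), hyz.shape)[::-1]  # y-z view -> (y, z)
        uz, vx = np.unravel_index(hzx.argmax(), hzx.shape)[::-1]  # z-x view -> (z, x)
        x, y, z = (ux + vx) / 2.0, (vy + uy) / 2.0, (vz + uz) / 2.0
        return np.array([x, y, z]) / (RES - 1)

    if __name__ == "__main__":
        # Toy point cloud standing in for the points of a segmented hand depth image.
        rng = np.random.default_rng(0)
        points = rng.uniform(0.2, 0.8, size=(500, 3))
        views = [(0, 1), (1, 2), (2, 0)]  # x-y, y-z, z-x projection planes
        projections = [project_points(points, axes) for axes in views]
        print("projection shapes:", [p.shape for p in projections])

        # Pretend the CNN produced heat-maps peaked at one joint's true projections.
        joint = np.array([0.3, 0.6, 0.4])
        heatmaps = [gaussian_heatmap(joint[list(a)] * (RES - 1)) for a in views]
        print("recovered joint:", fuse_views(heatmaps))

Running the script prints a 3D joint estimate close to the ground-truth (0.3, 0.6, 0.4), showing how three 2D views jointly constrain a 3D position.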


Paper
Liuhao Ge, Hui Liang, Junsong Yuan and Daniel Thalmann, "Robust 3D Hand Pose Estimation in Single Depth Images: from Single-View CNN to Multi-View CNNs", Proceedings of CVPR 2016 [pdf] [bibtex] [video] [code]


Video