3D Hand Reconstruction via Aggregating Intra and Inter Graphs Guided by Prior Knowledge for Hand-Object Interaction Scenario

Feng Shuang, Wenbo He, Shaodong Li

TL;DR

This work presents a novel module that regresses MANO pose parameters directly from 2D joints, which avoids the highly nonlinear mapping from abstract image features and no longer depends on accurate 3D joints. The method outperforms all purely model-based and model-free approaches.

Abstract

Recently, 3D hand reconstruction has gained increasing attention in human-computer cooperation, especially in hand-object interaction scenarios. However, it remains a major challenge because of the severe hand occlusion caused by interaction, which raises three issues: balancing accuracy and physical plausibility, the highly nonlinear mapping to model parameters, and the enhancement of occluded features. To overcome these issues, we propose a 3D hand reconstruction network that combines the benefits of model-based and model-free approaches to balance accuracy and physical plausibility in hand-object interaction scenarios. First, we present a novel module that regresses MANO pose parameters directly from 2D joints, which avoids the highly nonlinear mapping from abstract image features and no longer depends on accurate 3D joints. Moreover, we propose a vertex-joint mutual graph-attention model guided by MANO to jointly refine hand meshes and joints; it models vertex-vertex and joint-joint dependencies to aggregate intra-graph node features, and captures vertex-joint correlations to aggregate inter-graph node features. Experimental results demonstrate that our method achieves competitive performance on the recent benchmark datasets HO3DV2 and Dex-YCB, and outperforms all purely model-based and model-free approaches.
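
The intra-graph and inter-graph aggregation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the chain adjacency, the feature width `dim`, the random weights, and all function names (`gcn_layer`, `cross_attention`, `mutual_block`) are assumptions made for illustration. It only shows the general pattern of a GCN step inside each graph followed by attention between the vertex graph and the joint graph, using 778 vertices and 21 joints, the sizes commonly used with MANO.

```python
import numpy as np


def chain_adjacency(n):
    """Toy adjacency (a simple chain). A real model would use the MANO mesh
    edges and the hand skeleton instead; this is only a placeholder."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by standard GCNs."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(x, adj_norm, weight):
    """Intra-graph aggregation: each node mixes features from its neighbors."""
    return np.maximum(adj_norm @ x @ weight, 0.0)  # ReLU


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Inter-graph aggregation: query nodes attend to the other graph's nodes."""
    q, k, v = queries @ w_q, context @ w_k, context @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return queries + attn @ v, attn  # residual connection


def mutual_block(vert, joint, adj_v, adj_j, params):
    """One hypothetical block: GCN inside each graph, then mutual attention."""
    vert = gcn_layer(vert, adj_v, params["gcn_v"])
    joint = gcn_layer(joint, adj_j, params["gcn_j"])
    new_vert, attn_vj = cross_attention(vert, joint, *params["v_from_j"])
    new_joint, attn_jv = cross_attention(joint, vert, *params["j_from_v"])
    return new_vert, new_joint, attn_vj, attn_jv


rng = np.random.default_rng(0)
num_vertices, num_joints, dim = 778, 21, 16  # dim is an arbitrary choice


def rand_w():
    return rng.normal(scale=0.1, size=(dim, dim))


params = {
    "gcn_v": rand_w(),
    "gcn_j": rand_w(),
    "v_from_j": (rand_w(), rand_w(), rand_w()),
    "j_from_v": (rand_w(), rand_w(), rand_w()),
}
adj_v = normalize_adjacency(chain_adjacency(num_vertices))
adj_j = normalize_adjacency(chain_adjacency(num_joints))
vert_feat = rng.normal(size=(num_vertices, dim))
joint_feat = rng.normal(size=(num_joints, dim))

new_vert, new_joint, attn_vj, attn_jv = mutual_block(
    vert_feat, joint_feat, adj_v, adj_j, params
)
print(new_vert.shape, new_joint.shape, attn_vj.shape, attn_jv.shape)
```

Each row of `attn_vj` is a distribution over joints for one vertex, and each row of `attn_jv` is a distribution over vertices for one joint, so information flows in both directions between the two graphs.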

Paper Structure (27 sections, 12 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Overview of our method, which consists of three stages. The backbone extracts the hand feature $I$ and estimates the 2D joint locations ($J_{2D}$) from a single input RGB image. In the initial stage, two separate branches regress the MANO pose parameters and shape parameters, respectively. The rough 3D hand mesh vertices ($V$) and joints ($J_{3D}$) are then obtained by passing the MANO parameters through the MANO layer. In the refinement stage, we first construct the mesh vertex and joint graphs according to the MANO model. We then use a stack of basic blocks with a similar structure to generate the refined hand meshes and joints, where each basic block consists of a stack of GCN layers followed by the proposed mutual attention.
  • Figure 2: The SemGCN-based MANO pose parameter regression module.
  • Figure 3: The MANO pose parameter regression module based on SemGCN.
  • Figure 4: Qualitative results of our method on the HO3D and Dex-YCB datasets.
  • Figure 5: PCK of 3D joints and mesh vertices after Procrustes alignment for our method on the HO3DV2 dataset.
  • ...and 1 more figures
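
The Figure 5 caption refers to PCK after Procrustes alignment. As a rough illustration of how such a metric is commonly computed (not the paper's evaluation code), the sketch below aligns a prediction to the ground truth with a similarity transform (scale, rotation, translation) and then reports the fraction of points whose error falls under each threshold. The function names, the synthetic data, and the threshold values are assumptions; thresholds are in the same unit as the coordinates.

```python
import numpy as np


def procrustes_align(pred, gt):
    """Align pred (N, 3) to gt (N, 3) with the best-fit similarity transform."""
    mu_pred, mu_gt = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_pred, gt - mu_gt
    # SVD of the 3x3 cross-covariance gives the optimal rotation (Kabsch).
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    d = np.array([1.0, 1.0, sign])
    rot = vt.T @ np.diag(d) @ u.T
    scale = (s * d).sum() / (p ** 2).sum()
    return scale * p @ rot.T + mu_gt


def pck(pred, gt, thresholds):
    """Fraction of points with Euclidean error at or below each threshold."""
    err = np.linalg.norm(pred - gt, axis=1)
    return np.array([(err <= t).mean() for t in thresholds])


# Synthetic check: a scaled, rotated, shifted copy of gt should align exactly.
rng = np.random.default_rng(0)
gt = rng.normal(size=(21, 3))
angle = 0.7
rot_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
pred = 1.7 * gt @ rot_true.T + np.array([0.1, -0.2, 0.3])

aligned = procrustes_align(pred, gt)
thresholds = np.linspace(0.01, 0.05, 5)  # hypothetical threshold range
print("PCK before alignment:", pck(pred, gt, thresholds))
print("PCK after alignment: ", pck(aligned, gt, thresholds))
```

Because the alignment removes global scale, rotation, and translation, the aligned metric measures only the error in articulated pose and shape.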