ProGraph: Temporally-alignable Probability Guided Graph Topological Modeling for 3D Human Reconstruction

Hongsheng Wang, Zehui Feng, Tong Xiao, Genfan Yang, Shengyu Zhang, Fei Wu, Feng Lin

TL;DR

This work proposes Temporally-alignable Probability Guided Graph Topological Modeling for 3D Human Reconstruction (ProGraph), which exploits the explicit topology-aware probability distribution across the entire motion sequence to recover missing body parts and to regenerate parts blurred by motion.

Abstract

Current methods for reconstructing 3D human motion from monocular videos rely on features within the current reconstruction window, which leads to distortions and deformations of the human structure when video frames contain local occlusions or blur. To estimate realistic 3D human mesh sequences from incomplete features, we propose Temporally-alignable Probability Guided Graph Topological Modeling for 3D Human Reconstruction (ProGraph). To recover missing parts, we exploit the explicit topology-aware probability distribution across the entire motion sequence. To restore the complete human body, Graph Topological Modeling (GTM) learns the underlying topological structure, focusing on the relationships inherent in the individual parts. Next, to regenerate parts blurred by motion, Temporally-alignable Probability Distribution (TPDist) uses the GTM to predict features based on the distribution. This interactive mechanism promotes motion consistency, allowing human parts to be restored. Furthermore, Hierarchical Human Loss (HHLoss) constrains the probability distribution errors of inter-frame features during topological structure variation. Our method outperforms other SOTA methods in handling occlusions and blurriness on 3DPW.

Paper Structure

This paper contains 20 sections, 14 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Framework overview. ProGraph consists of three main components: Temporally-alignable Probability Distribution (TPDist), Graph Topological Modeling (GTM), and Hierarchical Human Loss (HHLoss). Given latent features extracted by the HRNet-W64 backbone, TPDist learns temporally consistent motion features. Guided by GTM, it leverages the probability distribution of the human body topology within the latent space. This enables TPDist to capture information about regions missing due to occlusions.
  • Figure 2: The mapping process between the prior association information of the vertices and the explicit graph structure.
  • Figure 3: The ProGraph network architecture integrates a 3D ResNet with CLIP for cross-attention. It also employs a dual-pathway approach, combining Graph Topological Reconstruction and Temporal 3D Transformers. This approach facilitates the alignment of the topological and geometric structures of human body meshes across a video sequence.
  • Figure 4: Analysis and comparison of inter-frame prediction results. The original image is shown on the far left, followed to the right by the outputs of our model, FastMETRO, GLoT, and PyMAF.
  • Figure 5: Analysis and comparison of intra-frame prediction results, showing the ability to handle complex scenes such as people walking. The original image is shown on the far left, followed to the right by the outputs of our model, FastMETRO, GLoT, and PyMAF.
  • ...and 3 more figures