PoseGraphNet++: Enriching 3D Human Pose with Orientation Estimation
Soubarna Banik, Edvard Avagyan, Sayantan Auddy, Alejandro Mendoza Gracia, Alois Knoll
TL;DR
PoseGraphNet++ addresses a limitation of skeleton-based 3D human pose estimation (HPE), which predicts only joint positions, by predicting both joint positions and bone orientations from 2D poses. It introduces a node-edge graph convolutional network with adaptive adjacency and neighbor-group kernels, and uses a 6D rotation representation to yield stable bone orientation estimates. The approach achieves near state-of-the-art performance on Human3.6M for both position and orientation, and generalizes well to MPI-3DHP and MPI-3DPW; ablations confirm the benefit of modeling joint-bone relationships. This work enables holistic 3D pose understanding without relying on parametric body models, with potential impact on rehabilitation, action recognition, and real-time animation.
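As an illustration of the 6D rotation representation mentioned above, the following is a minimal NumPy sketch of the commonly used Gram-Schmidt construction that maps a 6D vector to a rotation matrix. The function names `rotation_from_6d` and `rotation_to_6d` and the column-wise layout are assumptions made for this example; the summary does not specify the convention PoseGraphNet++ uses.

```python
import numpy as np


def rotation_from_6d(x):
    """Map a 6D vector to a 3x3 rotation matrix via Gram-Schmidt.

    Assumed layout (hypothetical): the first three values are the
    unnormalized first column, the last three the unnormalized second column.
    """
    x = np.asarray(x, dtype=float)
    a1, a2 = x[:3], x[3:]
    b1 = a1 / np.linalg.norm(a1)            # normalize the first axis
    a2_orth = a2 - np.dot(b1, a2) * b1      # remove the component along b1
    b2 = a2_orth / np.linalg.norm(a2_orth)  # normalize the second axis
    b3 = np.cross(b1, b2)                   # third axis completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)   # axes become the matrix columns


def rotation_to_6d(R):
    """Keep the first two columns of a rotation matrix as a 6D vector."""
    R = np.asarray(R, dtype=float)
    return np.concatenate([R[:, 0], R[:, 1]])


# Any 6D vector with two linearly independent halves yields a valid rotation.
R = rotation_from_6d([1.0, 2.0, 3.0, 0.0, 1.0, 0.5])
```

Because every such 6D vector maps to a valid rotation, a network can regress the six values directly without an explicit orthogonality constraint on its output.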
Abstract
Existing skeleton-based 3D human pose estimation methods only predict joint positions. Although the yaw and pitch of bone rotations can be derived from joint positions, the roll around the bone axis remains unresolved. We present PoseGraphNet++ (PGN++), a novel 2D-to-3D lifting Graph Convolution Network that predicts the complete human pose in 3D, including joint positions and bone orientations. We employ both node and edge convolutions to utilize the joint and bone features. Our model is evaluated on multiple datasets using both position and rotation metrics. PGN++ performs on par with the state-of-the-art (SoA) on the Human3.6M benchmark. In generalization experiments, it achieves the best results in position and matches the SoA in orientation, showcasing a more balanced performance than the current SoA. PGN++ exploits the mutual relationship of joints and bones, resulting in significantly improved position predictions, as shown by our ablation results.
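The roll ambiguity described above can be checked numerically. The sketch below is an illustrative example, not part of the PGN++ method: it rotates a single bone about its own axis and shows that the child joint position is unchanged while a direction perpendicular to the bone is not. The joint coordinates, the 40 degree roll angle, and the helper name `rotation_about_axis` are assumptions chosen for the example.

```python
import numpy as np


def rotation_about_axis(axis, angle):
    """Rotation matrix for `angle` radians about `axis` (Rodrigues' formula)."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    # Skew-symmetric cross-product matrix of the unit axis.
    K = np.array([
        [0.0, -axis[2], axis[1]],
        [axis[2], 0.0, -axis[0]],
        [-axis[1], axis[0], 0.0],
    ])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


# Hypothetical bone from a parent joint to a child joint (metres).
parent = np.array([0.0, 0.0, 0.0])
child = np.array([0.0, 0.3, 0.0])
bone = child - parent

# Roll the bone by 40 degrees about its own axis.
roll = rotation_about_axis(bone, np.deg2rad(40.0))

# The child joint does not move, so joint positions cannot reveal the roll.
rolled_child = parent + roll @ bone

# A direction perpendicular to the bone (e.g. a palm normal) does change.
side = np.array([1.0, 0.0, 0.0])
rolled_side = roll @ side
```

Two poses that differ only by such a roll have identical joint positions, which is why an orientation output is needed to distinguish them.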
