Normalizing Flows on the Product Space of SO(3) Manifolds for Probabilistic Human Pose Modeling

Olaf Dünkel, Tim Salzmann, Florian Pfaff

TL;DR

The paper addresses probabilistic modeling of human pose by learning a normalized density over the product space of SO(3) joint rotations. It introduces HuProSO3, a normalizing flow built from Möbius coupling layers and a quaternion affine transformation, with nonlinear autoregressive conditioning across joints and random conditioning orders to capture complex dependencies. The model can be conditioned on context and supports exact likelihood evaluation. The authors demonstrate HuProSO3 on an unconditional pose prior, inverse kinematics with and without occlusion, and 2D-to-3D uplifting, showing improved density modeling and performance that is competitive with or superior to state-of-the-art priors and 6D NF baselines. By modeling correlated joint rotations in a principled probabilistic framework, this work enables uncertainty-aware, geometry-consistent human pose estimation for vision, robotics, and human-robot interaction tasks.

Abstract

Normalizing flows have proven their efficacy for density estimation in Euclidean space, but their application to rotational representations, crucial in various domains such as robotics or human pose modeling, remains underexplored. Probabilistic models of the human pose can benefit from approaches that rigorously consider the rotational nature of human joints. For this purpose, we introduce HuProSO3, a normalizing flow model that operates on a high-dimensional product space of SO(3) manifolds, modeling the joint distribution for human joints with three degrees of freedom. HuProSO3's advantage over state-of-the-art approaches is demonstrated through its superior modeling accuracy in three different applications and its capability to evaluate the exact likelihood. This work not only addresses the technical challenge of learning densities on SO(3) manifolds, but it also has broader implications for domains where the probabilistic regression of correlated 3D rotations is of importance.
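
The exact-likelihood property mentioned above follows from the standard change-of-variables rule for normalizing flows. As a hedged sketch (the symbols $f_k$, $p_0$, and $\mathbf{z}_k$ are notation assumed here for illustration, not taken from the paper), consider a flow $f = f_K \circ \dots \circ f_1$ that maps base samples to poses $\mathbf{R}$, optionally conditioned on a context vector $\mathbf{c}$:

$$
\log p(\mathbf{R} \mid \mathbf{c}) = \log p_0(\mathbf{z}_0) - \sum_{k=1}^{K} \log \left| \det J_{f_k}(\mathbf{z}_{k-1}; \mathbf{c}) \right|,
\qquad \mathbf{z}_K = \mathbf{R}, \quad \mathbf{z}_{k-1} = f_k^{-1}(\mathbf{z}_k; \mathbf{c}).
$$

If the base density $p_0$ is uniform on the product space of SO(3) manifolds, $\log p_0$ is a constant, so the log-likelihood is determined by the accumulated log-determinants of the layer Jacobians, taken with respect to the volume measure on the manifold.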

Paper Structure

This paper contains 32 sections, 12 equations, 15 figures, 11 tables.

Figures (15)

  • Figure 1: Application of the normalizing flow for occluded joints. Left: A normalizing flow defined on SO(3) manifolds enables the learning of expressive human pose distributions, incorporating a context vector $\mathbf{c}$ for conditioning. Right: Renderings of probable poses given condition $\mathbf{c}$, which is the observation of the human with the left arm occluded. The right arm's pose is estimated with high certainty, while the left arm demonstrates varied but realistic poses due to the occlusion.
  • Figure 2: Overview of the components of the normalizing flow: The flow is defined on a product space of $N$ manifolds $\mathcal{M}_i=SO(3)$. It includes $K$ flow layers that transform samples $R_i$ from a uniform distribution on SO(3) to samples of the learned distribution $\hat{R}_j$. The flow is composed of a Möbius coupling layer (MCL) and a quaternion affine transformation (AF). Vertical arrows indicate the flow of SO(3) samples through layers, while dotted arrows represent autoregressive conditioning using an MLP (black boxes).
  • Figure 3: Qualitative results for inverse kinematics with partial occlusion (left arm and right leg). For the normalizing-flow-based models (HF-AC and HuProSO3), we visualize 10 samples, where less likely poses are more transparent.
  • Figure 4: Minimum MPJPE and MPJPE of the mean pose for HuProSO3, ancestor-conditioned SO(3), and HF-AC for randomly occluded joints ($p_m=0.3$) with varying numbers of samples.
  • Figure 5: Overview of an example application of HuProSO3: Samples from $p(\mathbf{R}|\mathbf{c})$ are generated by propagating samples from a uniform distribution on the product space of SO(3) through the normalizing flow. Partially given 2D key points serve as conditioning, where occluded joints are depicted in red. The prior $p(\mathbf{R})$ captures the statistical dependencies between different joints. The color of the blobs depicts the standard deviation of the joint rotations computed for 20 samples. The blob size relates to the standard deviation of the joint position (JP) after applying forward kinematics.
  • ...and 10 more figures
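
The captions of Figures 2 and 5 describe drawing base samples from a uniform distribution on the product space of SO(3) and pushing them through flow layers. The following is a minimal sketch of those two ingredients only, not the authors' implementation. It assumes rotations are represented as unit quaternions and that the quaternion affine transformation takes the form $q \mapsto Wq / \lVert Wq \rVert$ with an invertible $4 \times 4$ matrix $W$ per joint; this section does not confirm that form. All function names, the matrices `W`, and the joint count are hypothetical, and the Möbius coupling layer and the autoregressive MLP conditioning are omitted.

```python
import numpy as np

def sample_uniform_quaternions(n_joints, rng):
    """Draw one unit quaternion per joint.

    Normalizing 4D standard Gaussian vectors gives points that are uniform
    on the 3-sphere, which corresponds to the uniform distribution on SO(3).
    """
    q = rng.standard_normal((n_joints, 4))
    return q / np.linalg.norm(q, axis=-1, keepdims=True)

def quaternion_affine(q, W):
    """Assumed form of a quaternion affine layer: q -> W q / ||W q||.

    q has shape (n_joints, 4); W has shape (n_joints, 4, 4) and must be
    invertible per joint. The map sends -q to -q', so it is consistent with
    q and -q encoding the same rotation.
    """
    wq = np.einsum("nij,nj->ni", W, q)
    return wq / np.linalg.norm(wq, axis=-1, keepdims=True)

def quat_to_rotmat(q):
    """Convert unit quaternions (w, x, y, z) to 3x3 rotation matrices."""
    w, x, y, z = q[..., 0], q[..., 1], q[..., 2], q[..., 3]
    rows = [
        1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y),
        2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x),
        2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y),
    ]
    return np.stack(rows, axis=-1).reshape(q.shape[:-1] + (3, 3))

rng = np.random.default_rng(0)
n_joints = 5  # arbitrary illustrative value, not the paper's joint count

base = sample_uniform_quaternions(n_joints, rng)
# Hypothetical layer parameters: small perturbations of the identity.
W = np.eye(4) + 0.1 * rng.standard_normal((n_joints, 4, 4))
transformed = quaternion_affine(base, W)
rotations = quat_to_rotmat(transformed)  # shape (n_joints, 3, 3)
```

In a trained model the matrices `W` would be learned and, for the conditional case, predicted from the context vector $\mathbf{c}$ and from previously transformed joints; here they are random placeholders.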