NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors

Yannan He, Garvita Tiwari, Tolga Birdal, Jan Eric Lenssen, Gerard Pons-Moll

TL;DR

NRDF addresses learning priors for the space of articulated poses by modeling the manifold as the zero level-set of a neural field on the product-quaternion space $(\mathbb{H}_1)^K$ and predicting geodesic distances via a distance field $f_\phi$. It introduces wrapped sampling to shape distance distributions and an adaptive-step Riemannian gradient descent (RDFGrad) that keeps iterates on the manifold. The work connects NRDF with Riemannian Flow Matching and demonstrates improvements in pose generation, inverse kinematics, and monocular pose estimation, with extensions to hands and animals. Overall, NRDF provides a principled, manifold-aware prior that improves realism and diversity of articulated poses while offering efficient optimization-based mapping onto the learned pose manifold.

Abstract

Faithfully modeling the space of articulations is a crucial task that allows recovery and generation of realistic poses, and remains a notorious challenge. To this end, we introduce Neural Riemannian Distance Fields (NRDFs), data-driven priors modeling the space of plausible articulations, represented as the zero-level-set of a neural field in a high-dimensional product-quaternion space. To train NRDFs only on positive examples, we introduce a new sampling algorithm, ensuring that the geodesic distances follow a desired distribution, yielding a principled distance field learning paradigm. We then devise a projection algorithm to map any random pose onto the level-set by an adaptive-step Riemannian optimizer, adhering to the product manifold of joint rotations at all times. NRDFs can compute the Riemannian gradient via backpropagation and by mathematical analogy, are related to Riemannian flow matching, a recent generative model. We conduct a comprehensive evaluation of NRDF against other pose priors in various downstream tasks, i.e., pose generation, image-based pose estimation, and solving inverse kinematics, highlighting NRDF's superior performance. Besides humans, NRDF's versatility extends to hand and animal poses, as it can effectively represent any articulation.

Paper Structure

This paper contains 58 sections, 7 theorems, 29 equations, 12 figures, 5 tables, 3 algorithms.

Key Result

Proposition 1

Given $\mathcal{S}$ (hence $f$), we employ an adaptive-step Riemannian optimizer to project any pose $\boldsymbol{\theta}_0$ onto the plausible poses:
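
The update rule of Proposition 1 is not reproduced in this summary. The sketch below shows one way an adaptive-step Riemannian descent on $(\mathbb{H}_1)^K$ could look, under several assumptions: the step length is taken proportional to the predicted distance, the learned field $f$ is replaced by a hypothetical `toy_distance_field` (distance to a single target pose), and the Euclidean gradient comes from finite differences rather than backpropagation. All function names are illustrative and not taken from the paper.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(q):
    n = math.sqrt(dot(q, q))
    return [x / n for x in q]

def toy_distance_field(pose, target):
    """Hypothetical stand-in for the learned f: distance to one fixed target pose."""
    total = 0.0
    for q, t in zip(pose, target):
        c = min(1.0, abs(dot(normalize(q), t)))  # q and -q are the same rotation
        total += math.acos(c) ** 2
    return math.sqrt(total)

def euclidean_grad(f, pose, h=1e-6):
    """Central finite differences (assumption: the paper uses backpropagation here)."""
    grad = []
    for i in range(len(pose)):
        g = []
        for j in range(4):
            plus = [list(p) for p in pose]
            minus = [list(p) for p in pose]
            plus[i][j] += h
            minus[i][j] -= h
            g.append((f(plus) - f(minus)) / (2 * h))
        grad.append(g)
    return grad

def to_tangent(q, g):
    """Remove the component of g along q, leaving a tangent vector at q."""
    c = dot(g, q)
    return [gi - c * qi for gi, qi in zip(g, q)]

def exp_map(q, v):
    """Follow the geodesic from unit quaternion q along tangent vector v."""
    n = math.sqrt(dot(v, v))
    if n < 1e-12:
        return list(q)
    return normalize([math.cos(n) * a + math.sin(n) * b / n for a, b in zip(q, v)])

def project(f, pose, alpha=0.5, tol=1e-4, max_iters=100):
    """Descend f while every iterate stays on the product of unit-quaternion spheres."""
    pose = [normalize(q) for q in pose]
    for _ in range(max_iters):
        dist = f(pose)
        if dist < tol:
            break
        rgrad = [to_tangent(q, g) for q, g in zip(pose, euclidean_grad(f, pose))]
        norm = math.sqrt(sum(dot(v, v) for v in rgrad))
        if norm < 1e-12:
            break
        step = alpha * dist / norm  # assumed adaptive step: shrinks with predicted distance
        pose = [exp_map(q, [-step * x for x in v]) for q, v in zip(pose, rgrad)]
    return pose
```

Calling `project(lambda p: toy_distance_field(p, target), start)` with two-joint poses drives the toy distance below `tol`. Because each update goes through `exp_map`, no separate re-projection onto the manifold is needed after a step.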

Figures (12)

  • Figure 2: Distance distributions and histograms from different sampling strategies. a) Pose-NDF sampling generates a $\mathcal{X}$-like distribution for large $k$, which does not fit the needs of distance field learning. Our sampling schedule makes it possible to control the distance distribution, e.g., to follow b) Half-Gaussian or c) Exponential distributions. The histograms show the distance to the closest example in $\mathcal{D}$, not the distance to the original example, resulting in a slight distribution shift to the left due to neighbor changes with increasing distance.
  • Figure 3: (a) Pose generation: VPoser generates realistic poses with limited diversity. Pose-NDF generates highly diverse poses but tends to yield unrealistic results (e.g., the third pose). NRDF balances diversity and realism. (b) IK solver from partial/sparse markers: Given a partial observation (yellow markers), we perform 3D pose completion. We observe that optimization based on VPoser [SMPL-X:2019] generates realistic, yet fixed and less diverse poses. Pose-NDF [tiwari22posendf] generates more diverse, but sometimes unrealistic poses, especially for very sparse observations. NRDF generates diverse and realistic poses in all setups.
  • Figure 4: 3D pose and shape estimation from images: (Top) Results from SMPLer-X [cai2023smpler]. (Bottom) We refine the network prediction using an NRDF-based optimization pipeline. As highlighted, the refined poses align better with the observation.
  • Figure 5: User study for pose similarity assessment: In our user interface, participants rank the similarity between a query pose (green) and its nearest neighbors (blue) from the AMASS dataset. These neighbors are obtained using different distance metrics.
  • Figure 6: Pose generation: We compare pose generation results of our method with VPoser [SMPL-X:2019], GMM, FM-Dis, Pose-NDF [tiwari22posendf], GAN-S [Davydov_2022_CVPR], GFPose-A [ci2022gfpose] and GFPose-Q. In comparison to VPoser, our method produces more diverse results. Furthermore, when compared to GMM, FM-Dis, and Pose-NDF, our method generates more realistic poses.
  • ...and 7 more figures
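
The Figure 2 caption describes a sampling schedule that controls the distribution of distances. The paper's algorithm is not given in this summary, so the following is only a sketch of the general idea under stated assumptions: draw a target distance from a half-Gaussian, pick a uniformly random unit direction in the tangent space of the product manifold, and move along the geodesic by that distance. The names `sample_noisy_pose` and `random_tangent_direction` and the parameter `sigma` are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(q):
    n = math.sqrt(dot(q, q))
    return [x / n for x in q]

def random_tangent_direction(pose, rng):
    """Unit-length direction in the tangent space of the product manifold at pose."""
    vs = []
    for q in pose:
        g = [rng.gauss(0.0, 1.0) for _ in range(4)]
        c = dot(g, q)
        vs.append([gi - c * qi for gi, qi in zip(g, q)])  # drop the radial part
    norm = math.sqrt(sum(dot(v, v) for v in vs))
    return [[x / norm for x in v] for v in vs]

def exp_map(q, v):
    """Follow the geodesic from unit quaternion q along tangent vector v."""
    n = math.sqrt(dot(v, v))
    if n < 1e-12:
        return list(q)
    return normalize([math.cos(n) * a + math.sin(n) * b / n for a, b in zip(q, v)])

def sample_noisy_pose(pose, sigma, rng):
    """Return a perturbed pose and the geodesic length r travelled from the clean pose."""
    r = abs(rng.gauss(0.0, sigma))  # half-Gaussian target distance (assumed schedule)
    direction = random_tangent_direction(pose, rng)
    noisy = [exp_map(q, [r * x for x in v]) for q, v in zip(pose, direction)]
    return noisy, r
```

Here `r` is the distance to the original example only. As the caption notes, the distance to the closest example in $\mathcal{D}$ can be smaller, so a training label would still need a nearest-neighbor search over the dataset. Swapping the first line of `sample_noisy_pose` for `rng.expovariate(...)` would give the exponential variant.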

Theorems & Definitions (26)

  • Definition 1: Riemannian Gradient
  • Definition 2: Riemannian Optimization
  • Definition 3: Exponential map
  • Definition 4: Logarithmic map
  • Definition 5: Quaternion geodesic distance ($d_q$)
  • Definition 6: Geometry of 3D articulated poses
  • Definition 7: Riemannian Distance Fields (RDFs)
  • Proposition 1: RDFGrad
  • Proposition 2: Quaternion-egrad2rgrad
  • Proof: Sketch of the proof
  • ...and 16 more
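
The definitions listed above (exponential map, logarithmic map, quaternion geodesic distance) are given only by title in this summary. As a rough illustration, the sketch below uses the standard unit-sphere formulas for $\mathbb{H}_1$. The scale factor of $d_q$ and the way per-joint distances are combined in `pose_distance` are assumptions, not the paper's stated definitions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(q):
    n = math.sqrt(dot(q, q))
    return tuple(x / n for x in q)

def geodesic_distance(q1, q2):
    """Angle between unit quaternions, treating q and -q as the same rotation."""
    c = min(1.0, abs(dot(q1, q2)))
    return math.acos(c)

def exp_map(q, v):
    """Move from unit quaternion q along tangent vector v (v orthogonal to q)."""
    n = math.sqrt(dot(v, v))
    if n < 1e-12:
        return tuple(q)
    return normalize(tuple(math.cos(n) * a + math.sin(n) * b / n for a, b in zip(q, v)))

def log_map(q, p):
    """Tangent vector at q pointing toward p; its length is the sphere angle.

    Unlike geodesic_distance, this does not identify p with -p.
    """
    c = max(-1.0, min(1.0, dot(q, p)))
    theta = math.acos(c)
    u = tuple(b - c * a for a, b in zip(q, p))  # part of p orthogonal to q
    n = math.sqrt(dot(u, u))
    if n < 1e-12:
        return (0.0, 0.0, 0.0, 0.0)
    return tuple(theta * x / n for x in u)

def pose_distance(pose_a, pose_b):
    """Assumed product-manifold distance: root of summed squared per-joint angles."""
    return math.sqrt(sum(geodesic_distance(a, b) ** 2 for a, b in zip(pose_a, pose_b)))
```

For quaternions that are neither identical nor antipodal, `exp_map(q, log_map(q, p))` recovers `p` up to floating-point error, which is a quick way to check that the two maps are consistent.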