RDA-INR: Riemannian Diffeomorphic Autoencoding via Implicit Neural Representations

Sven Dummer, Nicola Strisciuglio, Christoph Brune

TL;DR

This work introduces RDA-INR, a framework that unites resolution-independent implicit neural representations (INRs) with Large Deformation Diffeomorphic Metric Mapping (LDDMM) principal geodesic analysis (PGA) to enable resolution-invariant, physically consistent shape modeling on point clouds and meshes. By embedding LDDMM’s Riemannian geometry into a nonlinear INR-based latent model and solving the resulting flows with neural ODEs, the approach enables meaningful Fréchet means and geodesics in the shape space, robust atlas building, and improved data-variability modeling. Empirical results on synthetic rectangles and liver data show that the Riemannian regularization improves mean–variance analysis, template learning, generalization, and noise robustness compared to non-Riemannian baselines and to related neural-diffeomorphic models. The framework demonstrates how data representations, Riemannian geometry, and deep learning can be integrated to advance registration, atlas construction, and statistical shape analysis with diffeomorphic guarantees. These results pave the way for broader applications in shape and image analysis, including potential extensions to generative modeling and image-domain data.

Abstract

Diffeomorphic registration frameworks such as Large Deformation Diffeomorphic Metric Mapping (LDDMM) are used in computer graphics and the medical domain for atlas building, statistical latent modeling, and pairwise and groupwise registration. In recent years, researchers have developed neural network-based approaches to diffeomorphic registration to improve the accuracy and computational efficiency of traditional methods. In this work, we focus on a limitation of neural network-based atlas building and statistical latent modeling methods, namely that they are either (i) resolution dependent or (ii) disregard any data- or problem-specific geometry needed for proper mean-variance analysis. In particular, we overcome this limitation by designing a novel encoder based on resolution-independent implicit neural representations. The encoder achieves resolution invariance for LDDMM-based statistical latent modeling. Additionally, the encoder adds LDDMM Riemannian geometry to resolution-independent deep learning models for statistical latent modeling. We investigate how the Riemannian geometry improves latent modeling and is required for a proper mean-variance analysis. To highlight the benefit of resolution independence for LDDMM-based data variability modeling, we show that our approach outperforms current neural network-based LDDMM latent code models. Our work paves the way for more research into how Riemannian geometry, shape and image analysis, and deep learning can be combined.

Paper Structure

This paper contains 37 sections, 2 theorems, 40 equations, 15 figures, and 7 tables.

Key Result

Theorem 3.2

Assume that $V$ is an admissible Banach space. Then $(G, d_G)$ is a complete metric space. Furthermore, consider the set $\mathcal{O}:=\{\phi \cdot O_{\text{temp}} \mid \phi \in G\}$ for some template object $O_{\text{temp}}$. Then we can define a pseudo-distance $d_\mathcal{O}(O_1, O_2)$ on $\mathcal{O}$ [...]. In addition, $d_\mathcal{O}$ is a distance if the action $\phi \rightarrow \phi \ldots$

Figures (15)

  • Figure 1: Overview of our RDA-INR framework. We combine LDDMM PGA and resolution-independent implicit neural representation (INR) methods used for joint encoding and registration. This combination yields a method for statistical latent modeling and atlas building that is (i) physically consistent via LDDMM Riemannian geometry and (ii) resolution independent.
  • Figure 1: RDA-INR framework for resolution-independent and physically consistent statistical latent modeling and atlas building. Our method maps a latent vector $z_i$ to an INR representing a time-dependent vector field $v_\varphi(\cdot, t, z_i)$. This vector field defines a flow of diffeomorphisms $\phi_t^{z_i}$ via an ODE. The diffeomorphisms $\phi_1^{z_i}$ create (reconstructed) objects $O_i$ on a subset of the Riemannian LDDMM manifold $\mathcal{M}$ by deforming a learned template $\mathcal{T}_\theta$. We obtain this deformation by parameterizing $\mathcal{T}_\theta$ with an INR $f_\theta$ and deforming $f_\theta$ via a group action: $I_i(\cdot, t) := (\phi_t^{z_i})^{-1} \cdot f_\theta$. As the template and the vector fields are parameterized by INRs, $I_i$ is a 4D INR deforming the template at $t=0$ to a reconstruction at $t=1$. The goal is to learn the template $\mathcal{T}_\theta$ as the Fréchet mean of the data and to learn the paths $I_i(\cdot, t)$ as geodesics on $\mathcal{M}$ between the template $\mathcal{T}_\theta$ and the data.
  • Figure 1: Template (atlas) building. The learned templates of two different models trained on the rectangles (Rect.) dataset and the liver dataset. We use the model learned using the Riemannian LDDMM regularization (a and c) and the model learned using the non-Riemannian pointwise loss (b and d). We use $\eta = 0.05$ and $\eta=50$ in Equation \ref{eq:Killing_energy} for the rectangles and liver dataset, respectively.
  • Figure 1: Learned templates. The template of the model learned with occupancy functions (left) and the model learned with the point cloud data (right). Both templates resemble the rectangles data.
  • Figure 1: Neural network architectures. Architectures for the template neural network $f_\theta$ and the stationary velocity vector fields $v_{\varphi_k}$ in Equation \ref{eq:v_varphi_quasi_time_varying_velocity_field}. The small red boxes indicate the latent code input $z$ and the spatial input $x$. Furthermore, the small blue box is an output, while the rectangles are linear layers with $d_{vel}$ or $d_\mu$ output dimensions. Finally, the $\oplus$ and $\otimes$ stand for elementwise addition and scalar/elementwise multiplication, respectively.
  • ...and 10 more figures
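
The framework caption above describes a latent vector $z_i$ selecting a time-dependent vector field whose ODE flow $\phi_t^{z_i}$ deforms the template. The following is a minimal sketch of that flow-integration step only, not the authors' implementation. The function `velocity` is a hypothetical hand-written smooth field standing in for the INR $v_\varphi(\cdot, t, z)$, and the fixed-step RK4 integrator, the 2D setting, and all names are assumptions made for illustration.

```python
import numpy as np

def velocity(x, t, z):
    """Hypothetical smooth velocity field standing in for the INR v_phi(x, t, z).

    x: (N, 2) points, t: scalar time, z: (2,) latent code (all assumed shapes).
    """
    rot = np.stack([-x[:, 1], x[:, 0]], axis=1)              # rotation part
    shear = np.stack([np.sin(x[:, 1]), np.zeros(len(x))], axis=1)
    # The latent code scales the rotation and a time-modulated shear.
    return z[0] * rot + z[1] * np.cos(np.pi * t) * shear

def rk4_step(f, x, t, h):
    """One classical Runge-Kutta step of size h (h may be negative)."""
    k1 = f(x, t)
    k2 = f(x + 0.5 * h * k1, t + 0.5 * h)
    k3 = f(x + 0.5 * h * k2, t + 0.5 * h)
    k4 = f(x + h * k3, t + h)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def flow(x0, z, n_steps=100, reverse=False):
    """Integrate d/dt phi_t(x) = v(phi_t(x), t, z) between t=0 and t=1.

    reverse=False maps x0 to phi_1(x0).
    reverse=True integrates from t=1 back to t=0, approximating phi_1^{-1}(x0).
    """
    x = np.array(x0, dtype=float)
    h = 1.0 / n_steps
    t = 0.0
    if reverse:
        t, h = 1.0, -h
    for _ in range(n_steps):
        x = rk4_step(lambda y, s: velocity(y, s, z), x, t, h)
        t += h
    return x

if __name__ == "__main__":
    pts = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]])
    z = np.array([0.5, 0.3])
    warped = flow(pts, z)                      # phi_1 applied to the points
    restored = flow(warped, z, reverse=True)   # approximate inverse flow
    print("max round-trip error:", np.abs(restored - pts).max())
```

Integrating the same field backward in time recovers the starting points up to discretization error, which is the numerical counterpart of the invertibility that a flow of diffeomorphisms provides. With the zero latent code the field vanishes and the flow reduces to the identity map.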

Theorems & Definitions (12)

  • Definition 3.1: Admissible Banach spaces (Younes, 2019)
  • Theorem 3.2: (Younes, 2019)
  • Remark 4.1
  • Remark 4.2
  • Remark 4.3
  • Definition A.1: Tangent space
  • Definition A.2: Riemannian manifold
  • Remark A.3
  • Definition A.4: Fréchet mean
  • Definition A.5: Geodesic submanifold
  • ...and 2 more