RDA-INR: Riemannian Diffeomorphic Autoencoding via Implicit Neural Representations

Sven Dummer; Nicola Strisciuglio; Christoph Brune

TL;DR

This work introduces RDA-INR, a framework that combines resolution-independent implicit neural representations (INRs) with principal geodesic analysis (PGA) under Large Deformation Diffeomorphic Metric Mapping (LDDMM) to provide resolution-independent, physically consistent shape modeling on point clouds and meshes. By embedding LDDMM’s Riemannian geometry into a nonlinear INR-based latent model and solving the resulting flows with neural ODEs, the approach yields meaningful Fréchet means and geodesics in the shape space, robust atlas building, and improved modeling of data variability. Empirical results on synthetic rectangles and liver data show that the Riemannian regularization improves mean-variance analysis, template learning, generalization, and noise robustness compared with non-Riemannian baselines and related neural diffeomorphic models. The framework shows how data representations, Riemannian geometry, and deep learning can be integrated for registration, atlas construction, and statistical shape analysis with diffeomorphic guarantees. Potential extensions include generative modeling and image-domain data.

Abstract

Diffeomorphic registration frameworks such as Large Deformation Diffeomorphic Metric Mapping (LDDMM) are used in computer graphics and the medical domain for atlas building, statistical latent modeling, and pairwise and groupwise registration. In recent years, researchers have developed neural network-based approaches to diffeomorphic registration to improve the accuracy and computational efficiency of traditional methods. In this work, we focus on a limitation of neural network-based atlas building and statistical latent modeling methods, namely that they either (i) are resolution dependent or (ii) disregard any data- or problem-specific geometry needed for proper mean-variance analysis. We overcome this limitation by designing a novel encoder based on resolution-independent implicit neural representations. The encoder achieves resolution invariance for LDDMM-based statistical latent modeling. Additionally, it adds LDDMM Riemannian geometry to resolution-independent deep learning models for statistical latent modeling. We investigate how the Riemannian geometry improves latent modeling and is required for a proper mean-variance analysis. To highlight the benefit of resolution independence for LDDMM-based data variability modeling, we show that our approach outperforms current neural network-based LDDMM latent code models. Our work paves the way for more research into how Riemannian geometry, shape and image analysis, and deep learning can be combined.

Paper Structure (37 sections, 2 theorems, 40 equations, 15 figures, 7 tables)

Introduction
Contributions
Outline
Related work
Principal geodesic analysis
Riemannian geometry for latent space models
Neural dynamics
Preliminaries
Implicit neural representations and implicit shape representations
Diffeomorphic registration and the LDDMM Riemannian distance
Diffeomorphic latent modeling via LDDMM PGA
Riemannian Diffeomorphic Autoencoding via Implicit Neural Representations
Data fidelity term for shape data
Choice of velocity field parameterization and velocity field regularization term
Encoding objects
...and 22 more sections

Key Result

Theorem 3.2

Assume that $V$ is an admissible Banach space. Then $(G, d_G)$ is a complete metric space. Furthermore, assume we consider a set $\mathcal{O}:=\{\phi \cdot O_{\text{temp}} \mid \phi \in G\}$ for some template object $O_{\text{temp}}$. Then we can define a pseudo-distance $d_\mathcal{O}(O_1, O_2)$ on or alternatively: In addition, $d_\mathcal{O}$ is a distance if the action $\phi \rightarrow \phi

Figures (15)

Figure 1: Overview of our RDA-INR framework. We combine LDDMM PGA and resolution-independent implicit neural representation (INR) methods used for joint encoding and registration. This combination yields a method for statistical latent modeling and atlas building that is (i) physically consistent via LDDMM Riemannian geometry and (ii) resolution independent.
Figure 1: RDA-INR framework for resolution-independent and physically consistent statistical latent modeling and atlas building. Our method maps a latent vector $z_i$ to an INR representing a time-dependent vector field $v_\varphi(\cdot, t, z_i)$. This vector field defines a flow of diffeomorphisms $\phi_t^{z_i}$ via an ODE. The diffeomorphisms $\phi_1^{z_i}$ create (reconstructed) objects $O_i$ on a subset of the Riemannian LDDMM manifold $\mathcal{M}$ by deforming a learned template $\mathcal{T}_\theta$. We obtain this deformation by parameterizing $\mathcal{T}_\theta$ with an INR $f_\theta$ and deforming $f_\theta$ via a group action: $I_i(\cdot, t) := (\phi_t^{z_i})^{-1} \cdot f_\theta$. As the template and the vector fields are parameterized by INRs, $I_i$ is a 4D INR deforming the template at $t=0$ to a reconstruction at $t=1$. The goal is to learn the template $\mathcal{T}_\theta$ as the Fréchet mean of the data and to learn the paths $I_i(\cdot, t)$ as geodesics on $\mathcal{M}$ between the template $\mathcal{T}_\theta$ and the data.
Figure 1: Template (atlas) building. The learned templates of two different models trained on the rectangles (Rect.) dataset and the liver dataset. We use the model learned using the Riemannian LDDMM regularization (a and c) and the model learned using the non-Riemannian pointwise loss (b and d). We use $\eta = 0.05$ and $\eta=50$ in Equation \ref{eq:Killing_energy} for the rectangles and liver dataset, respectively.
Figure 1: Learned templates. The template of the model learned with occupancy functions (left) and the model learned with the point cloud data (right). Both templates resemble the rectangles data.
Figure 1: Neural network architectures. Architectures for the template neural network $f_\theta$ and the stationary velocity vector fields $v_{\varphi_k}$ in Equation \ref{eq:v_varphi_quasi_time_varying_velocity_field}. The small red boxes indicate the latent code input $z$ and the spatial input $x$. Furthermore, the small blue box is an output, while the rectangles are linear layers with $d_{vel}$ or $d_\mu$ output dimensions. Finally, the $\oplus$ and $\otimes$ stand for elementwise addition and scalar/elementwise multiplication, respectively.
...and 10 more figures
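
The pipeline described in the framework caption (latent code, velocity field, ODE flow, deformed template) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the rectangle function `template_f` stands in for the learned template INR $f_\theta$, the rigid-rotation field `velocity` stands in for the learned $v_\varphi(\cdot, t, z)$, and a fixed-step Runge-Kutta integrator replaces a neural ODE solver. Following the caption, the deformed representation is evaluated as $I(x, t) = f_\theta(\phi_t^{z}(x))$.

```python
import numpy as np


def template_f(x, half_widths=(1.0, 0.25)):
    """Hypothetical template: implicit function of an axis-aligned rectangle.

    Negative inside the rectangle, positive outside (stand-in for f_theta).
    """
    x = np.atleast_2d(x)
    return np.max(np.abs(x) - np.asarray(half_widths), axis=-1)


def velocity(x, t, z):
    """Hypothetical latent-conditioned velocity field (stand-in for v_phi).

    A rigid rotation with angular speed z[0]; it ignores t, so it is stationary.
    """
    x = np.atleast_2d(x)
    return z[0] * np.stack([-x[:, 1], x[:, 0]], axis=-1)


def flow(x, t_end, z, n_steps=20):
    """Integrate dx/dt = v(x, t, z) from time 0 to t_end with classical RK4."""
    x = np.atleast_2d(np.asarray(x, dtype=float)).copy()
    h = t_end / n_steps
    t = 0.0
    for _ in range(n_steps):
        k1 = velocity(x, t, z)
        k2 = velocity(x + 0.5 * h * k1, t + 0.5 * h, z)
        k3 = velocity(x + 0.5 * h * k2, t + 0.5 * h, z)
        k4 = velocity(x + h * k3, t + h, z)
        x = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
    return x


def deformed_inr(x, t, z):
    """Evaluate I(x, t) = f(phi_t^z(x)): the template pulled through the flow."""
    return template_f(flow(x, t, z))


# Usage: a latent code that rotates by a quarter turn over t in [0, 1].
z = np.array([np.pi / 2.0])
query = np.array([[0.0, 0.9]])
value_at_t0 = deformed_inr(query, 0.0, z)  # template evaluated at the query
value_at_t1 = deformed_inr(query, 1.0, z)  # reconstruction evaluated at the query
```

Because both the template and the flow are functions of continuous coordinates, `deformed_inr` can be queried at arbitrary points and times, which is the sense in which such a representation does not depend on a sampling resolution.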

Theorems & Definitions (12)

Definition 3.1: Admissible Banach spaces (Younes, 2019)
Theorem 3.2 (Younes, 2019)
Remark 4.1
Remark 4.2
Remark 4.3
Definition A.1: Tangent space
Definition A.2: Riemannian manifold
Remark A.3
Definition A.4: Fréchet mean
Definition A.5: Geodesic submanifold
...and 2 more
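
For orientation, the Fréchet mean named in Definition A.4 is usually stated as below. This is the standard textbook form on a metric space $(\mathcal{M}, d)$ and is not quoted from the paper; the data points $x_1, \dots, x_N$ and the symbol $\bar{x}$ are assumed notation.

$$
\bar{x} \in \operatorname*{arg\,min}_{x \in \mathcal{M}} \sum_{i=1}^{N} d(x, x_i)^2
$$

In Euclidean space with the usual distance, this minimizer is the arithmetic mean of the $x_i$; on a Riemannian manifold, $d$ is the geodesic distance and the minimizer need not be unique.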
