Rates and architectures for learning geometrically non-trivial operators

T. Mitchell Roddenberry, Leo Tzou, Ivan Dokmanić, Maarten V. de Hoop, Richard G. Baraniuk

TL;DR

This work extends operator learning to geometrically non-trivial mappings by introducing double fibration transforms, such as Radon and geodesic ray transforms, and proving they are not cursed by dimensionality when learned from random, bandlimited samples. It develops a levelset integral kernel parameterization that implicitly encodes the incidence relation and amplitude, with a cross-attention–style factorization that achieves universality and stability under discretization. Theoretical results show superalgebraic learning rates under Bolker-type geometry and smoothing-induced bandwidth control, while practical methods enable accurate learning from very few examples. Empirically, the approach excels at learning Radon and ray transforms, identifies codimension through Jacobian analysis, and extends to Riemannian inverse problems, including wavespeed recovery, demonstrating meaningful gains in data efficiency and interpretability for geometric inverse problems.

Abstract

Deep learning methods have proven capable of recovering operators between high-dimensional spaces, such as solution maps of PDEs and similar objects in mathematical physics, from very few training samples. This data efficiency has been proven for certain classes of elliptic operators with simple geometry, i.e., operators that do not change the domain of the function or propagate singularities. However, scientific machine learning is commonly used for problems that do involve the propagation of singularities in a priori unknown ways, such as waves, advection, and fluid dynamics. In light of this, we expand the learning theory to include double fibration transforms: geometric integral operators that include generalized Radon and geodesic ray transforms. We prove that this class of operators does not suffer from the curse of dimensionality: the error decays superalgebraically, that is, faster than any fixed power of the reciprocal of the number of training samples. Furthermore, we investigate architectures that explicitly encode the geometry of these transforms, demonstrating that an architecture reminiscent of cross-attention based on levelset methods yields a parameterization that is universal, stable, and learns double fibration transforms from very few training examples. Our results contribute to a rapidly growing line of theoretical work on learning operators for scientific machine learning.
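
The phrase "decays superalgebraically" can be made concrete as follows. This is a sketch of the standard meaning only: the symbol $\mathrm{err}(J)$ is a placeholder introduced here for the learning error after $J$ training samples, and the precise norm and probabilistic sense used in the paper's theorems are not reproduced.

$$
\text{for every } k\in\mathbb{N} \text{ there is a constant } C_k>0 \text{ such that}\quad \mathrm{err}(J)\le C_k\,J^{-k}\quad\text{for all } J\ge 1 .
$$

By contrast, a rate that suffers from the curse of dimensionality typically has the form $\mathrm{err}(J)\le C\,J^{-c/d}$, with an exponent that shrinks as the dimension $d$ grows.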

Paper Structure

This paper contains 23 sections, 8 theorems, 76 equations, 5 figures.

Key Result

Proposition 1

Let $R$ be a double fibration transform for an incidence submanifold $Z$ with smooth nonvanishing measure $\mu$. Then, for any $u\in\mathcal{D}'(X)$ and $(y,\eta)\in T^{*}Y$, it holds that $(y,\eta)\in\mathop{\mathrm{WF}}\nolimits(Ru)$ if and only if there is an $(x,\xi)\in\mathop{\mathrm{WF}}\nolimits(u)$ such that $(y,-\eta;x,\xi)\in N^{*}Z$, where $N^{*}Z$ denotes the conormal bundle of $Z\subset Y\times X$.
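
As a hedged illustration of the conormal relation (a worked example in coordinates assumed here, not taken from the paper), consider the Radon transform in $\mathbb{R}^2$. Parameterize lines by $y=(s,\theta)$ with unit normal $\omega(\theta)=(\cos\theta,\sin\theta)$, so that $Z=\{(y,x): f(y,x)=0\}$ with $f(y,x)=x\cdot\omega(\theta)-s$. The conormal bundle of a hypersurface is spanned by the differential of its defining function, $df=-\,ds+x\cdot\omega^{\perp}(\theta)\,d\theta+\omega(\theta)\cdot dx$, which gives

$$
N^{*}Z=\bigl\{\bigl(s,\theta,x;\ t\,(-1,\ x\cdot\omega^{\perp}(\theta)),\ t\,\omega(\theta)\bigr)\ :\ x\cdot\omega(\theta)=s,\ t\in\mathbb{R}\bigr\},\qquad \omega^{\perp}(\theta)=(-\sin\theta,\cos\theta).
$$

With the sign convention $(y,-\eta;x,\xi)\in N^{*}Z$ that appears in the Figure 5 caption, this reads $\xi=t\,\omega(\theta)$ and $\eta=t\,(1,\,-x\cdot\omega^{\perp}(\theta))$ with $t\neq 0$. A singularity of $u$ at $x$ with direction $\xi$ can therefore only appear in $Ru$ at the line through $x$ whose normal is parallel to $\xi$, which matches the localized response to the modulated Gabor atom described in Figure 1.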

Figures (5)

  • Figure 1: (A) Geometry of the Radon transform (Example 2). (A1) Measurement domain $Y$, with a point $y$ (blue) and a fiber $H_x$ (red). (A2) Target domain $X$, with the fiber $G_y$ corresponding to $y$ (blue) and the point $x$ defining the fiber $H_x$ (red). (B) The Radon transform of Gabor atoms. (B1) Radon transform of a function $u(x)$ given by the sum of three Gabor atoms, two of which are unmodulated (B2). That is, $\hat{\xi}=0$ for an unmodulated atom $g_{\hat{x},\hat{\xi}}$. Each unmodulated Gabor atom $g_{\hat{x},0}$ is mapped to a neighborhood of the whole fiber $H_{\hat{x}}$, while the modulated Gabor atom is only mapped to a small region dictated by the conormal bundle of $Z$.
  • Figure 2: Levelset integral kernels can learn the Radon transform from few samples. (A) Relative MSE of learned operators when evaluated on ID (A1) and OOD (A2) data, averaged over five training runs. Levelset integral kernels exhibit superalgebraic test error decay in the small-sample ($J\leq 32$) regime, before performance saturates due to discretization. (B) Example operators applied to OOD test image $u$, pictured in (B4). (B1) Estimated transform by MPP from $J=128$ samples. (B2) Estimated transform $L^{\lambda}u$ by levelset integral kernel trained on $J=128$ samples with $m=2$. (B3) Ground truth Radon transform $Ru$. (C) Example training data pair $(u_j,Ru_j)$ where $u_j$ is sampled from a Matérn random field. The levelset methods outperform the others both quantitatively and qualitatively.
  • Figure 3: Relative MSE of learned operators for approximating the Euclidean ray transform in $\mathbb{R}^3$, averaged over five training runs. Overspecification of the codimension ($m=3$) of the integral transform yields superior performance to more constrained methods, especially when compared to the underspecified codimension ($m=1$).
  • Figure 4: The effective rank of the Jacobian evaluated at incident points indicates the codimension of the underlying geometric relation. Distribution of the effective Jacobian rank for learned approximation to the (A) Radon transform in $\mathbb{R}^2$ from $J=128$ training samples, (B) Euclidean ray transform from $J=256$ training samples, and (C) Radon transform in $\mathbb{R}^3$ from $J=256$ training samples. All histograms overlay the empirical distributions over five training runs, with the true underlying codimension $n''$ and the dimension $n$ of the domain marked.
  • Figure 5: Geometry of a learned geodesic ray transform. (A) Levelset function $f(y,x)$ for fixed $x\in X$ (A1) and $y\in Y$ (A2). The fibers $H_x,G_y$ are shown, with amplitude $a(y,x)$ indicated by the shade (darker $=$ larger). (B1) Function $L^\lambda u$ with $(y,\eta)\in\mathop{\mathrm{WF}}\nolimits(Lu)$. (B2) Function $u$ with $(x,\xi)\in\mathop{\mathrm{WF}}\nolimits(u)$, such that $(y,-\eta;x,\xi)\in N^*Z$. (C1,C2) Estimated wavespeeds for $J=32,512$, with true geodesic rays (solid) and estimated rays with same initial point $y\in Y$ (dashed).
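
The captions above refer to a levelset function $f(y,x)$ and an amplitude $a(y,x)$. The following is a minimal NumPy sketch of how such a pair can define an integral operator, here with the hand-written (not learned) choice $f(y,x)=x\cdot\omega(\theta)-s$ and $a\equiv 1$, which gives the Radon transform in $\mathbb{R}^2$. The Gaussian mollifier `delta_eps`, the grid quadrature, the width `eps`, and all function names are assumptions made for illustration; this is not the paper's architecture or training procedure.

```python
import numpy as np

def radon_levelset(y, x):
    """Levelset function f(y, x) = x . omega(theta) - s.

    It vanishes exactly when the point x lies on the line y = (s, theta).
    Its x-gradient has unit length, so delta(f) integrates against arc length.
    """
    s, theta = y
    omega = np.array([np.cos(theta), np.sin(theta)])
    return x @ omega - s

def levelset_integral(u_vals, x, cell_area, y, f=radon_levelset, a=None, eps=0.08):
    """Grid quadrature for (Lu)(y) = integral of a(y, x) * delta_eps(f(y, x)) * u(x) dx.

    delta_eps is a narrow Gaussian standing in for the Dirac delta
    (an assumption of this sketch, not the paper's choice).
    """
    fx = f(y, x)
    delta_eps = np.exp(-0.5 * (fx / eps) ** 2) / (eps * np.sqrt(2.0 * np.pi))
    amp = 1.0 if a is None else a(y, x)
    return float(np.sum(amp * delta_eps * u_vals) * cell_area)

# Test function: an isotropic Gaussian, whose Radon transform is known in closed form.
sigma = 0.5
grid = np.linspace(-4.0, 4.0, 201)
h = grid[1] - grid[0]
X1, X2 = np.meshgrid(grid, grid, indexing="ij")
x = np.stack([X1.ravel(), X2.ravel()], axis=1)          # (N, 2) grid points
u_vals = np.exp(-np.sum(x ** 2, axis=1) / (2.0 * sigma ** 2))

def exact_radon_gaussian(s):
    """Closed-form Radon transform of exp(-|x|^2 / (2 sigma^2)); independent of theta."""
    return np.sqrt(2.0 * np.pi) * sigma * np.exp(-s ** 2 / (2.0 * sigma ** 2))

for s, theta in [(0.0, 0.0), (0.3, 0.7), (-0.5, 2.0)]:
    approx = levelset_integral(u_vals, x, h * h, (s, theta))
    exact = exact_radon_gaussian(s)
    print(f"s={s:+.2f} theta={theta:.2f} approx={approx:.4f} exact={exact:.4f}")
```

Shrinking `eps` reduces the smoothing bias but requires a finer grid for the quadrature to resolve the mollified delta; in a learned variant, `f` and `a` would be replaced by trainable functions of $(y,x)$.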

Theorems & Definitions (30)

  • Example 1: Diffeomorphism
  • Example 2: Radon Transform
  • Example 3: Euclidean Ray Transform
  • Example 4: Spherical Mean Transform
  • Proposition 1 [mazzucchelli2023]
  • Definition 1: Bolker Condition [GuilleminSternberg, Guillemin1985, mazzucchelli2023]
  • Remark 1
  • Theorem 1
  • Definition 2
  • Remark 2
  • ...and 20 more