Operator learning without the adjoint

Nicolas Boullé, Diana Halikias, Samuel E. Otto, Alex Townsend

TL;DR

This paper proves that, without querying the adjoint, one can approximate a family of non-self-adjoint infinite-dimensional compact operators via projection onto a Fourier basis. It then derives an adjoint-free sample complexity bound for recovering Green's functions of elliptic partial differential operators.

Abstract

There is a mystery at the heart of operator learning: how can one recover a non-self-adjoint operator from data without probing the adjoint? Current practical approaches suggest that one can accurately recover an operator while only using data generated by the forward action of the operator without access to the adjoint. However, naively, it seems essential to sample the action of the adjoint. In this paper, we partially explain this mystery by proving that without querying the adjoint, one can approximate a family of non-self-adjoint infinite-dimensional compact operators via projection onto a Fourier basis. We then apply the result to recovering Green's functions of elliptic partial differential operators and derive an adjoint-free sample complexity bound. While existing theory justifies low sample complexity in operator learning, ours is the first adjoint-free analysis that attempts to close the gap between theory and practice.
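The projection idea in the abstract can be illustrated in finite dimensions. The sketch below is not the paper's algorithm; it assumes a finite-difference discretization of a 1D advection-diffusion solution operator and uses discrete sine modes (the Dirichlet Laplacian eigenvectors) as a stand-in for the Fourier basis. The helper names `sine_basis`, `forward_only_approximation`, and `advection_diffusion_solution_operator` are hypothetical. The point is that $A P_n = (A Q_n) Q_n^\top$ can be assembled from the $n$ forward products $A q_1, \dots, A q_n$ alone, with no products involving $A^*$.

```python
import numpy as np

def sine_basis(m, n):
    """First n discrete sine modes on m interior grid points (orthonormal columns)."""
    j = np.arange(1, m + 1)[:, None]
    k = np.arange(1, n + 1)[None, :]
    return np.sqrt(2.0 / (m + 1)) * np.sin(np.pi * j * k / (m + 1))

def forward_only_approximation(apply_A, Q):
    """Build A P_n = (A Q) Q^T from forward products only; A^* is never applied."""
    AQ = apply_A(Q)      # one forward query per basis vector
    return AQ @ Q.T

def advection_diffusion_solution_operator(m, c):
    """Assumed test operator: inverse of -u'' + c u' with Dirichlet conditions."""
    h = 1.0 / (m + 1)
    off = np.ones(m - 1)
    D2 = (np.diag(-2.0 * np.ones(m)) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    D1 = (np.diag(off, 1) - np.diag(off, -1)) / (2.0 * h)
    return np.linalg.inv(-D2 + c * D1)

m, c = 200, 5.0
A = advection_diffusion_solution_operator(m, c)   # non-symmetric because c != 0
apply_A = lambda X: A @ X                          # the only access to A that is used

ranks = [5, 20, 80]
errors = [
    np.linalg.norm(A - forward_only_approximation(apply_A, sine_basis(m, n)), 2)
    for n in ranks
]
for n, err in zip(ranks, errors):
    print(f"n = {n:3d}   relative spectral error = {err / np.linalg.norm(A, 2):.2e}")
```

Because the projections are nested, the error $\|A - A P_n\|$ cannot increase with $n$, and it vanishes (up to rounding) once $n$ equals the grid size. How fast it decays for a given operator is what the paper's bounds address; this sketch makes no claim about the rate.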

Paper Structure

This paper contains 25 sections, 19 theorems, 186 equations, 5 figures, 3 algorithms.

Key Result

Lemma 3

Let $U$ and $V$ be two $n\times k$ matrices with orthonormal columns and $U^\ast V = Q_l \Sigma Q_r^\ast\in \mathbb{C}^{k\times k}$ be a complete SVD. Then, the orthonormal matrix $Q_0 = Q_l Q_r^\ast$ satisfies
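The lemma's conclusion is cut off in this summary, so the following sketch illustrates only the construction of $Q_0$, not the inequality the lemma proves. It assumes random complex test matrices, and the helper names `random_orthonormal` and `alignment_factor` are hypothetical. It checks two standard facts: $Q_0 = Q_l Q_r^\ast$ is unitary, and $Q_0$ solves the orthogonal Procrustes problem, meaning it minimizes $\|UQ - V\|_F$ over unitary $Q$.

```python
import numpy as np

def random_orthonormal(n, k, rng):
    """Random n x k complex matrix with orthonormal columns (reduced QR)."""
    Z = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(Z)
    return Q

def alignment_factor(U, V):
    """Return Q0 = Ql Qr^* and the singular values of U^* V."""
    # np.linalg.svd returns (Ql, sigma, Qr^*), so the third output is already Qr^*.
    Ql, sigma, Qr_h = np.linalg.svd(U.conj().T @ V)
    return Ql @ Qr_h, sigma

rng = np.random.default_rng(0)
n, k = 50, 4
U = random_orthonormal(n, k, rng)
V = random_orthonormal(n, k, rng)

Q0, sigma = alignment_factor(U, V)

# Misfit of the aligned bases versus randomly chosen unitary alignments.
best = np.linalg.norm(U @ Q0 - V)
others = [np.linalg.norm(U @ random_orthonormal(k, k, rng) - V) for _ in range(20)]

print("Q0 unitary:", np.allclose(Q0.conj().T @ Q0, np.eye(k)))
print("Procrustes misfit with Q0:", best)
print("smallest misfit among random unitaries:", min(others))
```

The singular values in `sigma` are the cosines of the principal angles between the column spaces of $U$ and $V$, so they lie in $[0, 1]$.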

Figures (5)

  • Figure 1: Green's functions learned by a rational neural network (top row) along with the absolute error with the exact Green's function (bottom row) for the stationary convection-diffusion equation, with coefficients (a) $c=0$, (b) $c=5$, and (c) $c=10$.
  • Figure 2: (a) Relative errors for learning the Green's function of the advection-diffusion equation. The graph displays the mean error over ten runs, along with error bars representing the first and third quartiles. (b) Relative errors of the Green's function after training using input-output pairs sampled on a grid with resolution $s=200$ (dashed line) and evaluated at lower and higher resolutions. (c)-(d) Evolution of the loss function after training and relative error as the magnitude of the perturbation increases. The black line in (d) represents the linear least squares approximation and achieves $R^2=0.8$.
  • Figure 3: (a) Left- and right-hand sides of the bound given by \ref{thm:operator_projection} for the approximation of the solution operator of the 1D advection-diffusion equation \ref{eq_advection_diffusion} by the eigenfunctions of the Laplacian operator. (b) Convergence of the spectral norm of the matrix $M_n$ to $\|LA^*\|$ as $n$ increases, illustrating \ref{lem:LAstar_norm}.
  • Figure 4: Solution to the 3D linear elasticity equations \ref{eq_linear_elasticity} with right-hand side $f=(0,0,-1.6\times 10^{-2})$, modeling the deformation of the beam under its own weight. The beam is deformed according to the displacement field $u$, and the color map represents the magnitude of the displacement.
  • Figure 5: (a) Normalized left- and right-hand sides of the bound \ref{thm:operator_projection} for approximating the solution operator of the 2D linear elasticity equations using eigenfunctions of the Laplacian operator. (b) Convergence of the spectral norm of the matrix $M_n$ to the constant $\|LA^*\|$ as $n$ increases. (c)-(d) Same as (a)-(b) but for the 3D linear elasticity equations. The eigenvalues of the solution operator of the Laplacian operator decay as $\mathcal{O}(n^{-1})$ in 2D and $\mathcal{O}(n^{-2/3})$ in 3D, as predicted by \ref{decay_eig_law}.

Theorems & Definitions (31)

  • Definition 1: Near-symmetry
  • Remark 2: Low-rank recovery algorithms and $\Omega_{F, X}^\epsilon$
  • Lemma 3
  • Theorem 4: Upper bound
  • Theorem 5: Lower bound
  • Remark 6: Orthonormal test matrices
  • Remark 7
  • Lemma 8
  • Theorem 9
  • Remark 10: Role of $\|LA^*\|$
  • ...and 21 more