A Kernel-based Stochastic Approximation Framework for Nonlinear Operator Learning

Jia-Qi Yang, Lei Shi

TL;DR

A stochastic approximation framework for learning nonlinear operators between infinite-dimensional spaces using general operator-valued kernels. It accommodates a wide range of operator learning tasks, from integral operators such as Fredholm operators to architectures based on encoder-decoder representations.

Abstract

We develop a stochastic approximation framework for learning nonlinear operators between infinite-dimensional spaces utilizing general Mercer operator-valued kernels. Our framework encompasses two key classes: (i) compact kernels, which admit discrete spectral decompositions, and (ii) diagonal kernels of the form $K(x,x')=k(x,x')T$, where $k$ is a scalar-valued kernel and $T$ is a positive operator on the output space. This broad setting induces expressive vector-valued reproducing kernel Hilbert spaces (RKHSs) that generalize the classical $K=kI$ paradigm, thereby enabling rich structural modeling with rigorous theoretical guarantees. To address target operators lying outside the RKHS, we introduce vector-valued interpolation spaces to precisely quantify misspecification error. Within this framework, we establish dimension-free polynomial convergence rates, demonstrating that nonlinear operator learning can overcome the curse of dimensionality. The use of general operator-valued kernels further allows us to derive rates for intrinsically nonlinear operator learning, going beyond the linear-type behavior inherent in diagonal constructions of $K=kI$. Importantly, this framework accommodates a wide range of operator learning tasks, ranging from integral operators such as Fredholm operators to architectures based on encoder-decoder representations. Moreover, we validate its effectiveness through numerical experiments on the two-dimensional Navier-Stokes equations.
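
To make the diagonal case $K(x,x')=k(x,x')T$ concrete, the following is a minimal sketch of an online kernel update under stated assumptions; it is not the authors' implementation. Functions are discretized to finite vectors, $k$ is taken to be a Gaussian kernel, $T$ is a fixed positive definite matrix, the loss is least squares, and the step size decays as $\eta_t=\eta_0 t^{-\theta}$. The names `DiagonalKernelSGD` and `run_toy_example`, the toy target, and all parameter values are hypothetical.

```python
import numpy as np


class DiagonalKernelSGD:
    """Online kernel SGD with a diagonal kernel K(x, x') = k(x, x') T.

    Assumptions (not from the paper): k is a Gaussian kernel, T is a
    positive semidefinite matrix acting on discretized outputs, and the
    step size follows eta_t = eta0 * t**(-theta).
    """

    def __init__(self, T, bandwidth=1.0, eta0=1.0, theta=0.5):
        self.T = np.asarray(T, dtype=float)
        self.bandwidth = bandwidth
        self.eta0 = eta0
        self.theta = theta
        self.centers = []  # past inputs x_i
        self.coeffs = []   # vectors T c_i, so f(x) = sum_i k(x_i, x) T c_i

    def _k(self, x):
        """Gaussian kernel values k(x_i, x) for all stored centers."""
        centers = np.asarray(self.centers)
        sq_dist = np.sum((centers - x) ** 2, axis=1)
        return np.exp(-sq_dist / (2.0 * self.bandwidth ** 2))

    def predict(self, x):
        """Evaluate the current estimate f_t at x."""
        if not self.centers:
            return np.zeros(self.T.shape[0])
        return self._k(x) @ np.asarray(self.coeffs)

    def update(self, x, y):
        """One step f <- f - eta_t * k(x, .) T (f(x) - y); returns the residual."""
        t = len(self.centers) + 1
        eta = self.eta0 * t ** (-self.theta)
        residual = self.predict(x) - y
        self.centers.append(np.asarray(x, dtype=float))
        self.coeffs.append(-eta * (self.T @ residual))
        return residual


def run_toy_example(n_steps=500, dim=4, seed=0):
    """Stream samples of a toy nonlinear map and record squared residuals."""
    rng = np.random.default_rng(seed)
    idx = np.arange(dim)
    # Hypothetical choice of T: a smoothing matrix scaled to spectral norm 1.
    T = np.exp(-np.abs(idx[:, None] - idx[None, :]))
    T /= np.linalg.norm(T, 2)
    model = DiagonalKernelSGD(T)
    sq_errors = []
    for _ in range(n_steps):
        x = rng.uniform(-1.0, 1.0, size=dim)
        y = np.sin(x)  # toy pointwise nonlinear target, noise-free
        r = model.update(x, y)
        sq_errors.append(float(r @ r))
    return np.array(sq_errors)
```

Storing $Tc_i$ rather than $c_i$ keeps each prediction to a single weighted sum over past samples. The cost per step grows linearly with the number of stored samples, which is typical of unregularized online kernel methods.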

Paper Structure

This paper contains 14 sections, 16 theorems, 123 equations, and 3 figures.

Key Result

Theorem 2.3

For any $0<\beta<1$, we have and the spaces $[\mathcal{H}_K]^{\beta}$ and $\left[L^2(\mathcal{X},\rho_{\mathcal{X}};\mathcal{Y}),[\mathcal{H}_K]^{1}\right]_{\beta,2}$ have equivalent norms. Concretely, there exist constants $c_\beta$, $C_\beta>0$, such that for any $f\in\ker L_K^\perp$,

Figures (3)

  • Figure 1: Commutative diagram of the operator learning framework in the subsection "Example: Learning via Encoder–Decoder Frameworks".
  • Figure 2: Example of a test sample for the Navier-Stokes problem
  • Figure 3: Log–log plots of the prediction and relative errors over iterations in the online setting. Dashed lines indicate linear fits applied from iteration 160 to 34,000. The prediction and relative errors exhibit approximate polynomial decay rates of $\mathcal{O}(t^{-0.79})$ and $\mathcal{O}(t^{-0.42})$, respectively.
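
The decay rates quoted in the Figure 3 caption come from linear fits on log-log axes. A minimal sketch of such a fit is given here, assuming the per-iteration errors are available as arrays. The function name `fit_decay_rate` is hypothetical, and the data in the usage lines are synthetic, not the paper's results; only the fitting window of iterations 160 to 34,000 is taken from the caption.

```python
import numpy as np


def fit_decay_rate(iterations, errors, t_min, t_max):
    """Fit errors ~ C * t**(-rate) by least squares on log-log axes.

    Only iterations in [t_min, t_max] with strictly positive error are used.
    Returns (rate, C).
    """
    t = np.asarray(iterations, dtype=float)
    e = np.asarray(errors, dtype=float)
    mask = (t >= t_min) & (t <= t_max) & (e > 0)
    # polyfit returns coefficients from highest degree down: slope, intercept.
    slope, intercept = np.polyfit(np.log(t[mask]), np.log(e[mask]), deg=1)
    return -slope, float(np.exp(intercept))


# Synthetic check: an exact power law with exponent 0.79 is recovered.
t = np.arange(1, 40001)
synthetic_errors = 3.0 * t ** (-0.79)
rate, const = fit_decay_rate(t, synthetic_errors, t_min=160, t_max=34000)
```

Restricting the fit to a window excludes the initial transient, where the error has not yet settled into its asymptotic regime.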

Theorems & Definitions (36)

  • Definition 2.1: Vector-valued interpolation space
  • Definition 2.2: $K$-functional (Triebel, 1995)
  • Theorem 2.3
  • Remark 1
  • Remark 2
  • Remark 3
  • Theorem 2.4
  • Theorem 2.5
  • Theorem 2.6
  • Proposition 3.1
  • ...and 26 more
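
For orientation, the real interpolation space $[X_0,X_1]_{\beta,2}$ appearing in Theorem 2.3 is conventionally defined through the $K$-functional. The standard textbook form is recalled here, with $X_0=L^2(\mathcal{X},\rho_{\mathcal{X}};\mathcal{Y})$ and $X_1=[\mathcal{H}_K]^{1}$. The paper's Definition 2.2 may use a different but equivalent normalization, so this should be read as an assumption rather than a quotation.

```latex
K(f,t) = \inf_{g \in X_1} \left( \|f-g\|_{X_0} + t\,\|g\|_{X_1} \right), \qquad t>0,
\qquad
\|f\|_{[X_0,X_1]_{\beta,2}}^{2} = \int_0^\infty \left( t^{-\beta} K(f,t) \right)^{2} \frac{dt}{t},
\qquad 0<\beta<1.
```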