Sample-based almost-sure quasi-optimal approximation in reproducing kernel Hilbert spaces

Nando Hegemann, Anthony Nouy, Philipp Trunschke

TL;DR

The paper studies sample-efficient function approximation in RKHSs by introducing a kernel-based projector $P_{\mathcal{V}_d}^{\boldsymbol{x}}$ whose error is quasi-optimal relative to the best $\mathcal{V}$-projection, with a computable bound involving $\mu(\boldsymbol{x})$ and $\tau(\boldsymbol{x})$. It develops probabilistic sampling strategies, notably continuous volume sampling and the novel subspace-informed volume sampling (SIVS), to select point sets that ensure small $\mu(\boldsymbol{x})$ with high probability, achieving near-linear sample complexity in $d$ under favorable eigenvalue decay. The authors also introduce a greedy subsampling scheme with theoretical (approximate) submodularity properties to reduce sample sizes further and extend the framework to noisy evaluations. Through experiments on multiple RKHSs and measures, the method demonstrates substantial improvements over classical sampling strategies, suggesting practical applicability to inverse problems and the PBDW framework. Overall, the work provides a theoretically grounded, computationally feasible approach for almost-sure quasi-optimal approximation from few, strategically chosen samples.

Abstract

This paper addresses the problem of approximating an unknown function from point evaluations. When obtaining these point evaluations is costly, minimising the required sample size becomes crucial, and it is unreasonable to reserve a sufficiently large test sample for estimating the approximation accuracy. Therefore, an approximation with a certified quasi-optimality factor is required. This article shows that such an approximation can be obtained when the sought function lies in a reproducing kernel Hilbert space (RKHS) and is to be approximated in a finite-dimensional linear subspace $\mathcal{V}_d$. However, selecting the sample points to minimise the quasi-optimality factor requires optimising over an infinite set of points and computing exact inner products in RKHS, which is often infeasible in practice. Extending results from optimal sampling for $L^2$ approximation, the present paper proves that random points, drawn independently from the Christoffel sampling distribution associated with $\mathcal{V}_d$, can yield a controllable quasi-optimality factor with high probability. Inspired by this result, a novel sampling scheme, coined subspace-informed volume sampling, is introduced and evaluated in numerical experiments, where it outperforms classical i.i.d. Christoffel sampling and continuous volume sampling. To reduce the size of such a random sample, an additional greedy subsampling scheme with provable suboptimality bounds is introduced. Our presentation is of independent interest to the inverse problems community, as it offers a simpler interpretation of the parametrised background data weak (PBDW) method.
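As an illustration of the i.i.d. Christoffel sampling baseline mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the classical $L^2$ setting, where $\mathcal{V}_d$ is spanned by polynomials of degree less than $d$ on $[-1,1]$ and the reference measure is $\rho = \tfrac{1}{2}\,\mathrm{d}x$ (the setting named in the figure captions). The function names `christoffel_density` and `sample_christoffel` are hypothetical, and plain rejection sampling is used here only because it is simple. Subspace-informed volume sampling is not shown.

```python
import numpy as np
from numpy.polynomial import legendre


def christoffel_density(x, d):
    """Density of the L^2-Christoffel measure w.r.t. rho = dx/2 on [-1, 1].

    Assumes V_d = polynomials of degree < d (an assumption of this sketch).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # legvander returns P_0, ..., P_{d-1}; scaling by sqrt(2j + 1)
    # makes them orthonormal with respect to rho = dx/2.
    basis = legendre.legvander(x, d - 1) * np.sqrt(2 * np.arange(d) + 1)
    return np.sum(basis**2, axis=1) / d


def sample_christoffel(n, d, rng):
    """Draw n i.i.d. points by rejection sampling from the uniform proposal."""
    # |P_j| <= 1 on [-1, 1], so the density is bounded by (1/d) * sum(2j + 1) = d.
    bound = float(d)
    samples = []
    while len(samples) < n:
        candidates = rng.uniform(-1.0, 1.0, size=2 * n)
        heights = rng.uniform(0.0, bound, size=candidates.size)
        accepted = candidates[heights < christoffel_density(candidates, d)]
        samples.extend(accepted.tolist())
    return np.array(samples[:n])
```

For example, `sample_christoffel(20, 10, np.random.default_rng(0))` returns 20 points in $[-1,1]$. The expected acceptance rate of this scheme is $1/d$, so it is only sensible for moderate $d$.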

Paper Structure

This paper contains 29 sections, 23 theorems, 159 equations, 8 figures, 2 algorithms.

Key Result

Theorem 1

Let $u\in\mathcal{V}$ and $\boldsymbol{x}\in\mathcal{X}^n$. There exist (computable) constants $1\le\mu(\boldsymbol{x})$ and $0\le\tau(\boldsymbol{x})\le1$ such that $P^{\boldsymbol{x}}_{\mathcal{V}_d} u$ is well-defined when $\mu(\boldsymbol{x}) < \infty$ and satisfies a quasi-optimality bound, relative to the best approximation of $u$ in $\mathcal{V}_d$, whose factor depends on $\mu(\boldsymbol{x})$ and $\tau(\boldsymbol{x})$. Moreover, it holds that $\tau(\boldsymbol{x}) \le \min\{(1 + \sqrt{d})(1 - \mu(\boldsymbol{x})^{-2}),1\}$.
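To get a feeling for the bound on $\tau(\boldsymbol{x})$, the following short derivation works out when it improves on the trivial bound $\tau(\boldsymbol{x})\le 1$. It uses only the inequality stated in the theorem; the numerical values for $d = 10$ are an illustrative choice.

```latex
(1 + \sqrt{d})\,\bigl(1 - \mu(\boldsymbol{x})^{-2}\bigr) < 1
\;\iff\;
\mu(\boldsymbol{x})^{-2} > \frac{\sqrt{d}}{1 + \sqrt{d}}
\;\iff\;
\mu(\boldsymbol{x}) < \sqrt{1 + d^{-1/2}} .

% Example with d = 10:
%   threshold: \sqrt{1 + 10^{-1/2}} \approx 1.147
%   at \mu(\boldsymbol{x}) = 2: (1 + \sqrt{10})(1 - 1/4) \approx 3.12 > 1,
%   so the minimum is attained by 1 and the bound reduces to \tau(\boldsymbol{x}) \le 1.
```

In other words, the refined bound on $\tau(\boldsymbol{x})$ only becomes active when $\mu(\boldsymbol{x})$ is already close to $1$, and the admissible range shrinks as $d$ grows.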

Figures (8)

  • Figure 1: Visualisation of projections involved in the PBDW method and relation of $\mu(x)$ to the angle between $\mathcal{V}_d$ and $\mathcal{V}_x$ for linear $\mathcal{V}_d$, nonlinear $\mathcal{V}_d$ with bounded $\mu(x)$, and nonlinear $\mathcal{V}_d$ with unbounded $\mu(x)$. Here $u^{d,\boldsymbol{x}}$ is the interpolant defined in the paper.
  • Figure 2: Distribution and bounding intervals of $Z_i(\boldsymbol{x}_{<i})$ for $\boldsymbol{x}\in\mathcal{X}^d$ drawn according to $x_i\sim q(\,\cdot \mid \boldsymbol{x}_{<i})\, \rho_i$, with $d=10$ and $\mathcal{V}$ and $\mathcal{V}_d$ as defined in the paper's experiments sections (the $H^1$, $H^1_0$ and Gaussian settings).
  • Figure 3: Phase diagrams for the probability $\mathbb{P}[\mu(\boldsymbol{x}) \le 2]$ with $\mathcal{V} = H^1([-1,1], \tfrac{1}{2}\,\mathrm{d}x)$ and where $\mathcal{V}_d$ is spanned by polynomials. The probability is estimated using $200$ independent samples $\boldsymbol{x}\in\mathcal{X}^n$ for different dimensions $d$ and sample sizes $n$. White marks a probability of $1$, black a probability of $0$. Points having $\mathbb{P}[\mu(\boldsymbol{x}) \le 2] \ge \tfrac{1}{2}$ are marked with bold, red borders. The factor for the linear rate in the phase diagram for subspace-informed volume sampling is $1.5$.
  • Figure 4: Violin plot of the submodular surrogate $\eta$ and the suboptimality constant $\mu$ for the first $20$ steps of the greedy optimisation procedure. The initial sample $\mathcal{D}$ is of size $100$ and drawn using the $L^2$-Christoffel sampling method. The experiment was repeated $100$ times to compute the violins. Here $\mathcal{V} = H^1([-1,1], \tfrac{1}{2}\,\mathrm{d}x)$ and $d=10$, with $\mathcal{V}_d$ spanned by polynomials.
  • Figure 5: Phase diagrams for the probability $\mathbb{P}[\mu(\boldsymbol{x}) \le 2]$ with $\mathcal{V} = H^1_0([-1,1], \tfrac{1}{2}\,\mathrm{d}x)$ and where $\mathcal{V}_d$ is spanned by polynomials. The probability is estimated using $200$ independent samples $\boldsymbol{x}\in\mathcal{X}^n$ for different dimensions $d$ and sample sizes $n$. White marks a probability of $1$, black a probability of $0$. Points having $\mathbb{P}[\mu(\boldsymbol{x}) \le 2] \ge \tfrac{1}{2}$ are marked with bold, red borders. The factor for the linear rate in the phase diagram for subspace-informed volume sampling is $1.5$.
  • ...and 3 more figures
