On the Optimality of Misspecified Spectral Algorithms

Haobo Zhang, Yicheng Li, Qian Lin

TL;DR

It is shown that spectral algorithms are minimax optimal for any $\alpha_{0}-\frac{1}{\beta}<s<1$, where $\beta$ is the eigenvalue decay rate of $\mathcal{H}$.

Abstract

In the misspecified spectral algorithms problem, researchers usually assume that the underlying true function satisfies $f_{\rho}^{*} \in [\mathcal{H}]^{s}$, a less-smooth interpolation space of a reproducing kernel Hilbert space (RKHS) $\mathcal{H}$, for some $s\in (0,1)$. The existing minimax optimality results require $\|f_{\rho}^{*}\|_{L^{\infty}}<\infty$, which implicitly requires $s > \alpha_{0}$, where $\alpha_{0}\in (0,1)$ is the embedding index, a constant depending on $\mathcal{H}$. Whether spectral algorithms are optimal for all $s\in (0,1)$ has been an open problem for years. In this paper, we show that spectral algorithms are minimax optimal for any $\alpha_{0}-\frac{1}{\beta} < s < 1$, where $\beta$ is the eigenvalue decay rate of $\mathcal{H}$. We also give several classes of RKHSs whose embedding index satisfies $\alpha_{0} = \frac{1}{\beta}$. Thus, spectral algorithms are minimax optimal for all $s\in (0,1)$ on these RKHSs.

Paper Structure (28 sections, 29 theorems, 197 equations, 2 figures, 2 tables)

This paper contains 28 sections, 29 theorems, 197 equations, 2 figures, 2 tables.

Key Result

Theorem 1

Suppose that the eigenvalue decay rate (EDR), embedding, source condition, and error moment assumptions hold for $0 < s \le 2 \tau$ and $\frac{1}{\beta} \le \alpha_{0} < 1$. Let $\hat{f}_{\nu}$ be the spectral algorithm estimator. Then for $0 \le \gamma \le 1$ with $\gamma \le s$:

Figures (2)

  • Figure 1: Error decay curves of two kinds of RKHSs and three kinds of spectral algorithms with the best choice of $c$. Both axes are scaled logarithmically. The curves show the average generalization errors over 50 trials; the regions within one standard deviation are shown in green. The dashed black lines are computed using logarithmic least-squares, and their slopes represent the convergence rates $r$. Figures in the first row correspond to the Sobolev RKHS $\mathcal{H} = H^{1}(\mathcal{X})$ and those in the second row correspond to $\mathcal{H} = \mathcal{H}_{\text{min}}(\mathcal{X})$.
  • Figure 2: Error decay curves of two kinds of RKHSs and three kinds of spectral algorithms with different choices of $c$. Both axes are logarithmic.
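
The logarithmic least-squares fit mentioned in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code: it assumes the error behaves like $C n^{-r}$ in the sample size $n$, so that $r$ is the negative slope of a straight-line fit in log-log coordinates. The function name `fit_rate` and its arguments are hypothetical.

```python
import numpy as np

def fit_rate(ns, errors):
    """Estimate r in error ~ C * n**(-r) by least squares on log-log data.

    ns     : sample sizes (assumed positive)
    errors : averaged generalization errors (assumed positive)
    Returns (r, log_C), where r is the negated fitted slope.
    """
    log_n = np.log(np.asarray(ns, dtype=float))
    log_err = np.log(np.asarray(errors, dtype=float))
    # Degree-1 polynomial fit: log_err ~ slope * log_n + intercept
    slope, intercept = np.polyfit(log_n, log_err, 1)
    return -slope, intercept
```

With noisy errors averaged over repeated trials, the fitted `r` is only an empirical estimate of the decay exponent and depends on the range of sample sizes used.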

Theorems & Definitions (40)

  • Definition 1: Filter function
  • Definition 2: Spectral algorithm
  • Example 1: Kernel ridge regression
  • Example 2: Gradient flow
  • Example 3: Spectral cut-off
  • Theorem 1: Upper bound
  • Theorem 2: Lower bound
  • Remark 3: Optimality
  • Lemma 4
  • Theorem 5: $L^{q}$-embedding property
  • ...and 30 more
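
To make the three examples listed above concrete, here is a minimal sketch of their filter functions and of a spectral algorithm estimator computed from a kernel matrix. The index above gives only the names, so the formulas are the standard textbook forms and may differ from the paper's definitions in parametrization: kernel ridge regression uses $1/(z + \nu^{-1})$, gradient flow uses $(1 - e^{-\nu z})/z$, and spectral cut-off uses $1/z$ when $z \ge \nu^{-1}$ and $0$ otherwise. All function names are hypothetical.

```python
import numpy as np

def krr_filter(z, nu):
    # Kernel ridge regression with regularization 1/nu (assumed parametrization)
    return 1.0 / (np.asarray(z, dtype=float) + 1.0 / nu)

def gradient_flow_filter(z, nu):
    # (1 - exp(-nu * z)) / z, using the limit value nu as z -> 0
    z = np.asarray(z, dtype=float)
    safe = np.where(z > 1e-12, z, 1.0)
    return np.where(z > 1e-12, -np.expm1(-nu * safe) / safe, nu)

def cutoff_filter(z, nu):
    # Invert eigenvalues at or above 1/nu, discard the rest
    z = np.asarray(z, dtype=float)
    safe = np.where(z > 0, z, 1.0)
    return np.where(z >= 1.0 / nu, 1.0 / safe, 0.0)

def spectral_fit(K, y, filter_fn, nu):
    """Coefficients c such that f_hat(x) = sum_i c[i] * k(x, x_i).

    K is the n-by-n kernel matrix on the training inputs and y the responses.
    The filter is applied to the eigenvalues of the normalized matrix K / n.
    """
    n = K.shape[0]
    evals, evecs = np.linalg.eigh(K / n)
    evals = np.clip(evals, 0.0, None)  # guard against tiny negative round-off
    return evecs @ (filter_fn(evals, nu) * (evecs.T @ y)) / n
```

Predictions at new points are obtained as `K_test_train @ coef`, where `K_test_train` holds kernel evaluations between test and training inputs. With `krr_filter`, the coefficients coincide with the usual closed form $(K + \frac{n}{\nu} I)^{-1} y$.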