Worst-case Error Bounds for Online Learning of Smooth Functions

Weian Xie

TL;DR

This work resolves the long-standing question of when the worst-case online-learning error for smooth functions on $[0,1]$ is finite by showing $\operatorname{opt}_p(\mathcal{F}_q)<\infty$ iff $p>1$ and $q>1$, and derives the upper bound $\operatorname{opt}_{1+\delta}(\mathcal{F}_{1+\varepsilon})=O(\min(\delta,\varepsilon)^{-1})$ for $\delta,\varepsilon\in(0,1)$, unifying prior results. It further shows that restricting to polynomials does not ease learning, i.e., $\operatorname{opt}_p(\mathcal{P}_q)=\operatorname{opt}_p(\mathcal{F}_q)$, via Weierstrass approximation and weighted interpolation, linking general smoothness to polynomial approximants. The paper extends the model to a noisy feedback setting, proving finiteness iff $p,q>1$ and establishing $\operatorname{opt}^{\text{nf}}_{p,\eta}(\mathcal{F}_q)=\Theta(\eta)$ for $p,q\ge2$, with a tight $2\eta+1$ discarding-round requirement. Together, these results complete the characterization of when the worst-case error for online learning of smooth functions is finite and illuminate the role of noise in adversarial feedback, while suggesting avenues for extending the framework to broader function families via approximation theory.

Abstract

Online learning is a model of machine learning where the learner is trained on sequential feedback. We investigate worst-case error for the online learning of real functions that have certain smoothness constraints. Suppose that $\mathcal{F}_q$ is the class of all absolutely continuous functions $f: [0, 1] \rightarrow \mathbb{R}$ such that $\|f'\|_q \le 1$, and $\operatorname{opt}_p(\mathcal{F}_q)$ is the best possible upper bound on the sum of the $p^{\text{th}}$ powers of absolute prediction errors for any number of trials guaranteed by any learner. We show that for any $\delta, \varepsilon \in (0, 1)$, $\operatorname{opt}_{1+\delta} (\mathcal{F}_{1+\varepsilon}) = O(\min(\delta, \varepsilon)^{-1})$. Combined with the previous results of Kimber and Long (1995) and Geneson and Zhou (2023), we achieve a complete characterization of the values of $p, q \ge 1$ that result in $\operatorname{opt}_p(\mathcal{F}_q)$ being finite, a problem open for nearly 30 years. We study the learning scenarios of smooth functions that also belong to certain special families of functions, such as polynomials. We prove a conjecture by Geneson and Zhou (2023) that it is not any easier to learn a polynomial in $\mathcal{F}_q$ than it is to learn any general function in $\mathcal{F}_q$. We also define a noisy model for the online learning of smooth functions, where the learner may receive incorrect feedback up to $\eta \ge 1$ times, denoting the worst-case error bound as $\operatorname{opt}^{\text{nf}}_{p, \eta} (\mathcal{F}_q)$. We prove that $\operatorname{opt}^{\text{nf}}_{p, \eta} (\mathcal{F}_q)$ is finite if and only if $\operatorname{opt}_p(\mathcal{F}_q)$ is. Moreover, we prove for all $p, q \ge 2$ and $\eta \ge 1$ that $\operatorname{opt}^{\text{nf}}_{p, \eta} (\mathcal{F}_q) = \Theta(\eta)$.
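
To make the quantity being bounded concrete, the following is a minimal sketch of one run of the trial-by-trial protocol: a query point is revealed, the learner predicts, the true value is disclosed, and the $p^{\text{th}}$ power of the absolute error is accumulated. The names `interpolate` and `total_error` are hypothetical, the interpolation learner is only an illustrative choice and not necessarily the learner analyzed in the paper, and treating the first trial as uncharged is an assumption of this sketch.

```python
from bisect import bisect_left


def interpolate(known, x):
    """Predict f(x) from sorted (x, y) pairs.

    Linear interpolation between the two neighbouring known points;
    outside the known range, reuse the nearest known value.
    """
    xs = [u for u, _ in known]
    i = bisect_left(xs, x)
    if i < len(xs) and xs[i] == x:
        return known[i][1]          # point already seen
    if i == 0:
        return known[0][1]          # left of every known point
    if i == len(xs):
        return known[-1][1]         # right of every known point
    (x0, y0), (x1, y1) = known[i - 1], known[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def total_error(f, queries, p):
    """Sum of p-th powers of absolute prediction errors over one run.

    Assumption of this sketch: the first trial is not charged, because
    the learner has no information before any feedback arrives.
    """
    known = []
    total = 0.0
    for x in queries:
        if known:
            total += abs(interpolate(known, x) - f(x)) ** p
        if all(u != x for u, _ in known):
            known.append((x, f(x)))  # feedback: the true value f(x)
            known.sort()
    return total
```

For example, with the hidden function `f(x) = x` (which satisfies $\|f'\|_q = 1$) and the queries `[0.0, 1.0, 0.5]`, the only charged mistake is the prediction at `1.0`. The worst-case quantity $\operatorname{opt}_p(\mathcal{F}_q)$ takes the supremum of such totals over all functions in $\mathcal{F}_q$ and all query sequences, for the best possible learner; this sketch evaluates a single run only.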

Paper Structure

This paper contains 11 sections, 33 theorems, 69 equations.

Key Result

Lemma 1.1

Let $S = \{(u_1,v_1),\ldots,(u_m,v_m) \}$ be a set of $m$ points with $(u_i, v_i) \in [0, 1] \times \mathbb{R}$ for each $i$, such that $u_1 < \cdots < u_m$. Then, for any $q \ge 1$ and any absolutely continuous function $f: [0,1] \to \mathbb R$ with $f(u_i)=v_i$ for $1 \le i \le m$, we have $J_q[f]$ …
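
As a small numerical aid for the quantity appearing in this lemma, the sketch below evaluates $J_q$ for the piecewise-linear function through the points of $S$. It assumes that $J_q[f]$ denotes $\int_0^1 |f'(x)|^q \, dx$ and that the piecewise-linear function is constant outside $[u_1, u_m]$; both are assumptions of this sketch rather than statements taken from the text above, and the function name `action_of_interpolant` is hypothetical.

```python
def action_of_interpolant(points, q):
    """Evaluate the integral of |g'|^q for the piecewise-linear g through `points`.

    Assumptions of this sketch: J_q[g] is the integral of |g'(x)|^q over
    [0, 1], and g is constant to the left of the first point and to the
    right of the last one, so those regions contribute nothing.
    On a segment from (u0, v0) to (u1, v1) the slope is (v1 - v0)/(u1 - u0),
    so the segment contributes |v1 - v0|^q / (u1 - u0)^(q - 1).
    """
    pts = sorted(points)
    total = 0.0
    for (u0, v0), (u1, v1) in zip(pts, pts[1:]):
        total += abs(v1 - v0) ** q / (u1 - u0) ** (q - 1)
    return total
```

For instance, the two points $(0, 0)$ and $(0.5, 1)$ give a single segment of slope $2$ on an interval of length $0.5$, so the value for $q = 2$ is $2^2 \cdot 0.5 = 2$.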

Theorems & Definitions (52)

  • Lemma 1.1 (Kimber and Long, 1995; Geneson and Zhou, 2023)
  • Theorem 1.2
  • Theorem 1.3
  • Theorem 1.4
  • Theorem 1.5
  • Theorem 1.6
  • Theorem 1.7
  • Lemma 2.1
  • proof
  • Lemma 2.2
  • ...and 42 more