
Online learning of smooth functions on $\mathbb{R}$

Jesse Geneson, Kuldeep Singh, Alexander Wang

Abstract

We study adversarial online learning of real-valued functions on $\mathbb{R}$. In each round the learner is queried at $x_t\in\mathbb{R}$, predicts $\hat y_t$, and then observes the true value $f(x_t)$; performance is measured by cumulative $p$-loss $\sum_{t\ge 1}|\hat y_t-f(x_t)|^p$. For the class \[ \mathcal{G}_q=\Bigl\{f:\mathbb{R}\to\mathbb{R}\ \text{absolutely continuous}:\ \int_{\mathbb{R}}|f'(x)|^q\,dx\le 1\Bigr\}, \] we show that the standard model becomes ill-posed on $\mathbb{R}$: for every $p\ge 1$ and $q>1$, an adversary can force infinite loss. Motivated by this obstruction, we analyze three modified learning scenarios that limit the influence of queries that are far from previously observed inputs. In Scenario 1 the adversary must choose each new query within distance $1$ of some past query. In Scenario 2 the adversary may query anywhere, but the learner is penalized only on rounds whose query lies within distance $1$ of a past query. In Scenario 3 the loss in round $t$ is multiplied by a weight $g(\min_{j<t}|x_t-x_j|)$. We obtain sharp characterizations for Scenarios 1-2 in several regimes. For Scenario 3 we identify a clean threshold phenomenon: if $g$ decays too slowly, then the adversary can force infinite weighted loss. In contrast, for rapidly decaying weights such as $g(z)=e^{-cz}$ we obtain finite and sharp guarantees in the quadratic case $p=q=2$. Finally, we study a natural multivariable slice generalization $\mathcal{G}_{q,d}$ of $\mathcal{G}_q$ on $\mathbb{R}^d$ and show a sharp dichotomy: while the one-dimensional case admits finite optimal values in certain regimes, for every $d\ge 2$ the slice class $\mathcal{G}_{q,d}$ is too permissive, and even under Scenarios 1-3 an adversary can force infinite loss.
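
To make the protocol concrete, here is a minimal simulation sketch (our own illustration, not the paper's algorithm): a hypothetical nearest-neighbor learner predicts the value observed at the closest past query, and the round losses are accumulated according to the scenario in force. The function names and the example query sequence are assumptions for illustration only.

    import math

    def simulate(queries, f, p=2.0, scenario=3, g=lambda z: math.exp(-z)):
        # seen holds the past (query, value) pairs revealed to the learner.
        seen = []
        total = 0.0
        for x in queries:
            if seen:
                # Nearest-neighbor prediction: copy the value at the closest past query.
                dist, y_hat = min((abs(x - xj), yj) for xj, yj in seen)
            else:
                dist, y_hat = math.inf, 0.0  # no information before the first round
            loss = abs(y_hat - f(x)) ** p    # p-loss for this round
            if scenario == 2 and dist > 1:
                loss = 0.0                   # only queries near a past query are penalized
            elif scenario == 3:
                loss *= g(dist) if seen else 0.0  # weight by g(min_{j<t} |x_t - x_j|)
            total += loss
            seen.append((x, f(x)))           # the true value is revealed after predicting
        return total

    # Example target: a unit ramp, which lies in G_q for every q since int |f'|^q = 1.
    f = lambda x: min(max(x, 0.0), 1.0)
    print(simulate([0.0, 0.5, 0.25, 2.0], f, p=2, scenario=3))

Scenario 1 restricts the adversary rather than the loss, so it corresponds to feeding simulate a query sequence in which every new point lies within distance $1$ of an earlier one.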


Paper Structure

This paper contains 14 sections, 40 theorems, and 150 equations.

Key Result

Proposition 2.1

Let $1\le q<r<\infty$. On $\mathbb{R}$, the classes $\mathcal{G}_q$ and $\mathcal{G}_r$ are not nested. More precisely, both of the following hold: $\mathcal{G}_q\not\subseteq\mathcal{G}_r$ and $\mathcal{G}_r\not\subseteq\mathcal{G}_q$.
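
A standard witness family for both non-inclusions (a sketch of our own, not necessarily the construction used in the paper) consists of linear ramps $f_{c,L}$ with slope $c>0$ on an interval of length $L$:
\[
f_{c,L}(x)=\begin{cases}0, & x\le 0,\\ cx, & 0<x<L,\\ cL, & x\ge L,\end{cases}
\qquad\text{so that}\qquad
\int_{\mathbb{R}}|f_{c,L}'(x)|^{s}\,dx=c^{s}L\ \ \text{for every } s\ge 1.
\]
Choosing $c>1$ and $L=c^{-q}$ gives $c^qL=1$ but $c^rL=c^{r-q}>1$, so $f_{c,L}\in\mathcal{G}_q\setminus\mathcal{G}_r$; choosing $c<1$ and $L=c^{-r}$ gives $c^rL=1$ but $c^qL=c^{q-r}>1$, so $f_{c,L}\in\mathcal{G}_r\setminus\mathcal{G}_q$.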

Theorems & Definitions (81)

  • Proposition 2.1
  • Theorem 2.3
  • Lemma 3.1
  • Lemma 3.2
  • Corollary 3.3
  • ...and 71 more