Minimax-optimal and Locally-adaptive Online Nonparametric Regression

Paul Liautaud, Pierre Gaillard, Olivier Wintenberger

TL;DR

This work develops a parameter-free online nonparametric regression method based on chaining trees (CTs) that achieves minimax regret against Hölder-continuous competitors and adapts locally to unknown regularity through adaptive pruning. A second, locally adaptive algorithm extends these ideas by combining core trees with multiple local CTs to exploit local Hölder constants and loss curvature, including exp-concave losses. The authors prove minimax-optimal bounds and local adaptivity results, analyze computational complexity, and discuss extensions toward boosting and unknown input spaces, supported by experiments. Together, the methods yield the first computationally efficient online regression algorithms that are both minimax-optimal and locally adaptive in adversarial environments, with implications for robust nonparametric regression and boosting-inspired developments.

Abstract

We study adversarial online nonparametric regression with general convex losses and propose a parameter-free learning algorithm that achieves minimax optimal rates. Our approach leverages chaining trees to compete against Hölder functions and establishes optimal regret bounds. While competing with nonparametric function classes can be challenging, they often exhibit local patterns (such as local Hölder continuity) that online algorithms can exploit. Without prior knowledge, our method dynamically tracks and adapts to different Hölder profiles by pruning a core chaining tree structure, aligning itself with local smoothness variations. This leads to the first computationally efficient algorithm with locally adaptive optimal rates for online regression in an adversarial setting. Finally, we discuss how these notions could be extended to a boosting framework, offering promising directions for future research.
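
For readers less familiar with the setting, the two notions the abstract relies on can be written out as follows. This is the standard textbook form, given here as a sketch: the choice of norm $\|\cdot\|$ and the symbols $\ell_t$, $\hat{f}_t$, $x_t$ and $\mathrm{Reg}_T$ are assumed notation and may differ from the paper's. A function $f$ is $(L, \alpha)$-Hölder on $\mathcal{X}$, with $L > 0$ and $\alpha \in (0,1]$, if

$$
|f(x) - f(y)| \leqslant L \, \|x - y\|^{\alpha} \quad \text{for all } x, y \in \mathcal{X},
$$

and the regret of the learner's predictions $\hat{f}_t(x_t)$ against a fixed competitor $f$ after $T$ rounds is

$$
\mathrm{Reg}_T(f) = \sum_{t=1}^{T} \ell_t\big(\hat{f}_t(x_t)\big) - \sum_{t=1}^{T} \ell_t\big(f(x_t)\big).
$$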

Paper Structure

This paper contains 37 sections, 5 theorems, 87 equations, 8 figures, and 1 table.

Key Result

Theorem 1

Let $T \geqslant 1$ and let $(\mathcal{T}, \bar{\mathcal{X}}, \bar{\mathcal{W}}_1)$ be a CT with $\mathcal{X}_{\mathrm{root}(\mathcal{T})} = \mathcal{X}$, $\theta_{n,1} = 0$ for all $n \in \mathcal{N}(\mathcal{T})$, and $\mathrm{d}(\mathcal{T}) = \frac{1}{d} \log_2 T$. Then, Algorithm alg:training_CT applied for any $L >0$ and $\alpha \in (0,1]$, where $\Phi(u) = |2^{u} - 1|^{-1}$.
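
As a small numerical illustration of the two explicit quantities in Theorem 1, the Python sketch below evaluates the prescribed depth $\mathrm{d}(\mathcal{T}) = \frac{1}{d} \log_2 T$ and the function $\Phi(u) = |2^{u} - 1|^{-1}$. The function names `ct_depth` and `phi` are hypothetical and not taken from the paper, and the explicit guard at $u = 0$ (where $\Phi$ is undefined) is an added assumption.

```python
import math


def ct_depth(T: int, d: int) -> float:
    """Depth prescribed in Theorem 1: (1/d) * log2(T).

    T is the horizon and d the input dimension. The result is returned
    as a float; how to round it to an integer depth is left open here.
    """
    if T < 1 or d < 1:
        raise ValueError("T and d must be at least 1")
    return math.log2(T) / d


def phi(u: float) -> float:
    """Phi(u) = |2^u - 1|^{-1}, as defined in Theorem 1."""
    if u == 0:
        # Assumption: 2^0 - 1 = 0, so Phi is undefined at u = 0.
        raise ZeroDivisionError("Phi(u) is undefined at u = 0")
    return 1.0 / abs(2.0 ** u - 1.0)


if __name__ == "__main__":
    print(ct_depth(1024, 2))
    print(phi(1.0), phi(-1.0))
```

Note that $\Phi$ grows without bound as $u$ approaches $0$, so any term of a bound that involves $\Phi$ needs care near that point.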

Figures (8)

  • Figure 1: Example of a CT over $\mathcal{X} \subset \mathbb{R}$.
  • Figure 2: Training CT $\mathcal{T}$ at time $t \geqslant 1$
  • Figure 3: Locally Adaptive Online Regression
  • Figure 4: Example of a core tree $\mathcal{T}_0$ with depth $\mathrm{d}(\mathcal{T}_0)=3$, $d=1$, in Fig. \ref{fig:maintree}. We give two pruned tree instances: $\mathcal{T}_1$ for a given Lipschitz function $f_1$ in Fig. \ref{fig:prun1}, and $\mathcal{T}_2$ for a second profile $f_2$ in Fig. \ref{fig:prun2}. In Fig. \ref{fig:maintree} all nodes $\mathcal{N}(\mathcal{T}_0)$ are awake and predictive, while $\mathcal{T}_1$ in Fig. \ref{fig:prun1} (resp. $\mathcal{T}_2$ in Fig. \ref{fig:prun2}) predicts with $\hat{f}_{2,k_2}, \hat{f}_{3,k_3}$ sitting in its leaves $\mathcal{L}(\mathcal{T}_1)$ (resp. with $\hat{f}_{2,k_2}, \hat{f}_{6,k_6}, \hat{f}_{7,k_7}$ sitting in its leaves $\mathcal{L}(\mathcal{T}_2)$). ✗ represents a pruned node.
  • Figure 5: Boosting at time $t$.
  • ...and 3 more figures
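
The pruning described in the Figure 4 caption can be mimicked with a short Python sketch. It assumes, as a reading of the node indices in the caption and not as a statement from the paper, that for $d=1$ the core tree is a complete binary tree with heap-style numbering (root 1, children $2n$ and $2n+1$) and that depth 3 means three levels. The function name `leaves_after_pruning` is hypothetical.

```python
def leaves_after_pruning(levels: int, pruned: set[int]) -> list[int]:
    """Return the leaves left after pruning a complete binary tree.

    Nodes carry heap indices 1 .. 2**levels - 1 (an assumed convention).
    A pruned node is removed together with all of its descendants.
    """
    n_nodes = 2 ** levels - 1
    alive: set[int] = set()
    # Visit nodes in increasing index order so parents come first.
    for n in range(1, n_nodes + 1):
        if n in pruned:
            continue
        if n == 1 or (n // 2) in alive:
            alive.add(n)
    # A surviving node is a leaf when neither child survived.
    return sorted(
        n for n in alive if 2 * n not in alive and 2 * n + 1 not in alive
    )


if __name__ == "__main__":
    print(leaves_after_pruning(3, set()))         # unpruned core tree
    print(leaves_after_pruning(3, {4, 5, 6, 7}))  # leaves as in T_1
    print(leaves_after_pruning(3, {4, 5}))        # leaves as in T_2
```

Under these assumptions, pruning nodes 4 to 7 leaves the predictors at nodes 2 and 3, and pruning only nodes 4 and 5 leaves those at nodes 2, 6 and 7, matching the leaf sets named in the caption. The sketch does not check that sibling nodes are pruned together, which a pruning that keeps the leaves a partition of the input space would presumably require.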

Theorems & Definitions (7)

  • Definition 1: Chaining-Tree
  • Theorem 1
  • Definition 2: Pruning
  • Theorem 2
  • Corollary 1
  • Theorem 3
  • Corollary 2