Minimax-optimal and Locally-adaptive Online Nonparametric Regression

Paul Liautaud; Pierre Gaillard; Olivier Wintenberger

TL;DR

This work develops a parameter-free online nonparametric regression method based on chaining trees (CTs) that achieves minimax regret against Hölder-continuous competitors, while also enabling local adaptivity to unknown regularities via adaptive pruning. A second, locally adaptive algorithm extends these ideas, combining a core tree with multiple local CTs to exploit local Hölder constants and loss curvature, including exp-concave losses. The authors prove minimax-optimal bounds and local adaptivity results, analyze computational complexity, and discuss extensions toward boosting and unknown input spaces, supported by experiments. Collectively, the methods yield the first computationally efficient online regression algorithms that are both minimax-optimal and locally adaptive in adversarial environments, with practical implications for robust nonparametric regression and potential boosting-inspired developments.

Abstract

We study adversarial online nonparametric regression with general convex losses and propose a parameter-free learning algorithm that achieves minimax optimal rates. Our approach leverages chaining trees to compete against Hölder functions and establishes optimal regret bounds. While competing with nonparametric function classes can be challenging, they often exhibit local patterns, such as local Hölder continuity, that online algorithms can exploit. Without prior knowledge, our method dynamically tracks and adapts to different Hölder profiles by pruning a core chaining tree structure, aligning itself with local smoothness variations. This leads to the first computationally efficient algorithm with locally adaptive optimal rates for online regression in an adversarial setting. Finally, we discuss how these notions could be extended to a boosting framework, offering promising directions for future research.
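To make the chaining-tree idea concrete, here is a minimal sketch under strong simplifying assumptions; it is not the authors' algorithm. It assumes a one-dimensional input space $[0, 1)$, a dyadic partition, one scalar parameter per node, the squared loss, and plain online gradient descent with a fixed step size in place of the parameter-free subroutine the abstract refers to. The class name `ChainingTreeSketch` and the parameter `step` are hypothetical.

```python
class ChainingTreeSketch:
    """Dyadic tree over [0, 1): node (level, k) covers [k / 2**level, (k + 1) / 2**level)."""

    def __init__(self, depth, step=0.1):
        self.depth = depth
        self.step = step  # hypothetical fixed step size, not taken from the paper
        # One scalar parameter per node, all initialised at 0.
        self.theta = [[0.0] * (2 ** level) for level in range(depth + 1)]

    def path(self, x):
        """Nodes whose cell contains x, listed from the root down to the leaf."""
        x = min(max(x, 0.0), 1.0 - 1e-12)  # clamp into [0, 1)
        return [(level, int(x * 2 ** level)) for level in range(self.depth + 1)]

    def predict(self, x):
        """Prediction = sum of the parameters along the root-to-leaf path of x."""
        return sum(self.theta[level][k] for level, k in self.path(x))

    def update(self, x, y):
        """One squared-loss gradient step on every node along the path of x."""
        grad = 2.0 * (self.predict(x) - y)
        for level, k in self.path(x):
            self.theta[level][k] -= self.step * grad


tree = ChainingTreeSketch(depth=3)
for _ in range(20):
    tree.update(0.3, 1.0)
```

In this sketch, coarse nodes near the root are shared by many inputs while deeper nodes refine the prediction locally, so an update at `x = 0.3` also shifts the prediction at distant points through the root parameter.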

Paper Structure (37 sections, 5 theorems, 87 equations, 8 figures, 1 table)

Introduction
Related work
Online nonparametric regression
Regret against $\alpha$-Hölder competitors and local adaptivity
Contributions and outline of the paper
Minimax regret with chaining trees: a parameter-free online approach
Setting and notations.
Chaining tree
First algorithm: the online training of a chaining-tree
Computation of the gradients.
Online gradient optimization subroutine.
First result.
Minimax optimality and adaptivity to $L$ and $\alpha$.
Comparison to standard adaptive OCO methods in $\mathbb{R}^{|\mathcal{N}(\mathcal{T})|}$.
Complexity.
...and 22 more sections

Key Result

Theorem 1

Let $T \geqslant 1$ and let $(\mathcal{T}, \bar{\mathcal{X}}, \bar{\mathcal{W}}_1)$ be a CT with $\mathcal{X}_{\mathrm{root}(\mathcal{T})} = \mathcal{X}$, $\theta_{n,1} = 0$ for all $n \in \mathcal{N}(\mathcal{T})$ and $\mathrm{d}(\mathcal{T}) = \frac{1}{d} \log_2 T$. Then Algorithm `alg:training_CT` (the online training of a chaining tree) achieves a regret bound against $\alpha$-Hölder competitors that holds for any $L > 0$ and $\alpha \in (0,1]$; the bound involves $\Phi(u) = |2^{u} - 1|^{-1}$.
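The two explicit quantities in the statement above, the tree depth $\mathrm{d}(\mathcal{T}) = \frac{1}{d} \log_2 T$ and the function $\Phi(u) = |2^{u} - 1|^{-1}$, can be evaluated with a short sketch. The function names `ct_depth` and `phi` are hypothetical, and any rounding of the depth to an integer is left to the caller, since the statement does not specify it.

```python
import math


def ct_depth(T, d):
    """Depth (1/d) * log2(T) from the theorem's hypothesis, returned as a real number."""
    return math.log2(T) / d


def phi(u):
    """Phi(u) = |2**u - 1|**(-1); the expression is undefined at u = 0."""
    if u == 0:
        raise ValueError("Phi is undefined at u = 0")
    return 1.0 / abs(2.0 ** u - 1.0)


depth_example = ct_depth(1024, 2)  # horizon T = 1024, input dimension d = 2
phi_examples = (phi(1), phi(-1))
```

Note that $\Phi(u)$ grows without bound as $u \to 0$, which is why the sketch rejects `u = 0` explicitly.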

Figures (8)

Figure 1: Example of a CT over $\mathcal{X} \subset \mathbb{R}$.
Figure 2: Training CT $\mathcal{T}$ at time $t \geqslant 1$
Figure 3: Locally Adaptive Online Regression
Figure 4: Example of a core tree $\mathcal{T}_0$ with depth $\mathrm{d}(\mathcal{T}_0)=3$, $d=1$, in Fig. \ref{fig:maintree}. We give two pruned tree instances: $\mathcal{T}_1$ for a given Lipschitz function $f_1$ in Fig. \ref{fig:prun1}, and $\mathcal{T}_2$ for a second profile $f_2$ in Fig. \ref{fig:prun2}. In Fig. \ref{fig:maintree}, all nodes $\mathcal{N}(\mathcal{T}_0)$ are awake and predictive, while $\mathcal{T}_1$ in Fig. \ref{fig:prun1} (resp. $\mathcal{T}_2$ in Fig. \ref{fig:prun2}) predicts with $\hat{f}_{2,k_2}, \hat{f}_{3,k_3}$ sitting in its leaves $\mathcal{L}(\mathcal{T}_1)$ (resp. with $\hat{f}_{2,k_2}, \hat{f}_{6,k_6}, \hat{f}_{7,k_7}$ sitting in its leaves $\mathcal{L}(\mathcal{T}_2)$). ✗ represents a pruned node.
Figure 5: Boosting at time $t$.
...and 3 more figures

Theorems & Definitions (7)

Definition 1: Chaining-Tree
Theorem 1
Definition 2: Pruning
Theorem 2
Corollary 1
Theorem 3
Corollary 2
