Pure Exploration with Infinite Answers

Riccardo Poiani, Martino Bernasconi, Andrea Celli

TL;DR

This work extends pure exploration to settings where the set of correct answers can be infinite, introducing an instance-dependent lower bound and showing that existing finite-answer methods may be suboptimal. It identifies the regularity conditions under which the problem is learnable and presents a unifying framework, Sticky-Sequence Track-and-Stop, that achieves asymptotically optimal sample complexity by following converging sequences of candidate answers. The approach generalizes both Track-and-Stop and Sticky Track-and-Stop and provides a taxonomy of topological regimes with tailored convergence strategies, enabling regression and other continuous-answer tasks to attain optimal rates. The results extend asymptotically optimal pure exploration to regression, continuous-function estimation, and complex identification tasks in bandit settings.

Abstract

We study pure exploration problems where the set of correct answers is possibly infinite, e.g., the regression of any continuous function of the means of the bandit. We derive an instance-dependent lower bound for these problems. By analyzing it, we discuss why existing methods (i.e., Sticky Track-and-Stop) for finite answer problems fail to be asymptotically optimal in this more general setting. Finally, we present a framework, Sticky-Sequence Track-and-Stop, which generalizes both Track-and-Stop and Sticky Track-and-Stop and enjoys asymptotic optimality. Due to its generality, our analysis also highlights special cases where existing methods enjoy optimality.

Paper Structure

This paper contains 64 sections, 56 theorems, 146 equations, 1 figure, 1 algorithm.

Key Result

Theorem 1

Suppose that the correspondence $\bm\mu \rightrightarrows \mathcal{X}^{\star}(\bm\mu)$ is continuous, and that $\mathcal{M}$ and $\mathcal{X}$ are compact sets. Then Assumption ass:conv holds.

Figures (1)

  • Figure 1: Even though the sets $\mathcal{X}_t$ (in blue) are progressively shrinking toward $\mathcal{X}^\star(\bm\mu)$ (in yellow), the answers selected $x_t$ could oscillate between one of the two correct answers marked by the red crosses.
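
The oscillation described in the Figure 1 caption can be reproduced in a toy setting. The sketch below is only an illustration under stated assumptions and is not the paper's algorithm. It assumes two correct answers at -1 and +1, and candidate sets $\mathcal{X}_t$ made of two intervals of radius $1/t$ around them. The names `shrinking_set`, `oscillating_rule`, and `nearest_to_previous_rule` are hypothetical. The second rule is a simple contrast that stays close to the previous answer. It is not the convergent selection rule defined in the paper.

```python
def shrinking_set(t):
    """Toy X_t (assumption): two intervals of radius 1/t around -1 and +1."""
    r = 1.0 / t
    return [(-1.0 - r, -1.0 + r), (1.0 - r, 1.0 + r)]


def oscillating_rule(t, prev):
    """Alternate between the two intervals: a valid pick in X_t, yet x_t never settles."""
    left, right = shrinking_set(t)
    return left[1] if t % 2 else right[0]


def nearest_to_previous_rule(t, prev):
    """Hypothetical contrast: pick the point of X_t closest to the previous answer."""
    # Projecting prev onto each interval gives that interval's closest point.
    candidates = [min(max(prev, lo), hi) for lo, hi in shrinking_set(t)]
    return min(candidates, key=lambda x: abs(x - prev))


def run(rule, steps=200, start=0.5):
    """Apply a selection rule for `steps` rounds and return the selected answers."""
    xs, prev = [], start
    for t in range(1, steps + 1):
        prev = rule(t, prev)
        xs.append(prev)
    return xs


def dist_to_correct(x):
    """Distance from x to the toy set of correct answers {-1, +1}."""
    return min(abs(x + 1.0), abs(x - 1.0))


if __name__ == "__main__":
    for rule in (oscillating_rule, nearest_to_previous_rule):
        xs = run(rule)
        print(rule.__name__,
              "distance to correct set:", dist_to_correct(xs[-1]),
              "last jump:", abs(xs[-1] - xs[-2]))
```

In this toy run both rules end up close to a correct answer, but only the second produces a sequence whose consecutive answers get closer together. The first keeps jumping between the two correct answers, which is the behavior the caption describes.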

Theorems & Definitions (105)

  • Theorem 1: Continuous Correspondence Implies Assumption ass:conv
  • Theorem 2: Lower Bound
  • Lemma 1: Continuity
  • Definition 1: Convergent selection rule
  • Theorem 3
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • ...and 95 more