Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability

Fan Chen, Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin, Yunbei Xu

TL;DR

This work develops a unifying framework for information-theoretic lower bounds in Interactive Statistical Decision Making (ISDM), bridging classical estimation techniques (Fano, Le Cam, Assouad) with interactive methods based on the Decision-Estimation Coefficient (DEC). It introduces the interactive Fano method, which uses ghost data drawn from a reference distribution to produce quantile-based lower bounds, and which recovers both the classical bounds and the DEC-based bounds within a single framework. A new complexity measure, the fractional covering number, captures the difficulty of estimation. It yields a unified characterization of learnability for stochastic bandit problems and, for convex model classes, closes the gap between the DEC-based upper and lower bounds up to polynomial factors; it also admits dualities with packing concepts. The results connect to Yang–Barron and local packing bounds and extend to structured and contextual bandits, giving an estimation-aware view of interactive decision making that is relevant to bandit and reinforcement learning theory.

Abstract

We develop a unifying framework for information-theoretic lower bounds in statistical estimation and interactive decision making. Classical lower bound techniques -- such as Fano's method, Le Cam's method, and Assouad's lemma -- are central to the study of minimax risk in statistical estimation, yet are insufficient to provide tight lower bounds for \emph{interactive decision making} algorithms that collect data interactively (e.g., algorithms for bandits and reinforcement learning). Recent work of Foster et al. (2021, 2023) provides minimax lower bounds for interactive decision making using analysis techniques that appear different from the classical methods. These results -- which are proven using a complexity measure known as the \emph{Decision-Estimation Coefficient} (DEC) -- capture difficulties unique to interactive learning, yet do not recover the tightest known lower bounds for passive estimation. We propose a unified view of these distinct methodologies through a new lower bound approach called the \emph{interactive Fano method}. As an application, we introduce a novel complexity measure, the \emph{Fractional Covering Number}, which yields new lower bounds for interactive decision making that extend the DEC methodology by incorporating the complexity of estimation. Using the fractional covering number, we (i) provide a unified characterization of learnability for \emph{any} stochastic bandit problem, and (ii) close the remaining gap between the upper and lower bounds in Foster et al. (2021, 2023) (up to polynomial factors) for any interactive decision making problem in which the underlying model class is convex.

Paper Structure

This paper contains 88 sections, 44 theorems, 265 equations, 2 algorithms.

Key Result

Proposition 1

Consider the statistical estimation setting with parameter space $\Theta$. Suppose that there exist $\theta_1,\dots,\theta_m\in\Theta$ such that the following separation condition holds: Let $\mu$ be the uniform distribution over $\{\theta_1,\dots,\theta_m\}$, and let $I_{\mu}(\theta;Y)$ denote the mutual information of $(\theta,Y)\sim \mathbb{P}_\mu$ generated by $

Theorems & Definitions (74)

  • Example 1: Mean estimation
  • Example 2: Functional estimation
  • Example 3: Density estimation
  • Example 4: Reward maximization
  • Proposition 1: Classical Fano method
  • Theorem 2: Interactive Fano method
  • Proposition 3: Recovering the generalized Fano method
  • Proof of Proposition 3
  • Proposition 4: Recovering Le Cam's convex hull method
  • Proof of Proposition 4
  • ...and 64 more
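
For readers who want a concrete feel for the classical Fano method named in Proposition 1, the sketch below evaluates the standard textbook form of the bound, $\mathbb{P}(\text{error}) \ge 1 - \frac{I_\mu(\theta;Y) + \log 2}{\log m}$, for a Gaussian mean estimation problem. It is not taken from the paper and does not reproduce the paper's exact statement. The function names, the Gaussian model, and the use of the average pairwise KL divergence as an upper bound on the mutual information are all assumptions made for illustration.

```python
import math
from itertools import product


def gaussian_kl(theta_i, theta_j, sigma, n):
    """KL divergence between n i.i.d. draws from N(theta_i, sigma^2 I)
    and n i.i.d. draws from N(theta_j, sigma^2 I)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(theta_i, theta_j))
    return n * sq_dist / (2.0 * sigma ** 2)


def fano_error_lower_bound(thetas, sigma, n):
    """Textbook Fano lower bound on the probability of misidentifying
    a hypothesis drawn uniformly from `thetas` (hypothetical helper).

    The mutual information under the uniform prior is bounded from above
    by the average KL divergence over all ordered pairs, a standard
    consequence of the convexity of KL divergence.
    """
    m = len(thetas)
    if m < 2:
        raise ValueError("need at least two hypotheses")
    avg_kl = sum(
        gaussian_kl(a, b, sigma, n) for a, b in product(thetas, repeat=2)
    ) / (m * m)
    bound = 1.0 - (avg_kl + math.log(2)) / math.log(m)
    # A negative value carries no information, so clip at zero.
    return max(0.0, bound)


if __name__ == "__main__":
    # Four closely spaced one-dimensional means, unit noise.
    hypotheses = [(0.0,), (0.1,), (0.2,), (0.3,)]
    for n in (1, 10, 100):
        print(n, fano_error_lower_bound(hypotheses, sigma=1.0, n=n))
```

In this passive setting the bound weakens as the sample size `n` grows, since more data increases the information about the hypothesis. The interactive Fano method described in the abstract addresses the case where the learner chooses how data is collected, which this sketch does not model.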