Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability

Fan Chen; Dylan J. Foster; Yanjun Han; Jian Qian; Alexander Rakhlin; Yunbei Xu

TL;DR

This work develops a unifying framework for information-theoretic lower bounds in Interactive Statistical Decision Making (ISDM), bridging classical estimation techniques (Fano, Le Cam, Assouad) with interactive methods based on the Decision-Estimation Coefficient (DEC). It introduces the interactive Fano method, which employs ghost data through a reference distribution to produce quantile-based lower bounds and to recover traditional bounds and DEC-based bounds in a single framework. A new complexity measure, the fractional covering number, is defined to capture estimation difficulty and to provide a complete bandit learnability characterization for convex model classes, enabling tightened bounds and dualities with packing concepts. The results yield new upper and lower bounds for bandit learnability, connect to Yang–Barron and local packing bounds, and extend to structured and contextual bandits, offering a unified, estimation-aware perspective on interactive decision making with practical implications for bandit and reinforcement learning theory.

Abstract

We develop a unifying framework for information-theoretic lower bounds in statistical estimation and interactive decision making. Classical lower bound techniques -- such as Fano's method, Le Cam's method, and Assouad's lemma -- are central to the study of minimax risk in statistical estimation, yet are insufficient to provide tight lower bounds for \emph{interactive decision making} algorithms that collect data interactively (e.g., algorithms for bandits and reinforcement learning). Recent work of Foster et al. (2021, 2023) provides minimax lower bounds for interactive decision making using analysis techniques that appear different from the classical methods. These results -- which are proven using a complexity measure known as the \emph{Decision-Estimation Coefficient} (DEC) -- capture difficulties unique to interactive learning, yet do not recover the tightest known lower bounds for passive estimation. We propose a unified view of these distinct methodologies through a new lower bound approach called the \emph{interactive Fano method}. As an application, we introduce a novel complexity measure, the \emph{Fractional Covering Number}, which facilitates new lower bounds for interactive decision making that extend the DEC methodology by incorporating the complexity of estimation. Using the fractional covering number, we (i) provide a unified characterization of learnability for \emph{any} stochastic bandit problem, and (ii) close the remaining gap between the upper and lower bounds in Foster et al. (2021, 2023) (up to polynomial factors) for any interactive decision making problem in which the underlying model class is convex.
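
For orientation, the classical Fano method named in the abstract can be illustrated numerically. The sketch below uses the textbook form of Fano's inequality for a uniform prior over m hypotheses, P(error) >= 1 - (I + log 2) / log m, together with the standard bound of the mutual information by the largest pairwise KL divergence. It is not the paper's interactive Fano method, and the function names and the Gaussian example are assumptions introduced here for illustration only.

```python
import math


def fano_error_lower_bound(mutual_information_nats: float, num_hypotheses: int) -> float:
    """Textbook Fano bound on the error probability of any test.

    For theta uniform over m hypotheses and data Y:
        P(error) >= 1 - (I(theta; Y) + log 2) / log m.
    The result is clipped at 0 because the bound is vacuous when negative.
    """
    if num_hypotheses < 2:
        raise ValueError("Fano's inequality needs at least two hypotheses")
    bound = 1.0 - (mutual_information_nats + math.log(2)) / math.log(num_hypotheses)
    return max(0.0, bound)


def gaussian_kl(num_samples: int, mean_gap: float, sigma: float) -> float:
    """KL divergence between n i.i.d. draws from N(a, sigma^2) and N(b, sigma^2),
    where mean_gap = |a - b|. Equals n * gap^2 / (2 * sigma^2)."""
    return num_samples * mean_gap ** 2 / (2.0 * sigma ** 2)


if __name__ == "__main__":
    # Hypothetical setup: 64 candidate means whose largest pairwise gap is 0.5,
    # observed through 10 unit-variance Gaussian samples.
    m, n, max_gap, sigma = 64, 10, 0.5, 1.0
    # I(theta; Y) is at most the largest pairwise KL divergence.
    info_upper = gaussian_kl(n, max_gap, sigma)
    print(fano_error_lower_bound(info_upper, m))
```

Plugging an upper bound on the mutual information into the first function still yields a valid lower bound on the error probability, since the bound is decreasing in the information term.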

Paper Structure (88 sections, 44 theorems, 265 equations, 2 algorithms)

Introduction
Contributions
Interactive lower bound framework (\ref{sec:general-lower-bounds}).
Fractional covering number and bandit learnability (\ref{sec:logp}).
Preliminaries
Statistical Estimation and Interactive Decision Making
Interactive Statistical Decision Making.
Statistical estimation
Interactive decision making
Background on Lower Bound Techniques
Minimax bounds for statistical estimation.
Lower bounds for interactive learning.
Decision-Estimation Coefficient.
Additional related work.
A General Lower Bound
...and 73 more sections

Key Result

Proposition 1

Consider the statistical estimation setting (sec:statistical-estimation) with parameter space $\Theta$. Suppose that there exist $\theta_1,\dots, \theta_m\in\Theta$ such that the following separation condition holds: Let $\mu$ be the uniform distribution over $\{\theta_1,\cdots,\theta_m\}$, and let $I_{\mu}(\theta;Y)$ denote the mutual information of $(\theta,Y)\sim \mathbb{P}_\mu$ generated by $

Theorems & Definitions (74)

Example 1: Mean estimation
Example 2: Functional estimation
Example 3: Density estimation
Example 4: Reward maximization
Proposition 1: Classical Fano method
Theorem 2: Interactive Fano method
Proposition 3: Recovering the generalized Fano method
Proof: Proof of \ref{coro generalized Fano}
Proposition 4: Recovering Le Cam's convex hull method
Proof: Proof of \ref{lem:mix-vs-mix}
...and 64 more
