The Sample Complexity of Simple Binary Hypothesis Testing

Ankit Pensia; Varun Jog; Po-Ling Loh

TL;DR

This work provides tight, finite-sample characterizations of the sample complexity of simple binary hypothesis testing under both Bayesian and prior-free formulations, linking the required sample size to divergences from the Jensen–Shannon and Hellinger families. A central technical contribution is an $f$-divergence inequality bridging $\text{JS}_{\boldsymbol{\nabla}}(p,q)$ and $\text{H}_{\bar{\lambda}}(p,q)$, which yields equivalence up to constant factors that do not depend on $p$, $q$, or the error parameters. The authors extend the core results to distributed, robust, sequential, and erasure settings, deriving both statistical and computational implications and showing how problems with information constraints or contamination can be solved efficiently by reducing them to the main Bayesian and prior-free framework. The paper also uncovers unexpected phenomena in the weak-detection regime, including prior-dependent behavior under nonuniform priors that challenges conventional asymptotic intuition, with precise bounds covering multiple regimes. Overall, the paper delivers a cohesive, operational framework that guides test selection and algorithm design across a range of hypothesis-testing scenarios.

Abstract

The sample complexity of simple binary hypothesis testing is the smallest number of i.i.d. samples required to distinguish between two distributions $p$ and $q$ in either: (i) the prior-free setting, with type-I error at most $\alpha$ and type-II error at most $\beta$; or (ii) the Bayesian setting, with Bayes error at most $\delta$ and prior distribution $(\pi, 1-\pi)$. This problem has only been studied when $\alpha = \beta$ (prior-free) or $\pi = 1/2$ (Bayesian), and the sample complexity is known to be characterized by the Hellinger divergence between $p$ and $q$, up to multiplicative constants. In this paper, we derive a formula that characterizes the sample complexity (up to multiplicative constants that are independent of $p$, $q$, and all error parameters) for: (i) all $0 \le \alpha, \beta \le 1/8$ in the prior-free setting; and (ii) all $\delta \le \pi/4$ in the Bayesian setting. In particular, the formula admits equivalent expressions in terms of certain divergences from the Jensen–Shannon and Hellinger families. The main technical result concerns an $f$-divergence inequality between members of the Jensen–Shannon and Hellinger families, which is proved by a combination of information-theoretic tools and case-by-case analyses. We explore applications of our results to (i) robust hypothesis testing, (ii) distributed (locally-private and communication-constrained) hypothesis testing, (iii) sequential hypothesis testing, and (iv) hypothesis testing with erasures.
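As a rough illustration of the Bayesian definition in the abstract, the sketch below computes the exact Bayes error for two Bernoulli distributions and scans for the smallest number of samples that meets a target error. The restriction to Bernoulli distributions, the function names `bayes_error` and `bayes_sample_complexity`, and the `n_max` cutoff are assumptions made for this example. None of them come from the paper, which characterizes the sample complexity by a formula rather than by brute-force search.

```python
from math import comb


def bayes_error(p: float, q: float, prior: float, n: int) -> float:
    """Bayes error of the optimal test between Bernoulli(p) and Bernoulli(q)
    from n i.i.d. samples, where `prior` is the prior probability of Bernoulli(p)."""
    err = 0.0
    for k in range(n + 1):
        # The number of ones, k, is a sufficient statistic for i.i.d. Bernoulli data.
        mass_p = comb(n, k) * p**k * (1 - p) ** (n - k)
        mass_q = comb(n, k) * q**k * (1 - q) ** (n - k)
        # The optimal rule picks the hypothesis with the larger prior-weighted mass,
        # so the error it incurs at this k is the smaller of the two weighted masses.
        err += min(prior * mass_p, (1 - prior) * mass_q)
    return err


def bayes_sample_complexity(p, q, prior, delta, n_max=10_000):
    """Smallest n <= n_max with Bayes error at most delta, or None if none is found.
    (n_max is an arbitrary cutoff chosen for this illustration.)"""
    for n in range(n_max + 1):
        if bayes_error(p, q, prior, n) <= delta:
            return n
    return None
```

A call such as `bayes_sample_complexity(0.4, 0.6, 0.5, 0.05)` scans upward from $n = 0$ and returns the first $n$ whose Bayes error is at most the target. With zero samples the error equals the smaller prior mass, so any target at or above that value is met at $n = 0$.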

Paper Structure (56 sections, 51 theorems, 190 equations)

Introduction
Our results
Sample complexity of Bayesian simple hypothesis testing
Sample complexity of prior-free simple hypothesis testing
Distributed simple binary hypothesis testing
Robust simple binary hypothesis testing
Large error probability regime: Weak detection
Related work
Preliminaries
Notation:
Problem definitions
Bayesian hypothesis testing
Prior-free hypothesis testing
Relation between these two problems
Proof of \ref{eq:reln-bayesian-prior-free-1}
...and 41 more sections

Key Result

Theorem 2.1

Let $p$ and $q$ be two arbitrary discrete distributions satisfying $\mathrm{h}^2(p,q) \leq 0.125$. Let $\pi \in (0,1/2]$ be the prior parameter and $\delta \in (0,\pi/4)$ be the desired average error probability.
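The assumptions listed in this theorem statement can be checked numerically for a given pair of discrete distributions. The sketch below is a hedged illustration: the function names are hypothetical, and the normalization of the squared Hellinger distance is an assumption, since this excerpt does not show which convention the paper uses for $\mathrm{h}^2$. The `halved` flag switches between $\sum_i (\sqrt{p_i} - \sqrt{q_i})^2$ and half of that quantity.

```python
from math import sqrt


def squared_hellinger(p, q, halved=False):
    """Squared Hellinger distance between two discrete distributions given as
    equal-length probability lists. `halved=True` applies the 1/2 normalization;
    which convention the paper uses is not shown in this excerpt."""
    total = sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(p, q))
    return total / 2 if halved else total


def in_theorem_regime(p, q, prior, delta, halved=False):
    """Check the stated assumptions: h^2(p, q) <= 0.125, prior in (0, 1/2],
    and delta in (0, prior / 4)."""
    return (
        squared_hellinger(p, q, halved) <= 0.125
        and 0 < prior <= 0.5
        and 0 < delta < prior / 4
    )
```

For example, `in_theorem_regime([0.5, 0.5], [0.6, 0.4], 0.3, 0.05)` tests a pair of nearby distributions with prior parameter 0.3 and target error 0.05.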

Theorems & Definitions (126)

Definition 1.0: Bayesian simple binary hypothesis testing
Definition 1.0: Prior-free simple binary hypothesis testing
Theorem 2.1: Bayesian simple hypothesis testing
Theorem 2.2: Prior-free simple hypothesis testing
Definition 2.3: Distributed simple binary hypothesis setting
Theorem 2.4: Statistical and computational costs of communication for hypothesis testing
Definition 2.5: Local differential privacy
Theorem 2.6: Computational costs of privacy for hypothesis testing
Definition 2.7: Robust Bayesian simple binary hypothesis testing
Corollary 2.8: Corollary of Theorem 2.1 and Huber65
...and 116 more
