Reverse Information Projections and Optimal E-statistics

Tyron Lardy; Peter Grünwald; Peter Harremoës

TL;DR

The classical reverse information projection (RIPr) is undefined when the information divergence $D(P\|\mathcal{C})$ is infinite. The paper addresses this limitation by introducing a description-gain framework $D(P\|Q\to Q')$ and a corresponding extended RIPr. It proves existence and uniqueness of a limiting measure $\hat{Q}$ that generalizes the RIPr, and shows that when $D(P\|\mathcal{C})$ is finite, this extended RIPr agrees with the classical one. The extended RIPr also yields e-statistics that are optimal in a GRO-like sense for the extended setting. The paper then develops a greedy algorithm to approximate the RIPr, and analyzes convex-hull variants and approximation rates, linking these to strongest e-statistics. The results give a framework for hypothesis testing with e-values under optional stopping, including relations between convergence rates and conditions under which the RIPr is a strict sub-probability measure. Possible extensions include Poissonized constructions and Rényi-based projections, which would carry the approach to a wider class of hypothesis-testing problems.

Abstract

Information projections have found important applications in probability theory, statistics, and related areas. In the field of hypothesis testing in particular, the reverse information projection (RIPr) has recently been shown to lead to growth-rate optimal (GRO) e-statistics for testing simple alternatives against composite null hypotheses. However, the RIPr as well as the GRO criterion are undefined whenever the infimum information divergence between the null and alternative is infinite. We show that in such scenarios, under some assumptions, there still exists a measure in the null that is closest to the alternative in a specific sense. Whenever the information divergence is finite, this measure coincides with the usual RIPr. It therefore gives a natural extension of the RIPr to certain cases where the latter was previously not defined. This extended notion of the RIPr is shown to lead to optimal e-statistics in a sense that is a novel, but natural, extension of the GRO criterion. We also give conditions under which the (extension of the) RIPr is a strict sub-probability measure, as well as conditions under which an approximation of the RIPr leads to approximate e-statistics. For this case we provide tight relations between the corresponding approximation rates.
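To make the link between the RIPr and e-statistics concrete, the following is a minimal sketch for the simplest finite case only, where the information divergence is finite. It is not the paper's extended construction or its greedy algorithm. The distributions `p`, `q1`, `q2` and the helper names `kl`, `mix`, and `ripr_weight` are illustrative assumptions: the null is taken to be the convex hull of two distributions on a three-point sample space, and the projection is found by a plain golden-section search over the mixture weight.

```python
import math

def kl(p, q):
    """Information divergence D(p||q) on a finite sample space (natural log)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mix(w, q1, q2):
    """Mixture w*q1 + (1-w)*q2, an element of the convex hull of {q1, q2}."""
    return [w * a + (1 - w) * b for a, b in zip(q1, q2)]

def ripr_weight(p, q1, q2, tol=1e-10):
    """Golden-section search for the weight w in [0, 1] minimising D(p || mix(w)).

    The objective is convex in w, so shrinking the bracket this way is valid.
    """
    lo, hi = 0.0, 1.0
    invphi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        a = hi - invphi * (hi - lo)
        b = lo + invphi * (hi - lo)
        if kl(p, mix(a, q1, q2)) < kl(p, mix(b, q1, q2)):
            hi = b  # minimiser lies in [lo, b]
        else:
            lo = a  # minimiser lies in [a, hi]
    return (lo + hi) / 2

# Hypothetical toy example: simple alternative p, null = convex hull of q1 and q2.
p = [0.3, 0.3, 0.4]
q1 = [0.6, 0.2, 0.2]
q2 = [0.2, 0.6, 0.2]

w_hat = ripr_weight(p, q1, q2)
q_hat = mix(w_hat, q1, q2)                      # numerical RIPr of p onto the null
e_stat = [pi / qi for pi, qi in zip(p, q_hat)]  # likelihood ratio p / q_hat

# E-statistic property: expectation at most 1 under every null element.
# Checking the two extreme points suffices, since the expectation is linear in Q.
null_expectations = [sum(qi * ei for qi, ei in zip(q, e_stat)) for q in (q1, q2)]

# Growth rate under the alternative, E_P[log E], which equals D(p || q_hat).
growth_rate = sum(pi * math.log(ei) for pi, ei in zip(p, e_stat))

print("weight:", w_hat)
print("null expectations:", null_expectations)
print("growth rate:", growth_rate)
```

In this toy example the two null distributions are mirror images of each other and `p` is symmetric, so the minimising weight is 0.5 up to numerical tolerance. The cases the paper targets, where the divergence is infinite, cannot be handled by direct minimisation like this, which is the motivation for the extended RIPr.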

Paper Structure (20 sections, 20 theorems, 124 equations, 1 algorithm)

Introduction
Contents and Overview
Background
Preliminaries
The Reverse Information Projection
E-statistics and Growth Rate Optimality
The Reverse Information Projection
Strict sub-probability measure
Greedy Approximation
Discussion
Optimal E-statistics
Convexity
Approximation
Related Work
Summary and Future Work
...and 5 more sections

Key Result

Theorem 1

If $P$ and all $Q\in\mathcal{C}$ are probability measures such that $D(P\| \mathcal{C})<\infty$, then there exists a unique (potentially sub-) probability measure $\hat{Q}$ such that:

Theorems & Definitions (48)

Theorem 1: Li (1999), Definition 4.2 and Theorem 4.3
Definition 1
Theorem 2: Grünwald et al. (2024), Theorem 1
Proposition 1
Theorem 3
Proposition 2
Proposition 3
Example 1
Proposition 4
Example 2
...and 38 more
