The radius of statistical efficiency

Joshua Cutler; Mateo Díaz; Dmitriy Drusvyatskiy

TL;DR

This work introduces the radius of statistical efficiency (RSE), a robustness measure that quantifies how close a statistical estimation problem is to ill-posedness: it is the size of the smallest Wasserstein-2 perturbation of the data distribution that makes the Fisher-information-like matrix singular. The paper establishes a reciprocal relationship between RSE and the intrinsic statistical difficulty of the problem, as captured by the minimal eigenvalue of that matrix, paralleling the classical Eckart–Young theorem in numerical linear algebra. The authors develop a general theory based on tilt-stable minimizers and slope-based error bounds, and compute RSE up to numerical constants for PCA, generalized linear models, phase retrieval, bilinear sensing, and matrix completion. Across these problems, RSE delimits the neighborhood in which a problem stays well-posed and shows how problem structure and data perturbations govern estimation difficulty, giving a unified framework for robustness in statistics. The framework also introduces techniques for spectral functions of measures, reducing infinite-dimensional problems to tractable matrix-function computations via Wasserstein/Bures reductions.
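To make the idea of a data perturbation that destroys identifiability concrete, here is a minimal sketch under stated assumptions. It assumes a least-squares setting in which the Fisher-information-like matrix is the empirical second-moment matrix of the features; the function name `project_out_weakest_direction` and the synthetic data are hypothetical and not taken from the paper. The sketch moves every sample onto the hyperplane orthogonal to the weakest eigen-direction and checks that the root-mean-square displacement equals $\sqrt{\lambda_{\min}}$. Because moving each sample deterministically is one admissible coupling, this displacement only upper-bounds the Wasserstein-2 distance to a singular problem; it does not by itself show that no cheaper perturbation exists.

```python
import numpy as np


def project_out_weakest_direction(X):
    """Project samples onto the hyperplane orthogonal to the eigenvector
    of the empirical second-moment matrix with the smallest eigenvalue.

    Returns the perturbed samples, the root-mean-square displacement
    (the transport cost of this particular map), and lambda_min.
    """
    n = X.shape[0]
    second_moment = X.T @ X / n
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(second_moment)
    lam_min, v = eigvals[0], eigvecs[:, 0]
    # Remove the component of every sample along v.
    X_perturbed = X - np.outer(X @ v, v)
    cost = np.sqrt(np.mean(np.sum((X - X_perturbed) ** 2, axis=1)))
    return X_perturbed, cost, lam_min


# Synthetic features with one weakly excited direction (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.2])

X_perturbed, cost, lam_min = project_out_weakest_direction(X)
new_lam_min = np.linalg.eigvalsh(X_perturbed.T @ X_perturbed / len(X))[0]

print("transport cost of the projection:", cost)
print("sqrt(lambda_min) before perturbation:", np.sqrt(lam_min))
print("lambda_min after perturbation:", new_lam_min)
```

The two printed quantities `cost` and `np.sqrt(lam_min)` agree up to rounding, and the perturbed second-moment matrix has a zero eigenvalue, so a small minimal eigenvalue means a small perturbation already suffices to make the problem singular.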

Abstract

Classical results in asymptotic statistics show that the Fisher information matrix controls the difficulty of estimating a statistical model from observed data. In this work, we introduce a companion measure of robustness of an estimation problem: the radius of statistical efficiency (RSE) is the size of the smallest perturbation to the problem data that renders the Fisher information matrix singular. We compute RSE up to numerical constants for a variety of testbed problems, including principal component analysis, generalized linear models, phase retrieval, bilinear sensing, and matrix completion. Interestingly, we observe a precise reciprocal relationship between RSE and the intrinsic complexity/sensitivity of the problem instance, paralleling the classical Eckart-Young theorem in numerical analysis. To establish our results, we develop theory for spectral functions of measures that extends well-known results from matrix analysis and eigenvalue optimization, a contribution that may be of interest beyond our immediate findings.
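The Eckart-Young theorem referenced above can be checked numerically. The following is a small sketch of the matrix case only, not of the paper's measure-valued setting: for an invertible square matrix $A$, the operator-norm distance to the nearest singular matrix is the smallest singular value $\sigma_{\min}(A)$, which is the reciprocal of $\|A^{-1}\|$. The helper name `nearest_singular` and the random test matrix are illustrative assumptions. The code constructs one singular matrix at distance $\sigma_{\min}(A)$; that nothing closer exists is the content of the theorem and is not verified here.

```python
import numpy as np


def nearest_singular(A):
    """Zero out the smallest singular value of A.

    Returns the resulting singular matrix B and sigma_min(A).
    """
    U, s, Vt = np.linalg.svd(A)  # singular values in descending order
    s_truncated = s.copy()
    s_truncated[-1] = 0.0
    B = U @ np.diag(s_truncated) @ Vt
    return B, s[-1]


rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))  # almost surely invertible

B, sigma_min = nearest_singular(A)
distance = np.linalg.norm(A - B, 2)              # operator-norm distance to B
inv_norm = np.linalg.norm(np.linalg.inv(A), 2)   # sensitivity of solving Ax = b

print("distance to the singular matrix B:", distance)
print("sigma_min(A):", sigma_min)
print("sigma_min(A) * ||A^{-1}||:", sigma_min * inv_norm)
```

The product on the last line is 1 up to rounding, which is the reciprocal relationship between the distance to ill-posedness and the sensitivity of the problem that the abstract describes for RSE.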

Paper Structure (50 sections, 22 theorems, 218 equations, 1 table)

Introduction
Linear regression.
Principal component analysis (PCA).
Outline
Related work
Local minimax lower bounds in estimation.
Radius theorems.
Conditioning and radius theorems in recovery problems.
Error bounds.
Notation
Linear algebra.
Probability theory.
The distance to ill-conditioned problems
Lagrangian characterization.
Intrinsic characterization.
...and 35 more sections

Key Result

Proposition 2.1

Fix a set $\mathcal{Q}'\subset\mathcal{Q}$ and suppose that for any sequence of measures $\nu_i\in \mathcal{Q}'\setminus \mathcal{E}$ the implication holds: […] Then for any measure $\mu\in \mathcal{Q}'\setminus \mathcal{E}$ and any radius $0<r<{\rm RSE}(\mu)$, we have […] Moreover, if for some $c,q>0$, the inequality ${\rm RSE}(\nu)^q\leq c\cdot {\rm REG}(\nu)^{-1}$ holds for all $\nu \in \mathcal{Q}'\ldots$

Theorems & Definitions (39)

Proposition 2.1: RSE as a robustness measure
Lemma 3.1
Lemma 3.2
Lemma 4.1
Lemma 5.1
Theorem 5.2
Lemma 5.3
Lemma 5.4
Lemma A.1: $W_p$ metric & projections
proof
...and 29 more
