
The radius of statistical efficiency

Joshua Cutler, Mateo Díaz, Dmitriy Drusvyatskiy

TL;DR

This work introduces the radius of statistical efficiency (RSE), a robustness measure that quantifies how close a statistical estimation problem is to ill-posedness: it is the size of the smallest Wasserstein-2 perturbation to the data distribution that makes the Fisher-information-like matrix singular. It establishes a reciprocal relationship between RSE and the intrinsic statistical difficulty captured by the minimal eigenvalue of the Fisher-information-like object, paralleling classical Eckart–Young-type results in numerical linear algebra. The authors develop a general theory based on tilt-stable minimizers and slope-based error bounds, and compute RSE up to numerical constants for PCA, generalized linear models, phase retrieval, bilinear sensing, and matrix completion. Across these problems, RSE quantifies well-posed neighborhoods and yields insight into how problem structure and data perturbations govern estimation difficulty, providing a unified, spectral-measure framework for robustness in statistics. The framework also introduces spectral-function techniques for measures, linking infinite-dimensional problems to tractable matrix-function computations via Wasserstein/Bures reductions.

Abstract

Classical results in asymptotic statistics show that the Fisher information matrix controls the difficulty of estimating a statistical model from observed data. In this work, we introduce a companion measure of robustness of an estimation problem: the radius of statistical efficiency (RSE) is the size of the smallest perturbation to the problem data that renders the Fisher information matrix singular. We compute RSE up to numerical constants for a variety of testbed problems, including principal component analysis, generalized linear models, phase retrieval, bilinear sensing, and matrix completion. Interestingly, we observe a precise reciprocal relationship between RSE and the intrinsic complexity/sensitivity of the problem instance, paralleling the classical Eckart-Young theorem in numerical analysis. To establish our results, we develop theory for spectral functions of measures that extends well-known results from matrix analysis and eigenvalue optimization, a contribution that may be of interest beyond our immediate findings.
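
The Eckart-Young parallel mentioned in the abstract can be illustrated in finite dimensions. The sketch below is only a matrix analogue, not the paper's Wasserstein-2 construction: for a square invertible matrix, the spectral-norm distance to the nearest singular matrix equals its smallest singular value, which is the reciprocal of the norm of its inverse. The matrix `info` and the helper names are hypothetical and chosen for illustration; NumPy is assumed to be available.

```python
import numpy as np


def distance_to_singularity(a):
    """Smallest singular value of a square matrix.

    By the Eckart-Young theorem, this is the spectral-norm distance
    from `a` to the set of singular matrices.
    """
    # np.linalg.svd returns singular values sorted in descending order.
    return float(np.linalg.svd(a, compute_uv=False)[-1])


def nearest_singular_matrix(a):
    """A closest singular matrix to `a`: zero out the smallest singular value."""
    u, s, vt = np.linalg.svd(a)
    s = s.copy()
    s[-1] = 0.0
    return u @ np.diag(s) @ vt


# Hypothetical symmetric positive definite "information-like" matrix.
info = np.array([[2.0, 0.0],
                 [0.0, 0.5]])

radius = distance_to_singularity(info)            # distance to ill-posedness
perturbed = nearest_singular_matrix(info)         # singular matrix attaining it
gap = np.linalg.norm(info - perturbed, 2)         # spectral norm of the perturbation
difficulty = np.linalg.norm(np.linalg.inv(info), 2)  # norm of the inverse

print(radius, gap, radius * difficulty)
```

In this toy instance the perturbation size `gap` matches `radius`, and `radius * difficulty` equals 1, which is the reciprocal relationship in its simplest matrix form. The paper's RSE replaces the matrix norm with a Wasserstein-2 distance between data distributions, so the analogy should be read loosely.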

Paper Structure (50 sections, 22 theorems, 218 equations, 1 table)


Key Result

Proposition 2.1

Fix a set $\mathcal{Q}'\subset\mathcal{Q}$ and suppose that for any sequence of measures $\nu_i\in \mathcal{Q}'\setminus \mathcal{E}$ the implication holds: Then for any measure $\mu\in \mathcal{Q}'\setminus \mathcal{E}$ and any radius $0<r<{\rm RSE}(\mu)$, we have Moreover, if for some $c,q>0$, the inequality ${\rm RSE}(\nu)^q\leq c\cdot {\rm REG}(\nu)^{-1}$ holds for all $\nu \in \mathcal{Q}'\

Theorems & Definitions (39)

  • Proposition 2.1: RSE as a robustness measure
  • Lemma 3.1
  • Lemma 3.2
  • Lemma 4.1
  • Lemma 5.1
  • Theorem 5.2
  • Lemma 5.3
  • Lemma 5.4
  • Lemma A.1: $W_p$ metric & projections
  • proof
  • ...and 29 more