An Empirical Bayes Perspective on Heteroskedastic Mean Estimation

Yanjun Han; Abhishek Shetty; Jacob Shkrob

Abstract

Towards understanding the fundamental limits of estimation from data of varied quality, we study the problem of estimating a mean parameter from heteroskedastic Gaussian observations where the variances are unknown and may vary arbitrarily across observations. While a simple linear estimator with known variances attains the smallest mean squared error, estimation without this knowledge is challenging due to the large number of nuisance parameters. We propose a simple and principled approach based on empirical Bayes: model the observations as if they were i.i.d. from a normal scale mixture and compute the profile maximum likelihood estimator (MLE) for the mean, treating the nonparametric mixing distribution as nuisance. Our result shows that this estimator achieves near-optimal error bounds across various heteroskedastic models in the literature. In particular, for the subset-of-signals problem where an unknown subset of observations has small variance, our estimator adaptively achieves the minimax rate for all signal sizes, including the sharp phase transition, without any tuning parameters. One of our key technical steps is a sharper metric entropy bound for normal scale mixtures, obtained via Chebyshev approximations on a transformed polynomial basis. This approach yields an improved polylogarithmic, rather than polynomial, dependence on the variance ratio, which could be of independent interest.
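The profile-likelihood idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the mixing distribution is restricted to a fixed geometric grid of scales in $[\sigma_{\min}, \sigma_{\max}]$, that the mixing weights are fitted by plain EM, and that the mean is searched over a grid spanning the interquartile range of the data. The function names `fit_scale_mixture` and `eb_profile_mle`, the grid sizes, and the search range are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_scale_mixture(residuals, sigma_grid, n_iter=100):
    """EM for mixing weights on a fixed grid of scales (an assumed discretization).

    Returns the fitted weights and the log-likelihood of the residuals under
    the mixture sum_j w_j N(0, sigma_grid[j]^2) at the final weights.
    """
    r = residuals[:, None]
    # log N(r_i; 0, s_j^2) for every observation i and grid scale s_j
    log_dens = (-0.5 * (r / sigma_grid) ** 2
                - np.log(sigma_grid) - 0.5 * np.log(2.0 * np.pi))
    # Subtract the row-wise maximum before exponentiating, for stability.
    shift = log_dens.max(axis=1, keepdims=True)
    dens = np.exp(log_dens - shift)
    w = np.full(sigma_grid.size, 1.0 / sigma_grid.size)
    for _ in range(n_iter):
        mix = dens @ w                              # mixture density per point
        w = w * (dens / mix[:, None]).mean(axis=0)  # standard EM weight update
    loglik = float(np.sum(np.log(dens @ w) + shift[:, 0]))
    return w, loglik


def eb_profile_mle(x, sigma_min, sigma_max, n_sigma=20, n_mu=201):
    """Grid-search the mean that maximizes the profiled mixture likelihood."""
    x = np.asarray(x, dtype=float)
    sigma_grid = np.geomspace(sigma_min, sigma_max, n_sigma)
    # Hypothetical search range: the interquartile range of the data.
    lo, hi = np.quantile(x, [0.25, 0.75])
    mu_grid = np.linspace(lo, hi, n_mu)
    profile = [fit_scale_mixture(x - mu, sigma_grid)[1] for mu in mu_grid]
    return float(mu_grid[int(np.argmax(profile))])


# Usage: 40 low-noise and 160 high-noise observations around a true mean of 3.
rng = np.random.default_rng(0)
sigma = np.concatenate([rng.uniform(0.7, 1.0, 40), rng.uniform(1.0, 150.0, 160)])
x = 3.0 + sigma * rng.standard_normal(200)
mu_hat = eb_profile_mle(x, sigma_min=0.7, sigma_max=150.0)
print(mu_hat)
```

For each candidate mean, the inner EM loop maximizes over the nuisance mixing weights, and the outer grid search keeps the candidate with the largest profiled log-likelihood. Bounding the scale grid below by $\sigma_{\min}$ keeps the likelihood from degenerating at individual data points.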

Paper Structure (48 sections, 16 theorems, 89 equations, 3 figures)

Introduction
Main results
Outline of the proof
Upper bound on the covering number
Upper bounding Hellinger modulus of continuity
Related work
Heteroskedastic mean estimation.
Robust statistics and breakdown point.
Empirical Bayes and compound decision theory.
Density estimation and sieve MLE theory.
Modulus of continuity and minimax lower bounds.
Computation.
Density Estimation: Proof of Lemma
Entropic Upper Bound of the MLE in the Compound Setting
Improved Covering Results for the Normal Scale Mixture
...and 33 more sections

Key Result

Theorem 1.1

Let $\sigma_i\in [\sigma_{\min}, \sigma_{\max}]$ for all $i\in [n]$. With probability at least $1-\delta$, the estimator $\widehat{\mu}^{\mathrm{EB}}$ in eq:EB_MLE achieves (the exact logarithmic factor is displayed in lemma:density_estimation) where $\omega_{H^2,G_n}(t)$ is the Hellinger modulus of continuity in the location family:

Figures (3)

Figure 1: Average absolute estimation errors for $\widehat{\mu}^{\mathrm{EB}}$, sample median, and iterative truncation over $N=150$ simulations, under the subset-of-signals prior $G_n=\frac{m}{n}\mathrm{Unif}([0.7,1]) + \frac{n-m}{n}\mathrm{Unif}([1,150])$, with different choices of $(m,n)$.
Figure 2: Average absolute estimation errors for $\widehat{\mu}^{\mathrm{EB}}$, sample median, and two oracle MLEs over $N=150$ simulations, under two-point and three-point scale mixture priors.
Figure 3: Average absolute estimation errors for $\widehat{\mu}^{\mathrm{EB}}$, sample median, and two oracle MLEs over $N=150$ simulations, under the equal-variance model where $P_1 = \dots = P_n = \mathcal{N}(0,5)$ and the quadratic-variance model $P_i = \mathcal{N}(0, \frac{i^2}{10})$ for $i\in [n]$.
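The simulation setup in Figure 1 can be sketched as follows. This is a sketch under stated assumptions and does not reproduce the figures. It treats draws from $G_n$ as standard deviations, uses exactly $m$ low-noise observations per run, and sets the true mean to $0$. It compares only the sample median with the known-variance linear estimator mentioned in the abstract, implemented here as the inverse-variance weighted mean. The EB estimator and the iterative truncation baseline are omitted, and all function names are hypothetical.

```python
import numpy as np


def sample_subset_of_signals(rng, m, n, mu=0.0):
    """Draw n observations: m with sigma in [0.7, 1], the rest with sigma in [1, 150]."""
    sigma = np.concatenate([
        rng.uniform(0.7, 1.0, size=m),
        rng.uniform(1.0, 150.0, size=n - m),
    ])
    x = mu + sigma * rng.standard_normal(n)
    return x, sigma


def oracle_weighted_mean(x, sigma):
    """Inverse-variance weighted mean; requires the true sigma_i (oracle knowledge)."""
    w = 1.0 / sigma ** 2
    return float(np.sum(w * x) / np.sum(w))


def average_abs_errors(m, n, n_sims=150, mu=0.0, seed=0):
    """Average absolute error of the sample median and the oracle weighted mean."""
    rng = np.random.default_rng(seed)
    median_err, oracle_err = [], []
    for _ in range(n_sims):
        x, sigma = sample_subset_of_signals(rng, m, n, mu)
        median_err.append(abs(np.median(x) - mu))
        oracle_err.append(abs(oracle_weighted_mean(x, sigma) - mu))
    return float(np.mean(median_err)), float(np.mean(oracle_err))


median_error, oracle_error = average_abs_errors(m=40, n=200)
print(median_error, oracle_error)
```

The oracle weighted mean serves only as a reference point, since it uses the variances that the estimators in the figures do not have access to.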

Theorems & Definitions (23)

Theorem 1.1
Corollary 1.2
Theorem 1.3
Theorem 1.4
Remark 1.5
Lemma 1.6
Definition 3.1: Bracketing Number
Lemma 3.2
Theorem 3.3
Lemma 3.4
...and 13 more
