An Empirical Bayes Perspective on Heteroskedastic Mean Estimation

Yanjun Han, Abhishek Shetty, Jacob Shkrob

Abstract

Towards understanding the fundamental limits of estimation from data of varied quality, we study the problem of estimating a mean parameter from heteroskedastic Gaussian observations where the variances are unknown and may vary arbitrarily across observations. While a simple linear estimator with known variances attains the smallest mean squared error, estimation without this knowledge is challenging due to the large number of nuisance parameters. We propose a simple and principled approach based on empirical Bayes: model the observations as if they were i.i.d. from a normal scale mixture and compute the profile maximum likelihood estimator (MLE) for the mean, treating the nonparametric mixing distribution as nuisance. Our result shows that this estimator achieves near-optimal error bounds across various heteroskedastic models in the literature. In particular, for the subset-of-signals problem where an unknown subset of observations has small variance, our estimator adaptively achieves the minimax rate for all signal sizes, including the sharp phase transition, without any tuning parameters. One of our key technical steps is a sharper metric entropy bound for normal scale mixtures, obtained via Chebyshev approximations on a transformed polynomial basis. This approach yields an improved polylogarithmic, rather than polynomial, dependence on the variance ratio, which could be of independent interest.
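
To make the empirical Bayes recipe concrete, the following Python sketch profiles the scale-mixture likelihood over candidate means. It is an illustration under stated assumptions, not the authors' implementation: the mixing distribution is restricted to a fixed grid of scales (`scales`), its weights are fitted by a plain EM iteration, and the mean is searched over a finite grid (`mu_grid`). The function names, grid choices, and toy data are all hypothetical.

```python
import numpy as np

def mixture_loglik(x, mu, scales, n_iter=100):
    """Maximize the log-likelihood of x - mu under a mixture of N(0, s^2), s in `scales`.

    Only the mixing weights are optimized (by EM); the scale grid is held fixed.
    This grid restriction is an assumption made for the sketch.
    """
    r = (x - mu)[:, None]                                   # residuals, shape (n, 1)
    logdens = (-0.5 * (r / scales) ** 2
               - np.log(scales) - 0.5 * np.log(2 * np.pi))  # shape (n, k)
    shift = logdens.max(axis=1, keepdims=True)              # per-row shift for stability
    dens = np.exp(logdens - shift)
    w = np.full(len(scales), 1.0 / len(scales))             # uniform initial weights
    for _ in range(n_iter):
        mix = dens @ w                                      # mixture density per point
        w = w * (dens / mix[:, None]).mean(axis=0)          # EM update of the weights
    mix = dens @ w
    return float(np.sum(np.log(mix) + shift[:, 0]))

def profile_mle_mean(x, scales, mu_grid):
    """Return the grid point in `mu_grid` with the largest profiled log-likelihood."""
    lls = [mixture_loglik(x, mu, scales) for mu in mu_grid]
    return float(mu_grid[int(np.argmax(lls))])

# Toy data (hypothetical): 40 precise observations and 160 very noisy ones.
rng = np.random.default_rng(0)
true_mu = 2.0
sigma = np.concatenate([np.full(40, 0.1), np.full(160, 20.0)])
x = true_mu + sigma * rng.standard_normal(sigma.size)

scales = np.geomspace(0.05, 50.0, 25)                       # assumed scale grid
center = np.median(x)
mu_grid = np.linspace(center - 3.0, center + 3.0, 121)      # assumed search window
mu_hat = profile_mle_mean(x, scales, mu_grid)
print(f"profile MLE: {mu_hat:.3f}, sample median: {center:.3f}")
```

The search window is centered at the sample median purely for convenience. A finer implementation would replace both grids with a proper optimizer, but the structure (inner maximization over the mixing distribution, outer maximization over the mean) is the same.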

Paper Structure

This paper contains 48 sections, 16 theorems, 89 equations, 3 figures.

Key Result

Theorem 1.1

Let $\sigma_i\in [\sigma_{\min}, \sigma_{\max}]$ for all $i\in [n]$. With probability at least $1-\delta$, the estimator $\widehat{\mu}^{\mathrm{EB}}$ in eq:EB_MLE achieves (the exact logarithmic factor is displayed in lemma:density_estimation) where $\omega_{H^2,G_n}(t)$ is the Hellinger modulus of continuity in the location family:

Figures (3)

  • Figure 1: Average absolute estimation errors for $\widehat{\mu}^{\mathrm{EB}}$, sample median, and iterative truncation over $N=150$ simulations, under the subset-of-signals prior $G_n=\frac{m}{n}\mathrm{Unif}([0.7,1]) + \frac{n-m}{n}\mathrm{Unif}([1,150])$, with different choices of $(m,n)$.
  • Figure 2: Average absolute estimation errors for $\widehat{\mu}^{\mathrm{EB}}$, sample median, and two oracle MLEs over $N=150$ simulations, under two-point and three-point scale mixture priors.
  • Figure 3: Average absolute estimation errors for $\widehat{\mu}^{\mathrm{EB}}$, sample median, and two oracle MLEs over $N=150$ simulations, under the equal variance model where $P_1 = \dots = P_n = \mathcal{N}(0,5)$ and the quadratic variance model $P_i = \mathcal{N}(0, \frac{i^2}{10})$ for $i\in [n]$.
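
The simulation setting of Figure 1 can be sketched as follows. This is a rough reproduction under assumptions the captions do not settle: $G_n$ is read as a distribution over standard deviations $\sigma_i$, exactly $m$ observations are given the small scales, the true mean is $0$, and the pair $(m,n)=(50,500)$ is a hypothetical choice. The inverse-variance weighted mean (the known-variance linear estimator mentioned in the abstract) is included only as a reference point; it does not appear in Figure 1, and $\widehat{\mu}^{\mathrm{EB}}$ and iterative truncation are not implemented here.

```python
import numpy as np

def sample_sigmas(rng, n, m):
    """Draw m small scales from Unif([0.7, 1]) and n - m large ones from Unif([1, 150])."""
    small = rng.uniform(0.7, 1.0, size=m)
    large = rng.uniform(1.0, 150.0, size=n - m)
    return np.concatenate([small, large])

def inverse_variance_mean(x, sigma):
    """Linear estimator with known variances: weights proportional to 1 / sigma_i^2."""
    w = 1.0 / sigma ** 2
    return float(np.sum(w * x) / np.sum(w))

def run(n=500, m=50, n_sims=150, seed=0):
    """Average absolute errors of the sample median and the oracle weighted mean."""
    rng = np.random.default_rng(seed)
    err_median, err_oracle = [], []
    for _ in range(n_sims):
        sigma = sample_sigmas(rng, n, m)
        x = rng.normal(0.0, sigma)          # true mean assumed to be 0
        err_median.append(abs(np.median(x)))
        err_oracle.append(abs(inverse_variance_mean(x, sigma)))
    return float(np.mean(err_median)), float(np.mean(err_oracle))

avg_median_err, avg_oracle_err = run()
print(f"median: {avg_median_err:.3f}, oracle weighted mean: {avg_oracle_err:.3f}")
```

The oracle estimator uses the true $\sigma_i$, so it serves as a lower benchmark for any method that must work without them.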

Theorems & Definitions (23)

  • Theorem 1.1
  • Corollary 1.2
  • Theorem 1.3
  • Theorem 1.4
  • Remark 1.5
  • Lemma 1.6
  • Definition 3.1: Bracketing Number
  • Lemma 3.2
  • Theorem 3.3
  • Lemma 3.4
  • ...and 13 more