
Empirical Bayes Estimation and Inference via Smooth Nonparametric Maximum Likelihood

Taehyun Kim, Bodhisattva Sen

Abstract

The empirical Bayes $g$-modeling approach via the nonparametric maximum likelihood estimator (NPMLE) is widely used for large-scale estimation and inference in the normal means problem, yet theoretical guarantees for uncertainty quantification remain scarce. A key obstacle is that the NPMLE of the mixing distribution is necessarily discrete, which yields discrete posterior credible sets and a deconvolution rate that is logarithmic. We address both limitations by studying a hierarchical Gaussian smoothing layer that restricts the mixing distribution to a Gaussian location mixture. The resulting smooth NPMLE is computed by solving a convex optimization problem and inherits the near-parametric denoising performance of the classical NPMLE. For deconvolution it achieves a polynomial rate of convergence which we show is asymptotically minimax over the corresponding class. The estimated smooth posteriors converge to the true posteriors at the same polynomial rate in weighted total variation distance. When the model is misspecified, the smooth NPMLE converges to the Kullback-Leibler projection of the true marginal density onto the model class at a nearly parametric rate, and the polynomial deconvolution and posterior convergence rates carry over to this pseudo-true target. Building on this smooth posterior, we characterize optimal marginal coverage sets: the shortest set-valued rules achieving a prescribed marginal coverage probability. Plug-in empirical Bayes marginal coverage sets based on the smooth NPMLE achieve asymptotically exact coverage at a polynomial rate and converge to the oracle optimal set in expected length. All results extend to heteroscedastic Gaussian observations. We also study identifiability of the proposed model and show that the largest Gaussian component of the prior is identifiable, and provide a consistent estimator and a finite-sample upper confidence bound for it.
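The abstract notes that the smooth NPMLE is computed by solving a convex optimization problem: the mixing distribution $H$ is restricted to atoms on a fixed grid, and the mixture log-likelihood is maximized over the grid weights. A minimal sketch of that idea, using the standard grid-based EM fixed-point iteration (a common solver for such problems, not necessarily the authors' implementation) and the two-point prior of Figure 1, where $\sigma_*^2 = c_*^2 + 1$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data matching Figure 1: theta_i ~ H* = (delta_{-2} + delta_{2})/2,
# a Gaussian smoothing layer with c_* = 1, and unit-variance Gaussian noise,
# so X_i | mu_i ~ N(mu_i, sigma_*^2) with sigma_*^2 = c_*^2 + 1.
n, c_star = 1000, 1.0
sigma2 = c_star**2 + 1.0
mu = rng.choice([-2.0, 2.0], size=n)
x = mu + rng.normal(scale=np.sqrt(sigma2), size=n)

# Grid-based smooth NPMLE: atoms of H on a fixed grid, maximize the mixture
# log-likelihood over the weights (a convex problem) via EM fixed-point steps.
grid = np.linspace(x.min(), x.max(), 300)
# Likelihood matrix: L[i, j] = N(x_i; grid_j, sigma_*^2)
L = np.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2 / sigma2)
L /= np.sqrt(2 * np.pi * sigma2)
w = np.full(grid.size, 1.0 / grid.size)
for _ in range(500):
    post = L * w                           # unnormalized posteriors over atoms
    post /= post.sum(axis=1, keepdims=True)
    w = post.mean(axis=0)                  # EM update of the mixing weights

# Estimated smooth prior density g_H: the fitted H convolved with N(0, c_*^2)
theta = np.linspace(-6.0, 6.0, 400)
K = np.exp(-0.5 * (theta[:, None] - grid[None, :]) ** 2 / c_star**2)
K /= np.sqrt(2 * np.pi * c_star**2)
g_hat = K @ w
```

Each EM step can only increase the log-likelihood, and because the objective is concave in the weights, the iteration converges to a global maximizer over the chosen grid.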

Paper Structure

This paper contains 42 sections, 20 theorems, 369 equations, 6 figures, 3 tables, 1 algorithm.

Key Result

Theorem 2.1

Suppose that \eqref{eq:hierarchical model} holds for all $i = 1, \ldots, n$, where $c_* > 0$. Recall that $\alpha_* := c_*^2 / \sigma_*^2$ (see \eqref{eq:alphastar}), where $\sigma_*^2 = c_*^2 + 1$. Let $\widehat{H}_n$ be any solution of \eqref{eq:NPMLE}. For any fixed $M \ge \sqrt{10 \sigma_*^2 \log n}$ and a nonempty, comp for all $t \ge 1$, with probability at least $1 - 2n^{-t^2}$. Moreover,

Figures (6)

  • Figure 1: We consider $G^* = N(-2,1)/2 + N(2,1)/2$ in \eqref{eq:original model} (equivalently, ${H^*} = \delta_{-2}/2 + \delta_{2}/2$ and $c_* = 1$ in \eqref{eq:hierarchical model}); here $\delta_x$ denotes the Dirac delta measure at $x$. The classical (discrete) NPMLE (obtained from model \eqref{eq:original model}) and the smooth NPMLE $g_{{\widehat{H}_n}}$ (from \eqref{eq:estimated prior density}) are computed using $n = 1000$ observations and shown in the left and center plots along with the true prior density $g_{H^*}$. For the smooth NPMLE, $c_*$ is also estimated using the neighborhood procedure described in Section \ref{sec:identifiability}. The true posterior densities at $x = \pm 2$ and the estimated posterior densities based on the smooth NPMLE are shown in the rightmost plot.
  • Figure 2: The setup is the same as in Figure \ref{fig:SNPML-TwoComp}, except that the true prior is $G^* = \mathrm{Laplace}(0,1)$. The true marginal density of the observations and the estimated marginal density based on the smooth NPMLE are shown in the rightmost plot. For the smooth NPMLE, $c_*$ is estimated using the neighborhood procedure described in Section \ref{sec:identifiability}.
  • Figure 3: Empirical Bayes analysis of school-specific regression coefficients in the math scores dataset. The panels show the histogram of observations, the smooth NPMLE, estimated 95% optimal marginal coverage sets, and standard frequentist confidence intervals $b_i \pm \sigma_i z_{0.975}$, along with empirical Bayes estimates.
  • Figure 4: Comparison of HPD sets (left) and optimal marginal coverage sets (center) under ${H^*} = \delta_{-2}/2 + \delta_{2}/2$ and $c_* = 3/5$ in \eqref{eq:hierarchical model}, with confidence level $1-\beta = 0.95$. The red line represents the oracle posterior mean $\mathbb{E}_{{H^*}}[\theta_i \mid X_i = x]$. (Right) Lengths of the HPD sets (solid) and of the optimal marginal coverage sets (dashed) as functions of $x$ (the expected lengths of both methods are also noted).
  • Figure 5: The true model is ${H^*} = \delta_{-2}/2 + \delta_{2}/2$ with $c_* = c_0 = 3/5$ under the hierarchical model \eqref{eq:hierarchical model}. (Left) Average length and (right) coverage of the estimated optimal marginal coverage sets based on the NPMLE using $n = 1000$ observations, when $c \in (0, 2)$ is used instead of $c_0$, with $\beta = 0.05$. The dashed horizontal lines represent the length and coverage of the $(1-\beta)$ standard confidence interval $X_i \pm z_{1-\beta/2}$. The dashed vertical lines represent $c_0 = 3/5$.
  • ...and 1 more figure
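The estimated posterior densities in Figure 1's rightmost plot follow from conjugacy: under the smooth model, each prior component $N(\mu_j, c_*^2)$ combined with the $N(\theta, 1)$ likelihood yields a posterior that is itself a Gaussian mixture with component means $(c_*^2 x + \mu_j)/(c_*^2 + 1)$ and common variance $c_*^2/(c_*^2 + 1)$. A self-contained sketch under this assumption, using illustrative atoms and weights in place of a fitted $\widehat{H}_n$ (the true $H^*$ of Figure 1):

```python
import numpy as np

# Illustrative mixing measure standing in for a fitted smooth NPMLE:
# atoms and weights of H* = (delta_{-2} + delta_{2})/2, with c_* = 1.
c2 = 1.0                       # c_*^2
atoms = np.array([-2.0, 2.0])  # atoms mu_j of H
w = np.array([0.5, 0.5])       # mixing weights

def posterior_density(theta, x):
    """Posterior density of theta given X = x under the smooth model:
    theta | mu_j ~ N(mu_j, c_*^2) and X | theta ~ N(theta, 1), so the
    posterior is a Gaussian mixture with means (c^2 x + mu_j)/(c^2 + 1)
    and common variance c^2/(c^2 + 1)."""
    s2 = c2 + 1.0
    # Component responsibilities: proportional to w_j * N(x; mu_j, c^2 + 1)
    resp = w * np.exp(-0.5 * (x - atoms) ** 2 / s2) / np.sqrt(2 * np.pi * s2)
    resp /= resp.sum()
    means = (c2 * x + atoms) / s2
    var = c2 / s2
    comp = np.exp(-0.5 * (theta[:, None] - means[None, :]) ** 2 / var)
    comp /= np.sqrt(2 * np.pi * var)
    return comp @ resp

theta = np.linspace(-6.0, 6.0, 2001)
dens = posterior_density(theta, x=2.0)
```

Because the posterior is available in closed form, credible or coverage sets built from it are smooth sets in $\theta$, in contrast to the discrete posteriors produced by the classical NPMLE.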

Theorems & Definitions (43)

  • Theorem 2.1
  • Theorem 2.2
  • Theorem 2.3
  • Remark 2.1: Compact support
  • Theorem 2.4
  • Theorem 2.5
  • Remark 2.2: On assumption (A3)
  • Remark 2.3: Sub-Gaussianity of $p_{G^*}$
  • Theorem 3.1
  • Remark 3.1
  • ...and 33 more