Nonparametric MLE for Gaussian Location Mixtures: Certified Computation and Generic Behavior

Yury Polyanskiy, Mark Sellke

TL;DR

This paper analyzes the nonparametric maximum likelihood estimator (NPMLE) for Gaussian location mixtures in one dimension, showing that while the NPMLE can have up to $n$ atoms, generic random data yield a well-behaved, certifiably computable structure. The key contributions are proofs of almost-sure strictness of Lindsay’s stationarity conditions, absolute continuity of the NPMLE’s law on the $k$-atom manifold, and near-optimal local landscape properties that imply linear convergence of EM and locally quadratic convergence of Newton's method near the NPMLE. The authors introduce a certified computation framework: an $\varepsilon$-grid approximation followed by atom-merging yields a certifiable Shub–Smale approximate NPMLE with provable Wasserstein error $O_X(\varepsilon^{1/4})$, together with a method to exactly certify the atom count and a finite-time algorithm that obtains these guarantees. They also extend the discussion to static-support NPMLE with analogous certifiable guarantees, and show that in higher dimensions ($d\ge 2$) the NPMLE can exhibit unbounded support size, highlighting qualitative differences from the one-dimensional setting.

Abstract

We study the nonparametric maximum likelihood estimator $\widehat{\pi}$ for Gaussian location mixtures in one dimension. It has been known since (Lindsay, 1983) that given an $n$-point dataset, this estimator always returns a mixture with at most $n$ components, and more recently (Wu-Polyanskiy, 2020) gave a sharp $O(\log n)$ bound for subgaussian data. In this work we study computational aspects of $\widehat{\pi}$. We provide an algorithm which for small enough $\varepsilon>0$ computes an $\varepsilon$-approximation of $\widehat{\pi}$ in Wasserstein distance in time $K+Cnk^2\log\log(1/\varepsilon)$. Here $K$ is data-dependent but independent of $\varepsilon$, while $C$ is an absolute constant and $k=|\mathrm{supp}(\widehat{\pi})|\leq n$ is the number of atoms in $\widehat{\pi}$. We also certifiably compute the exact value of $|\mathrm{supp}(\widehat{\pi})|$ in finite time. These guarantees hold almost surely whenever the dataset $(x_1,\dots,x_n)\in [-cn^{1/4},cn^{1/4}]$ consists of independent points from a probability distribution with a density (relative to Lebesgue measure). We also show the distribution of $\widehat{\pi}$ conditioned to be $k$-atomic admits a density on the associated $2k-1$ dimensional parameter space for all $k\leq \sqrt{n}/3$, and almost sure locally linear convergence of the EM algorithm. One key tool is a classical Fourier analytic estimate for non-degenerate curves.
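To make the grid-based starting point concrete, here is a minimal sketch of EM for the mixing weights of a unit-variance Gaussian location mixture whose atoms are restricted to a fixed grid. It is not the paper's certified algorithm (there is no atom-merging, no Newton refinement, and no certificate); the function names `gaussian_kernel`, `grid_em`, and `neg_log_lik`, the grid size, and the iteration count are all illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(x, grid):
    """Return A with A[i, j] = phi(x_i - grid_j), phi the N(0, 1) density."""
    diff = x[:, None] - grid[None, :]
    return np.exp(-0.5 * diff**2) / np.sqrt(2.0 * np.pi)


def grid_em(x, grid, iters=500):
    """EM over mixing weights only; atom locations stay fixed on `grid`."""
    A = gaussian_kernel(x, grid)
    w = np.full(grid.size, 1.0 / grid.size)  # start from uniform weights
    for _ in range(iters):
        mix = A @ w  # mixture density at each data point
        # EM update: w_j <- w_j * (1/n) * sum_i A[i, j] / mix_i
        w = w * (A.T @ (1.0 / mix)) / x.size
    return w


def neg_log_lik(x, grid, w):
    """Average negative log-likelihood of the grid mixture with weights w."""
    return -np.mean(np.log(gaussian_kernel(x, grid) @ w))


# Hypothetical usage on synthetic data (seed and sizes are arbitrary).
rng = np.random.default_rng(0)
x = rng.normal(size=50)
grid = np.linspace(x.min(), x.max(), 41)
w = grid_em(x, grid, iters=200)
print("weights sum:", w.sum())
print("grid points with weight > 1e-3:", int(np.sum(w > 1e-3)))
```

Each EM update keeps the weights on the probability simplex and does not increase the negative log-likelihood, so the result is a valid mixing distribution on the grid. Weight typically concentrates on a few clusters of neighboring grid points, which is the kind of structure an atom-merging step would then consolidate.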

Paper Structure

This paper contains 26 sections, 50 theorems, 161 equations.

Key Result

Proposition 1.1

For any $\pi$ and $\pi'$, denoting $\pi_t=(1-t)\pi+t\pi'$, we have [...]. The minimizer $\widehat{\pi}$ of (eq:NPMLE-def) is unique and $k$-atomic for some $k\leq n$, and satisfies, for all $y\in{\rm supp}(\widehat{\pi})$: [...]. Moreover, $D_{\widehat{\pi},X}(y)\leq 1$ for all $y\in{\mathbb{R}}$, so that [...]
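The bound $D_{\widehat{\pi},X}(y)\leq 1$ can be checked numerically on toy data. The sketch below assumes the standard gradient function from Lindsay's theory, $D_{\pi,X}(y)=\frac{1}{n}\sum_{i}\varphi(x_i-y)/f_\pi(x_i)$ with $f_\pi(x)=\sum_j w_j\varphi(x-a_j)$ and $\varphi$ the standard normal density. The displayed definition did not survive in this summary, so treat that formula, and the helper names `phi` and `gradient_function`, as assumptions.

```python
import numpy as np


def phi(z):
    """Standard normal density."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)


def gradient_function(y, x, atoms, weights):
    """Assumed form: D(y) = (1/n) * sum_i phi(x_i - y) / f_pi(x_i)."""
    x = np.asarray(x, dtype=float)
    y = np.atleast_1d(np.asarray(y, dtype=float))
    atoms = np.asarray(atoms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mix = phi(x[:, None] - atoms[None, :]) @ weights  # f_pi at each x_i
    return (phi(x[:, None] - y[None, :]) / mix[:, None]).mean(axis=0)


ys = np.linspace(-5.0, 5.0, 1001)

# Candidate: a single atom at 0 for the two-point dataset {-0.5, 0.5}.
d_close = gradient_function(ys, [-0.5, 0.5], [0.0], [1.0])
print("close pair, sup D <= 1:", bool(d_close.max() <= 1.0 + 1e-12))

# The same candidate for {-2, 2} violates the bound, so it is not the NPMLE.
d_far = gradient_function(ys, [-2.0, 2.0], [0.0], [1.0])
print("far pair, sup D > 1:", bool(d_far.max() > 1.0))
```

For the dataset $\{-a,a\}$ with a single atom at $0$, this formula gives $D(y)=e^{-y^2/2}\cosh(ay)$. That stays at most $1$ when $a\leq 1$, with equality only at $y=0$, and it exceeds $1$ for $a=2$ (at $y=1$, for instance). A grid check of this kind only suggests that the condition holds; it does not certify it.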

Theorems & Definitions (99)

  • Proposition 1.1
  • Theorem 1.2
  • Theorem 1.3
  • Proposition 1.4: polyanskiy2020self
  • Corollary 1.5
  • Remark 1.6
  • Theorem 1.7
  • Definition 1
  • Proposition 1.8
  • proof
  • ...and 89 more