
Gaussian mixtures and non-parametric likelihoods through the lens of statistical mechanics

Subhroshekhar Ghosh, Adityanand Guntuboyina, Satyaki Mukherjee, Hoang-Son Tran

Abstract

In this work, we investigate Gaussian Mixture Models (GMM) and the related problem of nonparametric maximum likelihood estimation (NPMLE) from the perspective of statistical mechanics. We establish stability guarantees for the NPMLE procedure that extend well beyond the state of the art. Crucially, we obtain guarantees on the Kullback-Leibler (KL) divergence between NPMLE estimators and the ground truth, a type of result that is known to be challenging in the literature on this problem. Specifically, we provide high-probability upper bounds on the KL divergence between the NPMLE and the true density of order $\min\big\{\frac{(\log n)^{d+2}}{n}, \frac{\log n}{\sqrt{n}}\big\}$, which cover a wide range of regimes for the relative sizes of $n$ and $d$. We obtain similar guarantees for approximate solutions to the NPMLE problem, addressing the realistic situation in which optimization algorithms must be stopped in finite time and therefore give access only to approximations of the true NPMLE. A technical cornerstone of our approach is an analysis of the function class complexity of logarithms of Gaussian mixture densities, which handles their unboundedness and could be of wider interest. We also establish correspondences between stability phenomena in the NPMLE problem and the concepts of chaos and multiple valleys in random energy landscapes of statistical mechanics models. We believe that these correspondences may be useful for a wide variety of random optimization problems in statistics and machine learning, especially the connections to the technical ingredients of concentration phenomena and Langevin dynamics for these models.
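To get a feel for how the two terms in the rate $\min\big\{\frac{(\log n)^{d+2}}{n}, \frac{\log n}{\sqrt{n}}\big\}$ trade off, the following minimal sketch evaluates both and reports which one attains the minimum. It is only an illustration: all constants are ignored (the abstract gives the order of the bound, not its constants), and the function names `kl_rate_order` and `active_term` are hypothetical, not taken from the paper.

```python
import math


def kl_rate_order(n: int, d: int) -> float:
    """Order of the KL bound from the abstract, with all constants dropped.

    Assumes n >= 2 so that log(n) > 0.
    """
    log_n = math.log(n)
    dimension_dependent = log_n ** (d + 2) / n   # (log n)^(d+2) / n
    dimension_free = log_n / math.sqrt(n)        # log n / sqrt(n)
    return min(dimension_dependent, dimension_free)


def active_term(n: int, d: int) -> str:
    """Report which of the two terms attains the minimum.

    The dimension-dependent term is the smaller one exactly when
    (log n)^(d+1) <= sqrt(n), i.e. when d is small relative to n.
    """
    log_n = math.log(n)
    if log_n ** (d + 1) <= math.sqrt(n):
        return "fast"   # (log n)^(d+2) / n is the smaller term
    return "slow"       # log n / sqrt(n) is the smaller term


if __name__ == "__main__":
    for d in (1, 2, 5, 10):
        for n in (10**3, 10**6, 10**9):
            print(f"d={d:2d} n={n:.0e} order={kl_rate_order(n, d):.3e} "
                  f"regime={active_term(n, d)}")
```

Under these simplifications, the nearly parametric term $(\log n)^{d+2}/n$ is the smaller one only when $(\log n)^{d+1} \le \sqrt{n}$. For larger $d$ the dimension-free term $\log n/\sqrt{n}$ takes over, which is consistent with the abstract's remark that the bound covers a range of comparative sizes of $n$ and $d$.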

Paper Structure (64 sections, 38 theorems, 258 equations)


Key Result

Theorem 2.1

Let $X_1, ..., X_n$ be i.i.d. samples from a GMM density $f_* \in \mathcal{M}$ whose mixing measure $\mu_*$ has support contained in a compact set $\Theta \subset \mathbb{R}^d$. Let $(\varepsilon_n)_{n \ge 1}$ be any sequence of positive numbers and let $\tilde{f}_n \in \mathcal{M}$ be any estimator Then we have:

Theorems & Definitions (67)

  • Theorem 2.1
  • Corollary 2.2
  • Definition 2.3
  • Theorem 2.4
  • Theorem 2.5
  • Theorem 2.6: Moment bounds of $\hat{L}_n$
  • Theorem 2.7: Fluctuations in NPMLE
  • Corollary 2.8
  • Remark
  • Lemma 4.1
  • ...and 57 more