Inference on Gaussian mixture models with dependent labels
Seunghyun Lee, Rajarshi Mukherjee, Sumit Mukherjee
TL;DR
This work analyzes estimation in Gaussian mixture models with latent, potentially dependent labels. It shows that a naive estimator based on the misspecified iid likelihood remains $\sqrt{n}$-consistent under arbitrary label dependence, and it characterizes the information-theoretic limits when the latent labels follow an Ising model, revealing a phase transition at $\beta=1$. Under weak dependence ($\beta\le 1$) the iid-based estimator is optimal and its limiting variance matches the independent case, while under strong dependence ($\beta>1$) estimation becomes easier and the naive estimator is no longer optimal; here a mean-field variational estimator $\hat{\boldsymbol{\theta}}^{\text{MF}}_n$ achieves the smaller asymptotic variance $I_{\beta}(\boldsymbol{\theta}_0)^{-1}$, with optimality argued for a specific Ising model (CW). The paper also discusses unknown dependence strength, partial remedies, and connections to Hidden Markov Random Fields (HMRFs), and outlines directions for future study: more components, high dimensions, and non-mean-field graphs. Overall, it provides sharp asymptotic efficiency results and practical estimators for Gaussian mixtures with dependent labels, clarifying when dependence makes inference easier.
Abstract
Gaussian mixture models are widely used to model data generated from multiple latent sources. Despite their popularity, most theoretical research assumes that the labels are either independent and identically distributed or follow a Markov chain. It remains unclear how the fundamental limits of estimation change under more complex dependence. In this paper, we address this question for the spherical two-component Gaussian mixture model. We first show that for labels with arbitrary dependence, a naive estimator based on the misspecified likelihood is $\sqrt{n}$-consistent. Additionally, for labels that follow an Ising model, we establish the information-theoretic limits of estimation and discover an interesting phase transition as the dependence becomes stronger. When the dependence is below a threshold, the optimal estimator and its limiting variance exactly match the independent case, for a wide class of Ising models. Under stronger dependence, on the other hand, estimation becomes easier and the naive estimator is no longer optimal. We therefore propose an alternative estimator based on a variational approximation of the likelihood, and argue its optimality under a specific Ising model.
