Gaussian Mixture Model with unknown diagonal covariances via continuous sparse regularization

Romane Giard, Yohann de Castro, Clément Marteau

TL;DR

This work develops a convex Beurling-LASSO (BLASSO) approach for Gaussian mixtures with unknown diagonal covariances by lifting the parameters to the space of Radon measures and using a normalization that yields a kernel-based, non-translation-invariant framework. It proves non-asymptotic recovery guarantees via non-degenerate dual certificates built within a Fisher-Rao geometric framework, and it introduces a semi-distance aligned with the kernel to define near and far regions and to control parameter recovery. The results give nearly parametric convergence rates for component means, diagonal covariances, and weights, along with density prediction guarantees, and they establish that the BLASSO solution is sparse for large sample sizes under a Non-Degenerate Source Condition. The methodology goes beyond prior off-the-grid BLASSO work by accommodating component-specific diagonal covariances and by deriving explicit separation conditions tied to the kernel geometry, with implications for mixture learning and density estimation in higher dimensions.
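
The lifting step mentioned above can be written out as a short worked equation. This is a sketch of the standard construction, not the paper's exact formulation: the symbols $\mu^0$, $f^0$ and $(t_j^0,u_j^0)$ appear in the paper's statements, while the weights $\omega_j^0$ and the component density $\varphi_{t,u}$ (a Gaussian with mean $t$ and diagonal covariance determined by $u$) are notation assumed here for illustration. The paper's normalization may rescale the weights or the kernel.

```latex
\mu^0 = \sum_{j=1}^{s} \omega_j^0 \, \delta_{(t_j^0,\, u_j^0)},
\qquad
f^0(x) = \sum_{j=1}^{s} \omega_j^0 \, \varphi_{t_j^0,\, u_j^0}(x)
       = \int \varphi_{t,u}(x) \, \mathrm{d}\mu^0(t,u).
```

The mixture density is linear in the measure $\mu^0$ even though it is nonlinear in the individual parameters $(t_j^0,u_j^0)$. This is what allows a convex program over measures, with a sparsity-promoting penalty, to estimate the number of components and their parameters at once.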

Abstract

This paper addresses the statistical estimation of Gaussian Mixture Models (GMMs) with unknown diagonal covariances from independent and identically distributed samples. We employ the Beurling-LASSO (BLASSO), a convex optimization framework that promotes sparsity in the space of measures, to simultaneously estimate the number of components and their parameters. Our main contribution extends the BLASSO methodology to multivariate GMMs with component-specific unknown diagonal covariance matrices, a significantly more flexible setting than previous approaches requiring known and identical covariances. We establish non-asymptotic recovery guarantees with nearly parametric convergence rates for component means, diagonal covariances, and weights, as well as for density prediction. A key theoretical contribution is the identification of an explicit separation condition on mixture components that enables the construction of non-degenerate dual certificates, which are essential tools for establishing statistical guarantees for the BLASSO. Our analysis leverages the Fisher-Rao geometry of the statistical model and introduces a novel semi-distance adapted to our framework, providing new insights into the interplay between component separation, parameter space geometry, and achievable statistical recovery.
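
To make the statistical model concrete, the following minimal sketch samples from a Gaussian mixture with component-specific diagonal covariances and evaluates its density. It illustrates only the data-generating model described in the abstract, not the BLASSO estimator. The function names `sample_diag_gmm` and `diag_gmm_density` and all parameter values are hypothetical, and the per-coordinate spread is parametrized by standard deviations, which is an assumption made here.

```python
import numpy as np


def sample_diag_gmm(weights, means, stds, n, rng):
    """Draw n i.i.d. samples from a Gaussian mixture with diagonal covariances.

    weights: (s,) mixture weights summing to 1
    means:   (s, d) component means
    stds:    (s, d) per-coordinate standard deviations (assumed parametrization)
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # Pick a component for each sample, then scale and shift standard noise.
    labels = rng.choice(len(weights), size=n, p=weights)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[labels] + stds[labels] * noise, labels


def diag_gmm_density(x, weights, means, stds):
    """Evaluate the mixture density at each row of x (shape (n, d))."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    d = means.shape[1]
    # Standardized residuals for every (sample, component) pair: shape (n, s, d).
    z = (x[:, None, :] - means[None, :, :]) / stds[None, :, :]
    # Log-density of each diagonal Gaussian component: shape (n, s).
    log_comp = (
        -0.5 * np.sum(z**2, axis=2)
        - np.sum(np.log(stds), axis=1)
        - 0.5 * d * np.log(2.0 * np.pi)
    )
    return np.exp(log_comp) @ weights


# Illustrative two-component mixture in dimension d = 2 (values are made up).
rng = np.random.default_rng(0)
weights = [0.3, 0.7]
means = [[0.0, 0.0], [4.0, -2.0]]
stds = [[1.0, 0.5], [0.8, 2.0]]
samples, labels = sample_diag_gmm(weights, means, stds, n=1000, rng=rng)
density = diag_gmm_density(samples, weights, means, stds)
```

Each component here has its own vector of standard deviations, which is the "component-specific diagonal covariance" setting. The earlier setting of known and identical covariances would correspond to fixing `stds` to one shared, known value.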

Paper Structure

This paper contains 76 sections, 47 theorems, 350 equations, 1 figure.

Key Result

Theorem 1.1

Assume that the particles $(t_j^0,u_j^0)$ of $\mu^0$ are sufficiently separated, where the minimal separation constraint depends only on the dimension $d$, the sparsity index $s$, bounds on the variance, and the choice of a smoothing parameter $\tau \leq u_{\min}$. Choosing as regularization parameter ...

Figures (1)

  • Figure 1: Schematic representation for Gaussian mixture models in dimension $d=1$. Both parameters $u$ (standard deviation) and $t$ (mean) are in dimension $1$, resulting in a 2-dimensional plot in location space $(t,u)$. The discs represent the near regions, shown schematically: these regions correspond to balls defined with respect to a semi-distance, not the Euclidean distance. The hatched area corresponds to the far region.

Theorems & Definitions (99)

  • Remark 1.1
  • Theorem 1.1: Recovery guarantees for the estimation of $\mu_\omega^0$, informal result
  • Remark 1.2
  • Theorem 1.2: Recovery guarantees for the prediction of $f^0$, informal results
  • Theorem 1.3: Sparsity of the estimator for a large sample size, informal result
  • Definition 2.1: Radon measure on $A \subset \mathbb{R}^p$
  • Remark 2.1
  • Definition 3.1: Near and far regions
  • Definition 3.2: Global non-degenerate certificate
  • Definition 3.3: Local non-degenerate certificates
  • ...and 89 more