Hausdorff consistency of MLE in folded normal and Gaussian mixtures

Koustav Mallik

TL;DR

This work develops a unified quotient (orbit) framework for likelihood-based estimation under finite symmetries, enabling existence and consistency proofs for maximizers in nonregular settings such as the folded normal and finite Gaussian mixtures. A key methodological device is an orbit-aware deterministic argmax stability lemma, which converts uniform convergence to orbit-Hausdorff convergence of (possibly set-valued) maximizers. For the folded normal, the authors derive a complete profile analysis with boundary coercivity, establish orbit-level consistency, and uncover a nonregular $n^{1/4}$ local rate at the symmetry point $\mu_0=0$. For Gaussian mixtures, they construct compact sieves with two-sided density envelopes and responsibility-based gradient bounds, proving sieve LLNs and orbit-Hausdorff consistency, and they show that ridge-penalization yields coercivity guaranteeing existence and consistency even when penalties vanish slowly. The framework also provides practical guidance on sieve sizes, penalties, and EM-style optimization, and offers extensions to growing sieves and order selection with a focus on orbit-invariant inference.
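To make the orbit viewpoint concrete, the sketch below computes a distance between two mixture parameters modulo label permutations. It is only an illustration: the function name `min_matching_distance`, the `(weight, mean, sd)` tuple layout, and the choice of a max-over-components Euclidean cost are assumptions made here, not definitions taken from the paper.

```python
import itertools
import math


def min_matching_distance(theta_a, theta_b):
    """Distance between two K-component mixture parameters modulo relabeling.

    Each parameter is a list of (weight, mean, sd) tuples. The cost of a
    matching is the largest Euclidean distance between paired components
    (an assumed choice); the result is the minimum cost over all
    permutations of the component labels.
    """
    if len(theta_a) != len(theta_b):
        raise ValueError("both parameters must have the same number of components")
    best = math.inf
    for perm in itertools.permutations(range(len(theta_b))):
        cost = max(math.dist(theta_a[k], theta_b[j]) for k, j in enumerate(perm))
        best = min(best, cost)
    return best


theta = [(0.3, -1.0, 1.0), (0.7, 2.0, 0.5)]
relabeled = [theta[1], theta[0]]                  # same mixture, labels swapped
shifted = [(0.7, 2.1, 0.5), (0.3, -1.0, 1.0)]     # one mean moved by 0.1, labels swapped

d_relabeled = min_matching_distance(theta, relabeled)
d_shifted = min_matching_distance(theta, shifted)
```

A relabeled copy of the same mixture is at distance zero, so the distance depends only on the permutation orbit. Enumerating all permutations costs $K!$ and is only sensible for small $K$.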

Abstract

We develop a constant-tracking likelihood theory for two nonregular models: the folded normal and finite Gaussian mixtures. For the folded normal, we prove boundary coercivity for the profiled likelihood, show that the profile path of the location parameter exists and is strictly decreasing by an implicit-function argument, and establish a unique profile maximizer in the scale parameter. Deterministic envelopes for the log-likelihood, the score, and the Hessian yield elementary uniform laws of large numbers with finite-sample bounds, avoiding covering numbers. Identification and Kullback-Leibler separation deliver consistency. A sixth-order expansion of the log hyperbolic cosine creates a quadratic-minus-quartic contrast around zero, leading to a nonstandard one-fourth-power rate for the location estimator at the kink and a standard square-root rate for the scale estimator, with a uniform remainder bound. For finite Gaussian mixtures with distinct components and positive weights, we give a short identifiability proof up to label permutations via Fourier and Vandermonde ideas, derive two-sided Gaussian envelopes and responsibility-based gradient bounds on compact sieves, and obtain almost-sure and high-probability uniform laws with explicit constants. Using a minimum-matching distance on permutation orbits, we prove Hausdorff consistency on fixed and growing sieves. We quantify variance-collapse spikes via an explicit spike-bonus bound and show that a quadratic penalty in location and log-scale dominates this bonus, making penalized likelihood coercive; when penalties shrink but sample size times penalty diverges, penalized estimators remain consistent. All proofs are constructive, track constants, verify measurability of maximizers, and provide practical guidance for tuning sieves, penalties, and EM-style optimization.
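The folded-normal log-likelihood discussed above can be written with an explicit log hyperbolic cosine term, which makes the sign symmetry in the location parameter visible. The following is a minimal sketch, assuming the standard folded-normal density $f(x;\mu,\sigma)=\frac{2}{\sigma\sqrt{2\pi}}\exp\!\big(-\frac{x^2+\mu^2}{2\sigma^2}\big)\cosh\!\big(\frac{\mu x}{\sigma^2}\big)$ for $x\ge 0$; the function names are hypothetical and not taken from the paper.

```python
import math


def log_cosh(t):
    """Numerically stable log(cosh(t)), using |t| + log1p(exp(-2|t|)) - log 2."""
    a = abs(t)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def folded_normal_loglik(xs, mu, sigma):
    """Log-likelihood of nonnegative observations xs under a folded normal.

    Uses log f(x) = log 2 - log sigma - 0.5*log(2*pi)
                    - (x^2 + mu^2) / (2 sigma^2) + log cosh(mu * x / sigma^2).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    s2 = sigma * sigma
    const = math.log(2.0) - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return sum(
        const - (x * x + mu * mu) / (2.0 * s2) + log_cosh(mu * x / s2)
        for x in xs
    )


sample = [0.2, 1.1, 0.7, 2.3, 0.05]
ll_plus = folded_normal_loglik(sample, 0.8, 1.3)
ll_minus = folded_normal_loglik(sample, -0.8, 1.3)   # same orbit {mu, -mu}
```

Because the likelihood depends on $\mu$ only through $\mu^2$ and $\cosh$, the values at $\mu$ and $-\mu$ coincide. Near zero, the series $\log\cosh t = t^2/2 - t^4/12 + t^6/45 - \cdots$ is the source of the quadratic-minus-quartic contrast mentioned in the abstract.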

Paper Structure

This paper contains 60 sections, 42 theorems, 187 equations.

Key Result

Lemma 2.1

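Let $(K,d)$ be compact and let $q:K\to\mathbb{R}$ be continuous. Define

$$Q_0 := \arg\max_{x\in K} q(x).$$

Then $Q_0$ is nonempty and compact. Fix $\varepsilon>0$. If $K\setminus B_\varepsilon(Q_0)=\varnothing$, the conclusion below is trivial. Otherwise define the gap

$$\Delta_\varepsilon := \max_{K} q \;-\; \max_{K\setminus B_\varepsilon(Q_0)} q \;>\; 0.$$

Let $q_n:K\to\mathbb{R}$ be upper semicontinuous (in particular, continuous), and set

$$Q_n := \arg\max_{x\in K} q_n(x).$$

If

$$\sup_{K}|q_n-q| < \tfrac{1}{2}\Delta_\varepsilon,$$

then

$$Q_n \subset B_\varepsilon(Q_0).$$

Consequently, if $\sup_K|q_n-q|\to0$, then $d_H^+(Q_n,Q_0)\to0$.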
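The localization statement can be checked numerically on a finite grid, which is itself a compact metric space. The sketch below is an illustration only: the objective $q(x)=-(x^2-1)^2$, the perturbation $\delta\sin(7x)$, and the sufficient condition "sup-norm perturbation below half the gap" are choices made here, not taken from the paper.

```python
import math


def argmax_points(grid, values):
    """All grid points at which `values` attains its maximum."""
    m = max(values)
    return [x for x, v in zip(grid, values) if v >= m]


def dist_to_set(x, points):
    """Distance from x to a finite set of points on the real line."""
    return min(abs(x - p) for p in points)


# Compact set K: a grid on [-2, 2]; q has the two maximizers -1 and +1.
grid = [-2.0 + 4.0 * i / 400 for i in range(401)]
q = [-(x * x - 1.0) ** 2 for x in grid]
Q0 = argmax_points(grid, q)

# Gap between the maximum of q and its maximum outside the eps-neighborhood of Q0.
eps = 0.1
outside = [v for x, v in zip(grid, q) if dist_to_set(x, Q0) >= eps]
gap = max(q) - max(outside)

# Perturbed objective with sup-norm error at most delta < gap / 2.
delta = 0.4 * gap
qn = [v + delta * math.sin(7.0 * x) for x, v in zip(grid, q)]
Qn = argmax_points(grid, qn)

# One-sided distances: d_plus controls Qn relative to Q0; d_minus is the reverse.
d_plus = max(dist_to_set(x, Q0) for x in Qn)
d_minus = max(dist_to_set(x, Qn) for x in Q0)
```

Here `d_plus` stays below `eps`, as the lemma predicts, while `d_minus` is close to 2 because the perturbation selects only the maximizer near $+1$. This is the phenomenon behind the remark that full Hausdorff convergence needs extra assumptions.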

Theorems & Definitions (101)

  • Lemma 2.1: Outer argmax localization on compact sets
  • proof
  • Remark 2.2: Why full Hausdorff convergence needs extra assumptions
  • Corollary 2.3: A sufficient condition for full Hausdorff convergence
  • proof
  • Remark 2.4: Orbit-level version
  • Remark 2.5: Differentiability on the Quotient
  • Lemma 3.1: Hyperbolic bounds
  • proof
  • Lemma 3.2: Deterministic lower bound for the folded-normal negative log-density
  • ...and 91 more