Modifications of the BIC for order selection in finite mixture models

Hien Duy Nguyen, TrungTin Nguyen

TL;DR

This work addresses order selection in finite mixture models by weakening the regularity requirements of the Bayesian information criterion (BIC) through two refinements: the ν-BIC and ε-BIC penalties. These penalties, $\mathrm{pen}^{\nu}_{k,n}=\alpha(k)\,n^{-1}\,\mathrm{Ln}^{\circ\nu}(n)\,\log n$ and $\mathrm{pen}^{\varepsilon}_{k,n}=\alpha(k)\,n^{-1}(\log n)^{1+\varepsilon}$, preserve near-BIC behavior while enabling consistency under milder conditions. A misspecification result shows that when the true density lies outside the candidate family, any vanishing-penalty information criterion (IC) converges to a Kullback–Leibler (KL) optimal order among candidates. The paper develops a Glivenko–Cantelli (GC) empirical-process framework with Lipschitz-enveloped component densities to prove the main consistency results, and demonstrates applicability to Gaussian, Laplace, Student-$t$, and regression mixtures, including a conditional-likelihood extension for mixtures of regression. Numerical experiments compare ε-BIC, ν-BIC, BIC, AIC, and PanIC, showing that ε-BIC and ν-BIC closely track BIC in practice while delivering theoretical guarantees in nonregular settings; the results also reveal a tension between order consistency and minimax Hellinger risk. Overall, the proposed penalties offer robust, theoretically justified tools for order selection in finite mixtures, with guidance for practical calibration and several avenues for future work.
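To make the two penalty formulas above concrete, the following Python sketch evaluates them next to the plain BIC-scale penalty $\alpha(k)\,n^{-1}\log n$. Several pieces are assumptions not stated in this summary: $\mathrm{Ln}^{\circ\nu}$ is read as the ν-fold composition of a truncated logarithm, taken here as $\mathrm{Ln}(x)=\max\{1,\log x\}$, and the value of $\alpha(k)$ is simply passed in by the caller. The function names are hypothetical.

```python
import math


def Ln(x):
    # Assumed truncated logarithm, max(1, log x); the paper's exact
    # definition of Ln is not given in this summary.
    return max(1.0, math.log(x))


def iterated_Ln(n, nu):
    # nu-fold composition Ln(Ln(...Ln(n)...)), under the assumption above.
    value = float(n)
    for _ in range(nu):
        value = Ln(value)
    return value


def pen_bic(alpha_k, n):
    # BIC-scale penalty on the averaged scale: alpha(k) * log(n) / n.
    return alpha_k * math.log(n) / n


def pen_nu(alpha_k, n, nu):
    # nu-BIC penalty: alpha(k) * Ln^{o nu}(n) * log(n) / n.
    return alpha_k * iterated_Ln(n, nu) * math.log(n) / n


def pen_eps(alpha_k, n, eps):
    # eps-BIC penalty: alpha(k) * (log n)^(1 + eps) / n.
    return alpha_k * math.log(n) ** (1.0 + eps) / n


# Inflation factors relative to the BIC-scale penalty at n = 1000.
n = 1000
ratio_nu = pen_nu(1.0, n, nu=2) / pen_bic(1.0, n)
ratio_eps = pen_eps(1.0, n, eps=0.01) / pen_bic(1.0, n)
```

Under these assumptions, the ratio of the ν-BIC penalty to the BIC-scale penalty is the iterated logarithm $\mathrm{Ln}^{\circ\nu}(n)$, and the ratio for ε-BIC is $(\log n)^{\varepsilon}$. Both grow very slowly in $n$, which is one way to read the claim that the modified penalties stay close to BIC in practice.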

Abstract

Finite mixture models are ubiquitous in modern statistical modeling, and a recurring practical issue is choosing the model order. In Keribin (2000, Sankhyā Series A, 62, pp. 49–66), the Bayesian information criterion (BIC) was proved consistent in mixtures, but under strong regularity, including high moments and high-order derivatives of the component density. We introduce the $ν$-BIC and $ε$-BIC, which weight the BIC penalty by negligibly small logarithmic factors immaterial in practice. This minor modification yields consistency under substantially weaker conditions, without differentiability and with mild moment assumptions, and we also give a misspecification result: when the truth lies outside the candidate family, any vanishing-penalty IC eventually selects a Kullback–Leibler optimal order among candidates. Finally, we clarify two limitations of consistent IC-based selection in mixtures: there is no universally minimal BIC-scale penalty within our sufficient conditions, and order consistency can conflict with minimax optimality in Hellinger risk. We illustrate the theory for Gaussian mixtures, non-differentiable Laplace mixtures, heavy-tailed $t$-mixtures, and mixtures of regression models.
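As an illustration of how a penalized criterion of this kind is typically used, the sketch below picks the order that minimizes the average negative log-likelihood plus an ε-BIC-style penalty $\alpha(k)\,n^{-1}(\log n)^{1+\varepsilon}$. It is a minimal sketch under stated assumptions, not the authors' implementation: the function `select_order` is hypothetical, the choice $\alpha(k)=(3k-1)/2$ (half the free-parameter count of a univariate $k$-component Gaussian mixture) is an assumption, and the risk values are made-up numbers used only to exercise the code.

```python
import math


def select_order(avg_neg_loglik, alpha, n, eps=0.0):
    """Return the order k minimising
    avg_neg_loglik[k] + alpha(k) * (log n)^(1 + eps) / n.

    avg_neg_loglik maps each candidate order k to the fitted average
    negative log-likelihood; eps = 0 gives the BIC-scale penalty.
    Ties are broken in favour of the smallest order.
    """
    def criterion(k):
        penalty = alpha(k) * math.log(n) ** (1.0 + eps) / n
        return avg_neg_loglik[k] + penalty

    return min(sorted(avg_neg_loglik), key=criterion)


def alpha(k):
    # Assumption: half the number of free parameters of a univariate
    # k-component Gaussian mixture (k means, k variances, k - 1 weights).
    return (3 * k - 1) / 2


# Made-up average negative log-likelihoods for orders 1..4 (not from the paper).
n = 500
toy_risk = {1: 1.620, 2: 1.431, 3: 1.428, 4: 1.427}

k_eps = select_order(toy_risk, alpha, n, eps=0.05)
k_bic = select_order(toy_risk, alpha, n, eps=0.0)
```

In this toy input the fit barely improves beyond two components, so the penalty term dominates for larger orders and both the BIC-scale and the ε-inflated criteria pick the same order.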

Paper Structure

This paper contains 16 sections, 19 theorems, 218 equations, 4 tables.

Key Result

Lemma 1

If $\mathbf{X}_{n}$ is IID and ${\cal F}\subset{\cal L}_{1}(P)$ is a class of measurable functions admitting an envelope $F\in{\cal L}_{1}(P)$ and $N_{[\,]}(\delta,{\cal F},{\cal L}_{1}(P))<\infty$ for every $\delta>0$, then ${\cal F}$ is a GC cl… $({\cal G},d)$ is compact and $g\mapsto f_{g}(x)$ is continuous for $P$-almos…

Theorems & Definitions (35)

  • Lemma 1
  • proof
  • Remark 2
  • Remark 3
  • Lemma 4
  • proof
  • Proposition 5
  • Lemma 6
  • Lemma 7
  • Theorem 8
  • ...and 25 more