Risk Bounds for Mixture Density Estimation on Compact Domains via the $h$-Lifted Kullback--Leibler Divergence

Mark Chiu Chong, Hien Duy Nguyen, TrungTin Nguyen

TL;DR

This work introduces the $h$-lifted KL divergence $\mathrm{KL}_h$ to generalize KL-based risk bounds for finite mixtures on compact domains, accommodating densities that may vanish. It defines the maximum $h$-lifted likelihood estimator ($h$-MLLE) and proves oracle-type risk bounds of the form $\mathbb{E}\{\mathrm{KL}_h(f\,||\,f_{k,n})\} - \mathrm{KL}_h(f\,||\,\mathcal{C}) \le c_1/(k+2) + c_2/\sqrt{n}$, where $k$ is the number of mixture components and $n$ is the sample size. The complexity-dependent term in the bound can be controlled via covering numbers, and under a Lipschitz condition the bound simplifies further. $\mathrm{KL}_h$ is shown to be a Bregman divergence that is bounded for continuous densities and related to $L_p$ distances, so the analysis does not require strict positivity. The authors give a Majorization-Maximization (MM) algorithm for computing $h$-MLLEs and report beta-mixture experiments that exhibit the predicted rates and the elbow phenomenon. Overall, the paper extends classical KL-based results for mixture density estimation to a broader class of densities while keeping the estimator computable.

Abstract

We consider the problem of estimating probability density functions based on sample data, using a finite mixture of densities from some component class. To this end, we introduce the $h$-lifted Kullback--Leibler (KL) divergence as a generalization of the standard KL divergence and a criterion for conducting risk minimization. Under a compact support assumption, we prove an $\mathcal{O}(1/\sqrt{n})$ bound on the expected estimation error when using the $h$-lifted KL divergence, which extends the results of Rakhlin et al. (2005, ESAIM: Probability and Statistics, Vol. 9) and Li and Barron (1999, Advances in Neural Information Processing Systems, Vol. 12) to permit the risk bounding of density functions that are not strictly positive. We develop a procedure for the computation of the corresponding maximum $h$-lifted likelihood estimators ($h$-MLLEs) using the Majorization-Maximization framework and provide experimental results in support of our theoretical bounds.
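
As a rough numerical illustration of why lifting helps, the sketch below assumes the divergence has the form $\mathrm{KL}_h(f\,||\,g) = \int (f+h)\log\{(f+h)/(g+h)\}\,\mathrm{d}\mu$ for a strictly positive density $h$. The formula is not spelled out in this summary, so treat it as an assumption. The target `f`, the candidate `g`, the uniform lift `h`, and the helper `lifted_kl` are hypothetical choices made for illustration and are not taken from the paper's experiments.

```python
import math
import numpy as np


def lifted_kl(f, g, h, m=20_000):
    """Midpoint-rule approximation of int_0^1 (f+h) log((f+h)/(g+h)) dx.

    Assumed form of the h-lifted KL divergence on the unit interval.
    """
    x = (np.arange(m) + 0.5) / m  # midpoints of m equal cells in (0, 1)
    fx, gx, hx = f(x), g(x), h(x)
    # Division by zero only occurs when h is identically zero (plain KL).
    with np.errstate(divide="ignore"):
        integrand = (fx + hx) * np.log((fx + hx) / (gx + hx))
    return float(np.mean(integrand))  # cell width 1/m times the sum


def f(x):
    """Hypothetical target: the Beta(2, 2) density 6x(1-x)."""
    return 6.0 * x * (1.0 - x)


def g(x):
    """Hypothetical candidate: a triangular density supported on [0, 0.5]."""
    return np.maximum(0.0, 4.0 - 16.0 * np.abs(x - 0.25))


def h(x):
    """Hypothetical lift: the uniform density on [0, 1]."""
    return np.ones_like(x)


def zero(x):
    """No lift, which recovers the ordinary KL integrand."""
    return np.zeros_like(x)


print("lifted KL:", lifted_kl(f, g, h))
print("plain KL :", lifted_kl(f, g, zero))
```

Here `g` vanishes on half of the domain while `f` does not, so the unlifted integral diverges, whereas the lifted integrand stays bounded because `g + h` never drops below the lower bound of `h`.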

Paper Structure

This paper contains 31 sections, 19 theorems, 138 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Proposition 2

Let $\mathcal{P}$ be defined as in (P_def). Then $\mathrm{KL}_h\left(f\,||\,g\right)$ is bounded for all continuous densities $f, g \in \mathcal{P}$.
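
One way to see why a bound of this kind is plausible is sketched below. It is not the paper's proof, and it rests on assumptions not stated in this summary: that $\mathrm{KL}_h(f\,||\,g) = \int_{\mathcal{X}} (f+h)\log\{(f+h)/(g+h)\}\,\mathrm{d}\mu$, that $h$ is a density with $0 < a \le h \le b$ on the compact domain $\mathcal{X}$, and that the continuous densities satisfy $f, g \le c$ on $\mathcal{X}$. The constants $a$, $b$, $c$ are introduced here for illustration only. Nonnegativity follows because $(f+h)/2$ and $(g+h)/2$ are themselves densities, so $\mathrm{KL}_h(f\,||\,g)$ is twice an ordinary KL divergence.

```latex
\begin{align*}
\frac{f(x)+h(x)}{g(x)+h(x)} &\le \frac{c+b}{a}
  \quad \text{for all } x \in \mathcal{X}, \\
0 \le \mathrm{KL}_h\left(f\,||\,g\right)
  &= \int_{\mathcal{X}} (f+h)\log\frac{f+h}{g+h}\,\mathrm{d}\mu
  \le \log\left(\frac{b+c}{a}\right)\int_{\mathcal{X}} (f+h)\,\mathrm{d}\mu
  = 2\log\left(\frac{b+c}{a}\right).
\end{align*}
```

The last equality uses that $f$ and $h$ each integrate to one. Without the lift ($h \equiv 0$) the ratio $f/g$ has no such uniform bound wherever $g$ vanishes, which is the situation the lifted divergence is designed to handle.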

Figures (2)

  • Figure 1: Simulation target densities $f_{1}$ (solid line) and $f_{2}$ (dashed line).
  • Figure 2: Average negative log $h$-lifted likelihood values by sample sizes $n$ and numbers of components $k$ for experiments E1 and E2.

Theorems & Definitions (32)

  • Definition 1: $h$-lifted KL divergence
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Remark 4
  • Theorem 5
  • Corollary 6
  • Remark 7
  • Lemma 8
  • ...and 22 more