On the Entropy of General Mixture Distributions

Namyoon Lee

TL;DR

This work addresses the problem of computing the differential entropy of general mixture distributions, which has no closed form in general because the mixture density places a sum inside a logarithm. By recasting a mixture as a memoryless channel with input $C$ (the latent label) and output $X$ (the observation), it obtains the exact decomposition $h(X)=h(X|C)+I(X;C)$ and the universal bounds $h(X|C)\le h(X)\le h(X|C)+H(C)$. It then provides a deterministic, closed-form approximation ${\hat{h}}(X)=h_{\sf L}(X)+{\bar{\Delta}}$ based on pairwise overlaps $z_{c,d}$, where the family-dependent offset $\bar{\Delta}$ is calibrated to be exact in the complete-overlap and zero-overlap regimes and the result is clipped to respect the universal bounds. The authors derive explicit closed-form ingredients for standard mixture families (Gaussian, factorized Laplacian, uniform, and hybrids) and validate the method numerically across separation, dimension, component count, and correlation, reporting accurate and computationally tractable entropy estimates.

Abstract

Mixture distributions are a workhorse model for multimodal data in information theory, signal processing, and machine learning. Yet even when each component density is simple, the differential entropy of the mixture is notoriously hard to compute because the mixture couples a logarithm with a sum. This paper develops a deterministic, closed-form toolkit for bounding and accurately approximating mixture entropy directly from component parameters. Our starting point is an information-theoretic channel viewpoint: the latent mixture label plays the role of an input, and the observation is the output. This viewpoint separates mixture entropy into an average within-component uncertainty plus an overlap term that quantifies how much the observation reveals about the hidden label. We then bound and approximate this overlap term using pairwise overlap integrals between component densities, yielding explicit expressions whenever these overlaps admit a closed form. A simple, family-dependent offset corrects the systematic bias of the Jensen overlap bound and is calibrated to be exact in the two limiting regimes of complete overlap and near-perfect separation. A final clipping step guarantees that the estimate always respects universal information-theoretic bounds. Closed-form specializations are provided for Gaussian, factorized Laplacian, uniform, and hybrid mixtures, and numerical experiments validate the resulting bounds and approximations across separation, dimension, number of components, and correlated covariances.
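The closed-form ingredients described above can be sketched for a Gaussian mixture. The code below is only an illustration under stated assumptions: the function names are hypothetical, the overlap $z_{c,d}$ is taken to be the integral of the product of two component densities, and the Jensen quantity is the standard bound $-\sum_c \pi_c \log \sum_d \pi_d z_{c,d}$ with mixture weights $\pi_c$, which is assumed (not confirmed by this page) to match the paper's overlap bound. The offset $\bar{\Delta}$ and the clipping step are not implemented, because their formulas are not given here. All values are in bits.

```python
import numpy as np

def gaussian_entropy_bits(cov):
    """Differential entropy of N(mu, cov) in bits (independent of mu)."""
    n = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (n * np.log(2 * np.pi * np.e) + logdet) / np.log(2)

def gaussian_overlap(mu_c, cov_c, mu_d, cov_d):
    """Overlap integral of two Gaussian densities.

    Uses the identity: integral of f_c * f_d = N(mu_c - mu_d; 0, cov_c + cov_d).
    """
    n = mu_c.shape[0]
    s = cov_c + cov_d
    diff = mu_c - mu_d
    _, logdet = np.linalg.slogdet(s)
    quad = diff @ np.linalg.solve(s, diff)
    return np.exp(-0.5 * (n * np.log(2 * np.pi) + logdet + quad))

def mixture_entropy_bounds(weights, means, covs):
    """Return (h(X|C), h(X|C) + H(C), Jensen overlap lower bound), all in bits.

    Assumes strictly positive weights that sum to one.
    """
    w = np.asarray(weights, dtype=float)
    K = len(w)
    # Average within-component entropy h(X|C).
    h_cond = sum(w[c] * gaussian_entropy_bits(covs[c]) for c in range(K))
    # Label entropy H(C).
    H_label = -np.sum(w * np.log2(w))
    # Pairwise overlap matrix z[c, d].
    z = np.array([[gaussian_overlap(means[c], covs[c], means[d], covs[d])
                   for d in range(K)] for c in range(K)])
    # Jensen bound: -sum_c w_c log2( sum_d w_d z[c, d] ).
    h_jensen = -np.sum(w * np.log2(z @ w))
    return h_cond, h_cond + H_label, h_jensen

if __name__ == "__main__":
    w = [0.5, 0.5]
    covs = [np.eye(2), np.eye(2)]
    for sep in (0.0, 2.0, 50.0):
        means = [np.zeros(2), np.full(2, sep)]
        print(sep, mixture_entropy_bounds(w, means, covs))
```

In this sketch, when the two components coincide the Jensen quantity falls below $h(X|C)$ by $(n/2)\log_2(e/2)$ bits, which illustrates the kind of systematic bias that a calibrated offset is meant to correct.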

Paper Structure

This paper contains 25 sections, 7 theorems, 87 equations, 5 figures, and 1 table.

Key Result

Lemma 1

Let $X\in\mathbb{R}^n$ have a mixture density with latent label $C$. Then the differential entropy decomposes as $h(X)=h(X|C)+I(X;C)$, where $I(X;C)$ is the mutual information (in bits) between the continuous $X$ and the discrete $C$.
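A short worked derivation may help connect Lemma 1 to the universal bounds quoted in the TL;DR. It uses only the standard definitions of mutual information and may differ from the paper's own proof; the notation $\Pr(C=c)$ for the label probabilities is an assumption made here.

$$
I(X;C) = h(X) - h(X|C)
\quad\Longrightarrow\quad
h(X) = h(X|C) + I(X;C),
\qquad
h(X|C) = \sum_{c} \Pr(C=c)\, h(X \mid C=c).
$$

$$
I(X;C) = H(C) - H(C|X), \qquad 0 \le H(C|X) \le H(C)
\quad\Longrightarrow\quad
0 \le I(X;C) \le H(C).
$$

Substituting the second line into the first gives $h(X|C)\le h(X)\le h(X|C)+H(C)$. The lower bound is attained when $X$ reveals nothing about $C$ (complete overlap), and the upper bound when $C$ is fully determined by $X$ (zero overlap).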

Figures (5)

  • Figure 1: Comparison of $h(X)$ for various mixture distributions when $(n,K)=(2,8)$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.
  • Figure 2: Comparison of $h(X)$ for various mixture distributions when $(n,K)=(8,8)$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.
  • Figure 3: Comparison of $h(X)$ for various mixture distributions when $(n,K)=(32,32)$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.
  • Figure 4: Correlated Gaussian mixtures when $(n,K)=(2,2)$ and correlation $\rho\in\{0,0.2,0.4,0.6,0.8,0.9\}$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.
  • Figure 5: Correlated Gaussian mixtures when $(n,K)=(8,8)$ and correlation $\rho\in\{0,0.2,0.4,0.6,0.8,0.9\}$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.
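The figures use an "MC entropy" curve as the reference. Assuming this denotes a plain Monte Carlo estimate of $-\mathbb{E}[\log f(X)]$ under the mixture, a minimal sketch for Gaussian mixtures could look as follows. The function names, sample size, and seed are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def log_gaussian_pdf(x, mu, cov):
    """Natural-log density of N(mu, cov) evaluated at each row of x."""
    n = mu.shape[0]
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)
    # Quadratic form diff^T cov^{-1} diff for every sample.
    quad = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

def mc_mixture_entropy_bits(weights, means, covs, num_samples=20000, seed=0):
    """Monte Carlo estimate of the mixture entropy h(X) in bits.

    Assumes strictly positive weights that sum to one.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    K = len(w)
    n = means[0].shape[0]
    # Draw the latent label C, then X given C (the channel viewpoint).
    labels = rng.choice(K, size=num_samples, p=w)
    x = np.empty((num_samples, n))
    for c in range(K):
        idx = labels == c
        x[idx] = rng.multivariate_normal(means[c], covs[c], size=int(idx.sum()))
    # log f(x) = logsumexp_c [ log w_c + log f_c(x) ], computed stably.
    log_terms = np.stack(
        [np.log(w[c]) + log_gaussian_pdf(x, means[c], covs[c]) for c in range(K)],
        axis=1,
    )
    m = log_terms.max(axis=1, keepdims=True)
    log_mix = m[:, 0] + np.log(np.exp(log_terms - m).sum(axis=1))
    # Entropy is the negative mean log-density; convert nats to bits.
    return -log_mix.mean() / np.log(2)

if __name__ == "__main__":
    w = [0.5, 0.5]
    covs = [np.eye(1), np.eye(1)]
    for sep in (0.0, 2.0, 20.0):
        means = [np.zeros(1), np.full(1, sep)]
        print(sep, mc_mixture_entropy_bits(w, means, covs))
```

Such an estimate is stochastic and its cost grows with the number of samples and components, which is the contrast the deterministic bounds and approximation in the figures are drawn against.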

Theorems & Definitions (15)

  • Lemma 1: Entropy decomposition
  • proof
  • Theorem 1: Universal label-sandwich bounds
  • proof
  • Proposition 1: Tightness conditions for the label-sandwich bounds
  • proof
  • Remark 1: Non-degenerate mixtures
  • Lemma 2: Jensen/overlap lower bound
  • proof
  • Lemma 3: Rényi-2 (collision) entropy of a density
  • ...and 5 more