On the Entropy of General Mixture Distributions

Namyoon Lee

TL;DR

This work addresses the challenge of computing the differential entropy of general mixture distributions, which is generally intractable in closed form because the mixture density places a sum inside a logarithm. By recasting a mixture as a memoryless channel with input $C$ (the latent label) and output $X$ (the observation), it obtains the exact decomposition $h(X)=h(X|C)+I(X;C)$ and the universal bounds $h(X|C)\le h(X)\le h(X|C)+H(C)$. It then provides a deterministic, closed-form approximation ${\hat{h}}(X)=h_{\sf L}(X)+{\bar{\Delta}}$ built from pairwise overlaps $z_{c,d}$, where the family-dependent offset $\bar{\Delta}$ is calibrated to be exact in the complete-overlap and zero-overlap regimes and the result is clipped to respect the bounds. The authors derive explicit closed-form ingredients for time-honored mixture families (Gaussian, factorized Laplacian, uniform, and hybrids) and validate the method numerically across separation, dimension, component count, and correlation, reporting accurate and computationally tractable entropy estimates for practical applications.

Abstract

Mixture distributions are a workhorse model for multimodal data in information theory, signal processing, and machine learning. Yet even when each component density is simple, the differential entropy of the mixture is notoriously hard to compute because the mixture couples a logarithm with a sum. This paper develops a deterministic, closed-form toolkit for bounding and accurately approximating mixture entropy directly from component parameters. Our starting point is an information-theoretic channel viewpoint: the latent mixture label plays the role of an input, and the observation is the output. This viewpoint separates mixture entropy into an average within-component uncertainty plus an overlap term that quantifies how much the observation reveals about the hidden label. We then bound and approximate this overlap term using pairwise overlap integrals between component densities, yielding explicit expressions whenever these overlaps admit a closed form. A simple, family-dependent offset corrects the systematic bias of the Jensen overlap bound and is calibrated to be exact in the two limiting regimes of complete overlap and near-perfect separation. A final clipping step guarantees that the estimate always respects universal information-theoretic bounds. Closed-form specializations are provided for Gaussian, factorized Laplacian, uniform, and hybrid mixtures, and numerical experiments validate the resulting bounds and approximations across separation, dimension, number of components, and correlated covariances.
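As a rough illustration of the quantities named above, the following sketch evaluates them for a one-dimensional Gaussian mixture. It is not the paper's implementation: the function names, the grid-based reference entropy, and the choice of bits as the unit are assumptions made for this example. The Jensen overlap bound is written in its standard form, $-\sum_c \pi_c \log_2 \sum_d \pi_d z_{c,d}$, which may differ in detail from the paper's $h_{\sf L}(X)$, and the calibrated offset $\bar{\Delta}$ is not reproduced here.

```python
import numpy as np


def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def mixture_entropy_terms(weights, means, sigmas, grid_points=200_001):
    """Entropy-related quantities (in bits) for a 1-D Gaussian mixture.

    Hypothetical helper for illustration; weights must be strictly positive
    and sum to one.
    """
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    sd = np.asarray(sigmas, dtype=float)

    # h(X|C): weighted average of the component entropies 0.5*log2(2*pi*e*sigma^2).
    h_cond = float(np.sum(w * 0.5 * np.log2(2.0 * np.pi * np.e * sd ** 2)))

    # H(C): Shannon entropy of the discrete label.
    h_label = float(-np.sum(w * np.log2(w)))

    # Pairwise overlaps z[c, d] = integral of f_c * f_d, which for Gaussians
    # equals the N(0, sd_c^2 + sd_d^2) density evaluated at mu_c - mu_d.
    var_sum = sd[:, None] ** 2 + sd[None, :] ** 2
    diff = mu[:, None] - mu[None, :]
    z = np.exp(-0.5 * diff ** 2 / var_sum) / np.sqrt(2.0 * np.pi * var_sum)

    # Jensen overlap lower bound on h(X) (standard form, assumed here).
    h_jensen = float(-np.sum(w * np.log2(z @ w)))

    # Reference value of h(X) from a Riemann sum on a wide, fine grid.
    x = np.linspace(np.min(mu - 12.0 * sd), np.max(mu + 12.0 * sd), grid_points)
    f = sum(wc * gaussian_pdf(x, m, s) for wc, m, s in zip(w, mu, sd))
    safe_f = np.where(f > 0.0, f, 1.0)  # avoids log2(0); those terms contribute 0
    h_ref = float(np.sum(-f * np.log2(safe_f)) * (x[1] - x[0]))

    return {
        "h_cond": h_cond,
        "H_label": h_label,
        "h_jensen": h_jensen,
        "h_ref": h_ref,
    }


terms = mixture_entropy_terms([0.5, 0.5], [-3.0, 3.0], [1.0, 1.0])
print(terms)
```

In this toy setting the reference value should fall between `h_cond` and `h_cond + H_label`, with `h_jensen` lying below it. The grid integration only stands in for the Monte Carlo reference used in the paper's figures and does not scale to high dimensions.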

Paper Structure (25 sections, 7 theorems, 87 equations, 5 figures, 1 table)

Introduction
Related work
Contributions
Organization
Universal Bounds via Decomposition
Entropy Decomposition
Analysis of the Gap
Universal Approximation
Useful Lemmas
A Tight Approximation in Closed Form
Implications of Theorem \ref{thm:tight_approx}
Examples for Time-Honored Mixture Families
Example 1: Gaussian mixture
Example 2: Factorized Laplacian mixture
Example 3: Uniform mixture on measurable sets
...and 10 more sections

Key Result

Lemma 1

Let $X\in\mathbb{R}^n$ have a mixture density with latent label $C$. Then the differential entropy decomposes as $$h(X)=h(X|C)+I(X;C),$$ where $I(X;C)$ is the mutual information (in bits) between continuous $X$ and discrete $C$.
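A short sketch of why the decomposition holds, and how it leads to the sandwich bounds quoted in the TL;DR, may help. The symbols $\pi_c$ (mixing weights), $f_c$ (component densities), and $f$ (mixture density) are assumed notation for this sketch and do not appear in the excerpt above.

```latex
\begin{align}
f(x) &= \sum_{c} \pi_c f_c(x), \qquad
h(X) = -\int f(x)\log f(x)\,dx, \\
h(X|C) &= \sum_{c} \pi_c\, h(X \mid C=c)
        = -\sum_{c} \pi_c \int f_c(x)\log f_c(x)\,dx, \\
I(X;C) &= h(X) - h(X|C)
\;\;\Longrightarrow\;\;
h(X) = h(X|C) + I(X;C), \\
I(X;C) &= H(C) - H(C|X), \quad 0 \le H(C|X) \le H(C)
\;\;\Longrightarrow\;\;
h(X|C) \le h(X) \le h(X|C) + H(C).
\end{align}
```

The lower bound is met when $X$ carries no information about $C$ (complete overlap), and the upper bound when $C$ is recoverable from $X$ (no overlap), which matches the two regimes the approximation is calibrated against.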

Figures (5)

Figure 1: Comparison of $h(X)$ for various mixture distributions when $(n,K)=(2,8)$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.
Figure 2: Comparison of $h(X)$ for various mixture distributions when $(n,K)=(8,8)$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.
Figure 3: Comparison of $h(X)$ for various mixture distributions when $(n,K)=(32,32)$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.
Figure 4: Correlated Gaussian mixtures when $(n,K)=(2,2)$ and correlation $\rho\in\{0,0.2,0.4,0.6,0.8,0.9\}$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.
Figure 5: Correlated Gaussian mixtures when $(n,K)=(8,8)$ and correlation $\rho\in\{0,0.2,0.4,0.6,0.8,0.9\}$: MC entropy, sandwich bounds, and clipped approximation versus mean separation.

Theorems & Definitions (15)

Lemma 1: Entropy decomposition
proof
Theorem 1: Universal label-sandwich bounds
proof
Proposition 1: Tightness conditions for the label-sandwich bounds
proof
Remark 1: Non-degenerate mixtures
Lemma 2: Jensen/overlap lower bound
proof
Lemma 3: Rényi--2 (collision) entropy of a density
...and 5 more
