On the Properties and Estimation of Pointwise Mutual Information Profiles
Paweł Czyż, Frederic Grabowski, Julia E. Vogt, Niko Beerenwinkel, Alexander Marx
TL;DR
This work introduces the pointwise mutual information (PMI) profile, the distribution of $\mathrm{PMI}(X,Y)$, whose mean equals the mutual information $\mathbf{I}(X;Y)$, and proves that the profile is invariant to reparametrizations. It derives an analytic PMI profile for multivariate normal distributions and, to overcome limitations of existing benchmarks, defines Bend and Mix Models (BMMs), which combine bending (applying diffeomorphisms) with mixing (forming mixtures) and admit unbiased Monte Carlo estimation of both the PMI profile and $\mathbf{I}(X;Y)$. The authors use BMMs to construct expressive benchmarks, analyze estimator robustness to inliers and outliers, and evaluate the neural critics used in variational MI estimators. BMMs also enable model-based Bayesian MI estimation with uncertainty quantification, which suits problems where domain knowledge is available, and the results offer practical guidance on benchmarking, estimator selection, and the reliability of MI estimates.
Abstract
The pointwise mutual information profile, or simply profile, is the distribution of pointwise mutual information for a given pair of random variables. One of its important properties is that its expected value is precisely the mutual information between these random variables. In this paper, we analytically describe the profiles of multivariate normal distributions and introduce a novel family of distributions, Bend and Mix Models, for which the profile can be accurately estimated using Monte Carlo methods. We then show how Bend and Mix Models can be used to study the limitations of existing mutual information estimators, investigate the behavior of neural critics used in variational estimators, and understand the effect of experimental outliers on mutual information estimation. Finally, we show how Bend and Mix Models can be used to obtain model-based Bayesian estimates of mutual information, suitable for problems with available domain expertise in which uncertainty quantification is necessary.
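To make the relationship between the profile and the mutual information concrete, the following is a minimal sketch for the simplest multivariate normal case: a bivariate normal with standard normal marginals and correlation `rho`. It is an illustration only, not the authors' implementation; the function names `pmi_bivariate_normal` and `sample_profile`, the choice `rho = 0.8`, and the sample size are all assumptions made for this example. The sketch draws samples from the joint distribution, evaluates the pointwise mutual information $\log p(x,y) - \log p(x) - \log p(y)$ at each sample, and compares the sample mean with the closed-form mutual information $-\tfrac{1}{2}\log(1-\rho^2)$.

```python
import numpy as np


def pmi_bivariate_normal(x, y, rho):
    """PMI log p(x, y) - log p(x) - log p(y) for a bivariate normal
    with standard normal marginals and correlation rho (|rho| < 1)."""
    one_minus_rho2 = 1.0 - rho**2
    # The log(2*pi) terms of the joint and marginal log-densities cancel.
    log_joint = -0.5 * np.log(one_minus_rho2) - (
        x**2 - 2.0 * rho * x * y + y**2
    ) / (2.0 * one_minus_rho2)
    log_marginals = -0.5 * (x**2 + y**2)
    return log_joint - log_marginals


def sample_profile(rho, n_samples, seed=0):
    """Draw samples from the PMI profile by sampling (X, Y) jointly
    and evaluating the PMI at each sampled point."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    xy = rng.multivariate_normal(np.zeros(2), cov, size=n_samples)
    return pmi_bivariate_normal(xy[:, 0], xy[:, 1], rho)


rho = 0.8  # assumed correlation, chosen only for illustration
profile = sample_profile(rho, n_samples=200_000)

mi_monte_carlo = profile.mean()            # mean of the profile
mi_exact = -0.5 * np.log(1.0 - rho**2)     # closed-form MI for this case

print(f"Monte Carlo MI estimate: {mi_monte_carlo:.4f}")
print(f"Closed-form MI:          {mi_exact:.4f}")
```

The array `profile` holds samples from the profile itself, so a histogram of it shows the whole distribution rather than only its mean. The same sample-and-average pattern applies to any model whose joint and marginal densities can be evaluated pointwise.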
