MMG: Mutual Information Estimation via the MMSE Gap in Diffusion
Longxuan Yu, Xing Shi, Xianghao Kong, Tong Jia, Greg Ver Steeg
TL;DR
MMG estimates mutual information from the integrated MMSE gap between conditional and unconditional denoisers in a diffusion model: I(x;y) equals half the area under the MMSE gap across SNRs. It uses adaptive importance sampling to concentrate on informative SNR ranges and an orthogonal principle to stabilize the MI integrand, and it reports state-of-the-art performance on MI benchmarks while remaining reliable in high-MI regimes. The approach relies on denoising objectives rather than gradient-based score estimation, and it is released as a unified PyTorch library to support side-by-side comparisons. The result is a scalable, diffusion-based MI estimator for measuring relationships in complex systems.
Abstract
Mutual information (MI) is one of the most general ways to measure relationships between random variables, but estimating this quantity for complex systems is challenging. Denoising diffusion models have recently set a new bar for density estimation, so it is natural to consider whether these methods could also be used to improve MI estimation. Using the recently introduced information-theoretic formulation of denoising diffusion models, we show that diffusion models can be used in a straightforward way to estimate MI. In particular, the MI corresponds to half the gap in the minimum mean square error (MMSE) between conditional and unconditional diffusion, integrated over all signal-to-noise ratios (SNRs) in the noising process. Our approach not only passes self-consistency tests but also outperforms traditional and score-based diffusion MI estimators. Furthermore, our method leverages adaptive importance sampling to achieve scalable MI estimation, while maintaining strong performance even when the MI is high.
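The central identity can be checked on a toy case where both MMSE curves are known in closed form. The sketch below is an illustration, not the MMG library: it assumes a jointly Gaussian pair (x, y) with unit variances and correlation rho, and a noising channel z = sqrt(gamma) * x + eps with eps ~ N(0, 1). Under those assumptions the unconditional MMSE is 1 / (1 + gamma), the conditional MMSE is v / (1 + gamma * v) with v = 1 - rho^2, and half the integrated gap should recover the known Gaussian MI, -0.5 * log(1 - rho^2) nats. All function names are hypothetical. The importance-sampling variant uses a fixed logistic proposal over log-SNR, chosen here for simplicity; it is not the adaptive scheme the abstract refers to.

```python
import numpy as np


def mmse_gap(gamma, rho):
    """Unconditional minus conditional MMSE for the assumed Gaussian toy pair.

    Algebraically equal to 1/(1+gamma) - v/(1+gamma*v) with v = 1 - rho**2,
    written as a single fraction to avoid cancellation at large gamma.
    """
    v = 1.0 - rho**2  # Var(x | y) for unit-variance jointly Gaussian x, y
    return rho**2 / ((1.0 + gamma) * (1.0 + gamma * v))


def mi_quadrature(rho, log_snr_min=-20.0, log_snr_max=20.0, n=40001):
    """Half the area under the MMSE gap, integrated on a log-SNR grid."""
    alpha = np.linspace(log_snr_min, log_snr_max, n)  # alpha = log(gamma)
    gamma = np.exp(alpha)
    f = mmse_gap(gamma, rho) * gamma  # change of variables: d gamma = gamma d alpha
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(alpha))  # trapezoid rule
    return 0.5 * integral


def mi_importance_sampling(rho, n=200_000, loc=0.0, scale=2.0, seed=0):
    """Monte Carlo estimate of the same integral with a logistic proposal.

    The fixed logistic proposal over log-SNR is an assumption for this sketch.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.logistic(loc, scale, size=n)
    z = (alpha - loc) / scale
    pdf = 1.0 / (4.0 * scale * np.cosh(z / 2.0) ** 2)  # logistic density
    gamma = np.exp(alpha)
    weights = mmse_gap(gamma, rho) * gamma / pdf  # integrand / proposal density
    return 0.5 * weights.mean()


def mi_exact(rho):
    """Closed-form MI (in nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * np.log(1.0 - rho**2)


if __name__ == "__main__":
    for rho in (0.5, 0.9, 0.99):
        print(rho, mi_exact(rho), mi_quadrature(rho), mi_importance_sampling(rho))
```

In a learned setting the two closed-form MMSE curves would be replaced by the mean squared errors of trained unconditional and conditional denoisers, which is where the choice of SNR sampling distribution starts to matter for variance.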
