Accurate Estimation of Mutual Information in High Dimensional Data
Eslam Abdelaleem, K. Michael Martini, Ilya Nemenman
TL;DR
This work tackles the challenge of accurately estimating the mutual information $I(X;Y)$ in high-dimensional, undersampled settings. It proposes a practical estimation protocol with explicit consistency checks, confidence intervals, and a generalized critic family including probabilistic VSIB variants, enabling reliable inference even when data exhibit complex nonlinear dependencies. The authors show that reliable MI estimation is achievable when the dependence is captured in a low-dimensional latent space, the critic is expressive enough, and the dataset sufficiently samples that latent structure, aided by max-test stopping and subsampling-extrapolation techniques. Across synthetic benchmarks and a real-world MNIST-based case, the methodology matches or surpasses existing estimators while providing quantified uncertainty, thereby broadening the practical applicability of neural MI estimators in scientific research.
Abstract
Mutual information (MI) is a fundamental measure of statistical dependence between two variables, yet accurate estimation from finite data remains notoriously difficult. No estimator is universally reliable, and common approaches fail in the high-dimensional, undersampled regimes typical of modern experiments. Recent machine learning-based estimators show promise, but their accuracy depends sensitively on dataset size, structure, and hyperparameters, with no accepted tests to detect failures. We close these gaps through a systematic evaluation of classical and neural MI estimators across standard benchmarks and new synthetic datasets tailored to challenging high-dimensional, undersampled regimes. We contribute: (i) a practical protocol for reliable MI estimation with explicit checks for statistical consistency; (ii) confidence intervals (error bars around estimates) that existing neural MI estimators do not provide; and (iii) a new class of probabilistic critics designed for high-dimensional, high-information settings. We demonstrate the effectiveness of our protocol with computational experiments, showing that it consistently matches or surpasses existing methods while uniquely quantifying its own reliability. We show that reliable MI estimation is sometimes achievable even in severely undersampled, high-dimensional datasets, provided they admit accurate low-dimensional representations. This broadens the scope of applicability of neural MI estimators and clarifies when such estimators can be trusted.
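For reference, the quantity being estimated can be written in its standard textbook form. The density notation $p(x,y)$, $p(x)$, $p(y)$ is assumed here for illustration and is not taken from the text above; for discrete variables the integral becomes a sum.

$$
I(X;Y) = \iint p(x,y)\,\ln\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy \;\ge\; 0,
$$

with equality if and only if $X$ and $Y$ are independent. As a simple sanity case, two jointly Gaussian scalar variables with correlation coefficient $\rho$ have $I(X;Y) = -\tfrac{1}{2}\ln\left(1-\rho^{2}\right)$ nats, which diverges as $|\rho| \to 1$.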
