Mutual information and task-relevant latent dimensionality
Paarth Gulati, Eslam Abdelaleem, Audrey Sederberg, Ilya Nemenman
TL;DR
This work tackles the challenge of identifying the task-relevant latent dimensionality required to predict outcomes from high-dimensional observations. It casts the problem as a symmetric information bottleneck and shows that conventional neural MI estimators with separable or bilinear critics inflate the inferred dimension. To address this, the authors introduce a hybrid critic that preserves an explicit bottleneck while allowing nonlinear cross-view interactions, which makes reliable, one-shot estimation of the effective latent dimensionality possible via a cross-covariance participation ratio. The method remains robust to observation noise and extends to intrinsic dimensionality by view splitting, and it is validated on synthetic benchmarks and on physics datasets such as the 2D Ising model and pendulum dynamics. Overall, the approach provides a practical, data-efficient tool for uncovering meaningful latent structure in noisy scientific data and offers new insight into the geometry of task-relevant representations.
Abstract
Estimating the dimensionality of the latent representation needed for prediction -- the task-relevant dimension -- is a difficult, largely unsolved problem with broad scientific applications. We cast it as an Information Bottleneck question: what embedding bottleneck dimension is sufficient to compress predictor and predicted views while preserving their mutual information (MI)? This repurposes neural MI estimators for dimensionality estimation. We show that standard neural estimators with separable/bilinear critics systematically inflate the inferred dimension, and we address this by introducing a hybrid critic that retains an explicit dimensional bottleneck while allowing flexible nonlinear cross-view interactions, thereby preserving the latent geometry. We further propose a one-shot protocol that reads off the effective dimension from a single over-parameterized hybrid model, without sweeping over bottleneck sizes. We validate the approach on synthetic problems with known task-relevant dimension. We extend the approach to intrinsic dimensionality by constructing paired views of a single dataset, enabling comparison with classical geometric dimension estimators. In noisy regimes where those estimators degrade, our approach remains reliable. Finally, we demonstrate the utility of the method on multiple physics datasets.
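
To make the one-shot readout concrete, the sketch below computes a participation ratio from the cross-covariance of two embedded views. It is an illustration under stated assumptions, not the authors' implementation: the function name `cross_covariance_participation_ratio`, the choice to apply the ratio (sum s)^2 / sum(s^2) to the singular values s of the cross-covariance matrix, and the toy embeddings are all hypothetical. In the paper the embeddings would come from a trained over-parameterized hybrid critic, which is not reproduced here.

```python
import numpy as np


def cross_covariance_participation_ratio(zx, zy):
    """Participation ratio of the cross-covariance between two embeddings.

    zx, zy: arrays of shape (n_samples, k), one row per paired sample.
    Assumption: the ratio is taken over the singular values of the
    empirical cross-covariance matrix; the paper may define it differently.
    """
    zx = zx - zx.mean(axis=0)            # center each embedding dimension
    zy = zy - zy.mean(axis=0)
    cross_cov = zx.T @ zy / (zx.shape[0] - 1)
    s = np.linalg.svd(cross_cov, compute_uv=False)
    return s.sum() ** 2 / (s ** 2).sum()


# Toy stand-in for learned embeddings: k = 8 embedding dimensions,
# of which only d = 3 are shared between the two views.
rng = np.random.default_rng(0)
n, k, d = 20000, 8, 3
shared = rng.normal(size=(n, d))
zx = np.hstack([shared, 0.1 * rng.normal(size=(n, k - d))])
zy = np.hstack([shared, 0.1 * rng.normal(size=(n, k - d))])

pr = cross_covariance_participation_ratio(zx, zy)
print(f"estimated effective dimension: {pr:.2f}")
```

In this toy setting the shared directions dominate the cross-covariance spectrum, so the ratio lands close to the number of shared latents even though the embedding is wider than needed. That is the sense in which a single over-parameterized model can be read off without sweeping over bottleneck sizes.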
