Latent Representation and Simulation of Markov Processes via Time-Lagged Information Bottleneck
Marco Federici, Patrick Forré, Ryota Tomioka, Bastiaan S. Veeling
TL;DR
The paper addresses the high cost of long-horizon Markov process simulation by learning latent representations that preserve the dynamics at a chosen time lag $\tau$, using the Time-lagged Information Bottleneck (T-IB) framework. It formalizes Latent Simulation (LS), which unfolds trajectories in a latent space using an encoder and a variational transition model, and introduces an autoinformation-based notion of sufficiency that guarantees the dynamics are preserved across timescales. By combining a non-linear, contrastive TI-Max objective with a bottleneck term, the method yields information-optimal representations that retain slow, relevant dynamics while discarding fast fluctuations, so that latent simulation is both accurate and much cheaper than simulating the original process. Experiments on synthetic slow-fast dynamics and molecular systems show that T-IB outperforms linear and unregularized non-linear approaches in both representation quality and the statistics of unfolded trajectories, with substantial speedups over direct molecular dynamics.
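The Latent Simulation idea described above (encode once, then repeatedly apply a transition model in latent space) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `latent_simulation`, `toy_encode`, and `toy_transition` are hypothetical, and the toy encoder and transition are hand-written stand-ins for the learned models, not the authors' implementation.

```python
import numpy as np

def latent_simulation(x0, encode, transition, n_steps, rng):
    """Unfold a trajectory in latent space: encode the start state once,
    then apply the latent transition n_steps times (one lag-tau jump each)."""
    z = encode(x0, rng)                # sample z_0 given x_0
    trajectory = [z]
    for _ in range(n_steps):
        z = transition(z, rng)         # sample the next latent state given the current one
        trajectory.append(z)
    return np.stack(trajectory)        # shape: (n_steps + 1, latent_dim)

# Hypothetical toy components standing in for learned models.
def toy_encode(x, rng):
    # Keep only the first coordinate (the assumed "slow" feature), plus small noise.
    return x[:1] + 0.01 * rng.standard_normal(1)

def toy_transition(z, rng):
    # A simple autoregressive Gaussian step in latent space.
    return 0.9 * z + 0.1 * rng.standard_normal(z.shape)

rng = np.random.default_rng(0)
traj = latent_simulation(np.array([1.0, -3.0]), toy_encode, toy_transition,
                         n_steps=5, rng=rng)
print(traj.shape)  # (6, 1)
```

The potential saving comes from the loop: each iteration covers a full lag $\tau$ in a low-dimensional space, instead of many short integration steps in the original state space.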
Abstract
Markov processes are widely used mathematical models for describing dynamic systems in various fields. However, accurately simulating large-scale systems at long time scales is computationally expensive due to the short time steps required for accurate integration. In this paper, we introduce an inference process that maps complex systems into a simplified representational space and models large jumps in time. To achieve this, we propose Time-lagged Information Bottleneck (T-IB), a principled objective rooted in information theory, which aims to capture relevant temporal features while discarding high-frequency information to simplify the simulation task and minimize the inference error. Our experiments demonstrate that T-IB learns information-optimal representations for accurately modeling the statistical properties and dynamics of the original process at a selected time lag, outperforming existing time-lagged dimensionality reduction methods.
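To make the shape of such an objective concrete, the sketch below combines a contrastive (InfoNCE) lower bound on the mutual information between time-lagged representations with a KL penalty weighted by a coefficient `beta`. This is an illustration under stated assumptions rather than the paper's exact loss: the dot-product critic, the diagonal-Gaussian encoder and transition, the use of a KL term as the bottleneck regularizer, and all function names are hypothetical choices.

```python
import numpy as np

def info_nce(z_past, z_future):
    """InfoNCE lower bound on I(z_past; z_future) with a dot-product critic.
    Row i of z_past is paired with row i of z_future; other rows act as negatives."""
    scores = z_past @ z_future.T                          # (B, B), true pairs on the diagonal
    scores = scores - scores.max(axis=1, keepdims=True)   # shift for numerical stability
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return np.mean(np.diag(log_softmax)) + np.log(len(z_past))

def gaussian_kl(mu_p, std_p, mu_q, std_q):
    """KL(N(mu_p, std_p^2) || N(mu_q, std_q^2)) for diagonal Gaussians,
    summed over latent dimensions and averaged over the batch."""
    kl = np.log(std_q / std_p) + (std_p**2 + (mu_p - mu_q)**2) / (2 * std_q**2) - 0.5
    return kl.sum(axis=1).mean()

def t_ib_style_loss(z_past, z_future, mu_enc, std_enc, mu_trans, std_trans, beta):
    """Negative contrastive bound (keep lagged information) plus a
    beta-weighted KL penalty (assumed stand-in for the bottleneck term)."""
    return -info_nce(z_past, z_future) + beta * gaussian_kl(mu_enc, std_enc, mu_trans, std_trans)

# Toy usage with synthetic, strongly correlated time-lagged pairs.
rng = np.random.default_rng(0)
B, D = 64, 2
z_past = rng.standard_normal((B, D))
z_future = z_past + 0.1 * rng.standard_normal((B, D))
loss = t_ib_style_loss(z_past, z_future,
                       mu_enc=z_future, std_enc=np.full((B, D), 0.1),
                       mu_trans=z_past, std_trans=np.full((B, D), 0.2),
                       beta=0.01)
```

In this sketch a larger `beta` penalizes information in the encoding that the latent transition cannot predict, which is one way to read "discarding high-frequency information"; the InfoNCE estimate can never exceed `log(B)`, so the batch size limits how much information the contrastive term can certify.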
