Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
Benjamin Eysenbach, Vivek Myers, Ruslan Salakhutdinov, Sergey Levine
TL;DR
The paper tackles probabilistic inference on high‑dimensional time series by learning temporal contrastive representations whose marginal distribution is isotropic Gaussian and whose joint dynamics form a Gauss‑Markov chain. By introducing a parametrization with a learned matrix $A$, the authors show that future representations, intermediate waypoints, and full sequences admit closed‑form Gaussian posteriors, reducing planning and prediction to low‑dimensional matrix operations and even linear interpolation in special cases. Theoretical results are complemented by numerical experiments on synthetic spirals, mazes, and high‑dimensional robotic tasks (39D and 46D), demonstrating accurate inference and substantial planning gains over baselines. The work offers a practical, scalable route to inference‑driven planning in high‑dimensional time series without reconstruction, with potential impact in robotics, control, and financial time series.
Abstract
Given time series data, how can we answer questions like "what will happen in the future?" and "how did we get here?" These sorts of probabilistic inference questions are challenging when observations are high-dimensional. In this paper, we show how these questions can have compact, closed-form solutions in terms of learned representations. The key idea is to apply a variant of contrastive learning to time series data. Prior work already shows that the representations learned by contrastive learning encode a probability ratio. By extending prior work to show that the marginal distribution over representations is Gaussian, we can then prove that the joint distribution of representations is also Gaussian. Taken together, these results show that representations learned via temporal contrastive learning follow a Gauss-Markov chain, a graphical model where inference (e.g., prediction, planning) over representations corresponds to inverting a low-dimensional matrix. In one special case, inferring intermediate representations will be equivalent to interpolating between the learned representations. We validate our theory using numerical simulations on tasks with up to 46 dimensions.
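To make the "inference as a small matrix inversion" idea concrete, here is a minimal sketch of waypoint inference in a generic linear Gauss-Markov chain. It is an illustration under stated assumptions, not the paper's exact parametrization: the function name `waypoint_posterior`, the transition matrix `M`, and the noise variance `s` are hypothetical, and the chain is assumed to follow $\psi_{t+1} = M \psi_t + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, sI)$. Under that assumption, the posterior over the intermediate representation $\psi_1$ given the endpoints $\psi_0$ and $\psi_2$ is Gaussian with mean $(I + M^\top M)^{-1}(M\psi_0 + M^\top \psi_2)$ and covariance $s\,(I + M^\top M)^{-1}$.

```python
import numpy as np


def waypoint_posterior(psi_0, psi_2, M, s):
    """Posterior over psi_1 given psi_0 and psi_2.

    Assumed (hypothetical) model: psi_{t+1} = M @ psi_t + noise,
    with noise ~ N(0, s * I). Returns the posterior mean and covariance.
    """
    d = psi_0.shape[0]
    # Posterior precision, scaled by s: I + M^T M (a d x d matrix).
    scaled_precision = np.eye(d) + M.T @ M
    # Mean solves (I + M^T M) mean = M psi_0 + M^T psi_2.
    mean = np.linalg.solve(scaled_precision, M @ psi_0 + M.T @ psi_2)
    cov = s * np.linalg.inv(scaled_precision)
    return mean, cov


# Special case M = I: the posterior mean is the midpoint of the endpoints,
# i.e. inference reduces to linear interpolation.
start = np.array([0.0, 0.0])
goal = np.array([1.0, 2.0])
mean, cov = waypoint_posterior(start, goal, np.eye(2), s=0.5)
assert np.allclose(mean, 0.5 * (start + goal))
```

The only linear algebra involved is a solve against a $d \times d$ matrix, where $d$ is the representation dimension rather than the observation dimension. In this sketch, setting `M` to the identity recovers the midpoint between the two endpoint representations, which mirrors the interpolation special case mentioned in the abstract.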
