Zero-shot Imputation with Foundation Inference Models for Dynamical Systems
Patrick Seifner, Kostadin Cvejoski, Antonia Körner, Ramsés J. Sánchez
TL;DR
This work addresses the imputation of missing values in time series whose dynamics are governed by unknown ODEs. It introduces the Foundation Inference Model (FIM), a two-part framework consisting of a synthetic data generator for ODE solutions and a pretrained recognition model that maps noisy, sparse observations to initial conditions $x(0)$ and time derivatives $\dot{x}(t)$, enabling zero-shot imputation by integrating the inferred derivatives. The authors validate FIM on phase portraits, 63 autonomous ODEs, and 10 high-dimensional real-world and simulated settings, where it often outperforms baselines trained on the target data. The approach draws on amortized inference and neural operators to produce a domain-agnostic imputation method that requires no fine-tuning on the target data. Limitations relate to residual mismatch between the synthetic priors and certain real-world dynamics, motivating future work on broader function families and zero-shot forecasting.
Abstract
Dynamical systems governed by ordinary differential equations (ODEs) serve as models for a vast number of natural and social phenomena. In this work, we offer a fresh perspective on the classical problem of imputing missing time series data, whose underlying dynamics are assumed to be determined by ODEs. Specifically, we revisit ideas from amortized inference and neural operators, and propose a novel supervised learning framework for zero-shot time series imputation, through parametric functions satisfying some (hidden) ODEs. Our proposal consists of two components. First, a broad probability distribution over the space of ODE solutions, observation times and noise mechanisms, with which we generate a large, synthetic dataset of (hidden) ODE solutions, along with their noisy and sparse observations. Second, a neural recognition model that is trained offline, to map the generated time series onto the spaces of initial conditions and time derivatives of the (hidden) ODE solutions, which we then integrate to impute the missing data. We empirically demonstrate that one and the same (pretrained) recognition model can perform zero-shot imputation across 63 distinct time series with missing values, each sampled from widely different dynamical systems. Likewise, we demonstrate that it can perform zero-shot imputation of missing high-dimensional data in 10 vastly different settings, spanning human motion, air quality, traffic and electricity studies, as well as Navier-Stokes simulations -- without requiring any fine-tuning. What is more, our proposal often outperforms state-of-the-art methods, which are trained on the target datasets. Our pretrained model, repository and tutorials are available online.
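The final step described in the abstract, integrating inferred time derivatives from an inferred initial condition to fill in missing values, can be illustrated with a minimal sketch. This is not the authors' implementation: the function `impute_by_integration` is a hypothetical helper, the pretrained recognition model is replaced by a stand-in that returns the exact derivative of a toy signal $x(t) = \sin(t)$, and the trapezoidal rule is one assumed choice of integrator.

```python
import numpy as np

def impute_by_integration(t_grid, x0, xdot):
    """Reconstruct x(t) on t_grid from x(0) and derivative samples.

    Hypothetical helper: uses the cumulative trapezoidal rule, which is
    an assumed integrator, not necessarily the one used by FIM.
    """
    dt = np.diff(t_grid)
    increments = 0.5 * (xdot[1:] + xdot[:-1]) * dt
    return x0 + np.concatenate(([0.0], np.cumsum(increments)))

# Toy hidden ODE solution: x(t) = sin(t), so dx/dt = cos(t).
t_grid = np.linspace(0.0, 2.0 * np.pi, 201)
x_true = np.sin(t_grid)

# Sparse observation mask: roughly 30% of the grid points are observed.
rng = np.random.default_rng(0)
observed = rng.random(t_grid.size) > 0.7
observed[0] = True

# Stand-in for the recognition model's outputs (assumption): in the paper
# these are inferred from the observed points; here the exact values are used.
x0_hat = x_true[0]
xdot_hat = np.cos(t_grid)

# Impute every grid point, then measure the error on the missing ones.
x_imputed = impute_by_integration(t_grid, x0_hat, xdot_hat)
max_err_missing = np.max(np.abs(x_imputed[~observed] - x_true[~observed]))
print(f"max abs error on missing points: {max_err_missing:.2e}")
```

With exact derivatives, the only error in this sketch comes from the numerical integrator; in practice, the quality of the imputation would depend on how well the recognition model estimates $x(0)$ and $\dot{x}(t)$ from the noisy, sparse observations.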
