The Limits of Inference in Complex Systems: When Stochastic Models Become Indistinguishable
Javier Aguilar, Miguel A. Muñoz, Sandro Azaele
TL;DR
The paper addresses the challenge of inferring parameters and discriminating among stochastic models from sparse time-series data when analytical likelihoods are unavailable. It develops a path-based Monte Carlo framework that combines full-path statistics with bridge processes and Radon–Nikodym derivatives to compute propagators and perform maximum-likelihood inference, even with coarse sampling. A key contribution is the bridge-change-of-measure estimator, which reduces bias and variance in propagator estimation and yields principled guidelines for optimal sampling times and dataset sizes. The framework is validated across diverse domains, revealing sharp, resolution-dependent limits on model distinguishability and emphasizing the importance of experimental design to maximize information under real-world constraints. Together, these results provide a practical toolkit for robust inference and principled measurement design in complex stochastic systems.
Abstract
Robust inference for stochastic dynamical systems is often hampered by sparse sampling and the absence of closed-form likelihoods. We introduce a Monte Carlo path-inference framework that leverages full-path statistics and bridge processes to deliver reliable parameter estimation and model selection from coarsely sampled time series, without requiring analytical solutions. Crucially, we couple mechanistic stochastic models with their inference procedures to quantify how experimental design -specifically, sampling frequency and dataset size- governs estimator precision and model distinguishability. This analysis reveals optimal sampling regimes and sharp, resolution-dependent limits beyond which competing models become empirically indistinguishable. We validate the approach across four disparate systems -trajectories of optically trapped particles, human microbiome dynamics, social-media topic mentions, and forest population time series- recovering parameters and identifying when inference is fundamentally constrained by measurement resolution, thereby clarifying ongoing debates about dominant noise sources in these systems. Together, these results establish path-based Monte Carlo as a practical, general tool for inference and model discrimination in complex systems and provide principled guidelines for designing measurements that maximize information under real-world constraints.
