Learning Density Evolution from Snapshot Data
Rentian Yao, Atsushi Nitanda, Xiaohui Chen, Yun Yang
TL;DR
The paper tackles learning the time-evolving density of a stochastic process from noisy snapshot data by casting the problem as distribution-on-scalar regression and introducing an entropy-regularized nonparametric maximum likelihood estimator (E-NPMLE). The estimator combines a negative log-likelihood term, an entropic optimal transport smoothing term, and a self-entropy regularizer to estimate the marginal densities at multiple time points. It attains almost dimension-free convergence rates and exhibits a phase transition in the (m, N) regime, that is, in how the number of snapshots and the per-snapshot sample size trade off. To compute the estimator efficiently, the authors develop coordinate KL divergence gradient descent (CKLGD), a grid-free algorithm that exploits joint linear convexity to achieve a polynomial convergence rate, and they extend it to an inexact version with provable guarantees. Numerical experiments on synthetic data and real single-cell datasets corroborate the theoretical rates and demonstrate practical efficacy in reconstructing the density-flow map t ↦ R_t in arbitrary dimensions. The work advances both the theory and the computation of density evolution estimation from noisy observations, with direct implications for trajectory inference and related stochastic-process learning tasks.
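To make the entropic optimal transport ingredient concrete, the following is a minimal sketch of the standard Sinkhorn iteration between two point clouds. It is not the paper's estimator or code: the function name `sinkhorn_plan`, the squared Euclidean cost, the uniform weights, and the values of `eps` and `n_iter` are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_plan(x, y, eps=0.1, n_iter=500):
    """Entropic OT plan between uniform empirical measures on x (n, d) and y (k, d).

    Hypothetical helper for illustration; assumes a squared Euclidean cost.
    """
    n, k = len(x), len(y)
    a = np.full(n, 1.0 / n)  # uniform weights on the source cloud
    b = np.full(k, 1.0 / k)  # uniform weights on the target cloud
    # Pairwise squared Euclidean costs, shape (n, k).
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-C / eps)     # Gibbs kernel
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iter):
        u = a / (K @ v)      # rescale to match the row marginals
        v = b / (K.T @ u)    # rescale to match the column marginals
    P = u[:, None] * K * v[None, :]
    return P, float((P * C).sum())

rng = np.random.default_rng(0)
x = rng.uniform(size=(30, 2))
y = x + 1.0                  # the same cloud translated by (1, 1)

P_self, cost_self = sinkhorn_plan(x, x)
P_shift, cost_shift = sinkhorn_plan(x, y)
print(cost_self, cost_shift)
```

Consecutive snapshots that are close to each other incur a small transport cost, while a cloud that has moved far incurs a large one, which is the intuition behind using such a term to penalize abrupt changes between neighbouring time points.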
Abstract
Motivated by learning dynamical structures from static snapshot data, this paper presents a distribution-on-scalar regression approach for estimating the density evolution of a stochastic process from its noisy temporal point clouds. We propose an entropy-regularized nonparametric maximum likelihood estimator (E-NPMLE), which leverages entropic optimal transport as a smoothing regularizer for the density flow. We show that the E-NPMLE has almost dimension-free statistical rates of convergence to the ground-truth distributions, and that these rates exhibit a striking phase transition phenomenon in terms of the number of snapshots and the per-snapshot sample size. To efficiently compute the E-NPMLE, we design a novel particle-based and grid-free coordinate KL divergence gradient descent (CKLGD) algorithm and prove its polynomial iteration complexity. Moreover, we provide numerical evidence on synthetic data to support our theoretical findings. This work contributes to the theoretical understanding and practical computation of estimating density evolution from noisy observations in arbitrary dimensions.
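As a rough illustration of what a KL divergence gradient step looks like, the sketch below runs the classical exponentiated-gradient (entropic mirror descent) update for a nonparametric MLE of mixing weights under Gaussian noise. It is a deliberately simplified stand-in and not the paper's CKLGD: it uses a fixed grid of atoms in one dimension and a single time point, whereas the algorithm described above is particle-based and grid-free. The function name `kl_descent_weights`, the noise level `sigma`, the step size, and the toy data are all assumptions.

```python
import numpy as np

def kl_descent_weights(obs, atoms, sigma=0.5, step=1.0, n_iter=200):
    """Exponentiated-gradient updates for mixing weights on fixed atoms.

    Hypothetical helper for illustration; assumes Gaussian noise with known sigma.
    """
    # Likelihood matrix: L[i, j] = N(obs_i; atoms_j, sigma^2).
    z = (obs[:, None] - atoms[None, :]) / sigma
    L = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    p = np.full(len(atoms), 1.0 / len(atoms))  # start from uniform weights
    history = []
    for _ in range(n_iter):
        mix = L @ p                             # mixture density at each observation
        history.append(float(-np.mean(np.log(mix))))  # negative log-likelihood
        grad = -(L / mix[:, None]).mean(axis=0)       # gradient of the NLL in p
        p = p * np.exp(-step * grad)            # multiplicative (KL-proximal) update
        p /= p.sum()                            # renormalize onto the simplex
    return p, history

rng = np.random.default_rng(1)
latent = rng.choice([-2.0, 2.0], size=400)      # toy two-point latent distribution
obs = latent + 0.5 * rng.normal(size=400)       # noisy observations
atoms = np.linspace(-4.0, 4.0, 17)

weights, nll = kl_descent_weights(obs, atoms)
print(nll[0], nll[-1])
```

The multiplicative form keeps the weights positive and normalized without any projection step, which is the usual reason for measuring step length in KL divergence rather than Euclidean distance when optimizing over probability distributions.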
