Learning Density Evolution from Snapshot Data

Rentian Yao, Atsushi Nitanda, Xiaohui Chen, Yun Yang

TL;DR

The paper studies how to learn the time-evolving density of a stochastic process from noisy snapshot data. It casts the problem as a distribution-on-scalar regression and introduces an entropy-regularized nonparametric maximum likelihood estimator (E-NPMLE). The objective combines a negative log-likelihood term, an entropic optimal transport smoothing term, and a self-entropy regularizer to estimate the marginal densities at multiple time points. The estimator attains almost dimension-free convergence rates, which exhibit a phase transition in the number of snapshots $m$ and the per-snapshot sample size $N$. To compute the estimator, the authors develop coordinate KL divergence gradient descent (CKLGD), a particle-based, grid-free algorithm that exploits linear convexity to achieve polynomial iteration complexity, and they extend it to an inexact version with provable guarantees. Numerical experiments on synthetic data support the theoretical rates and illustrate the reconstruction of the density-flow map $t \mapsto R_t$. The work advances both the theory and the computation of density evolution estimation from noisy observations in arbitrary dimensions, with direct implications for trajectory inference and related stochastic-process learning tasks.

Abstract

Motivated by learning dynamical structures from static snapshot data, this paper presents a distribution-on-scalar regression approach for estimating the density evolution of a stochastic process from its noisy temporal point clouds. We propose an entropy-regularized nonparametric maximum likelihood estimator (E-NPMLE), which leverages the entropic optimal transport as a smoothing regularizer for the density flow. We show that the E-NPMLE has almost dimension-free statistical rates of convergence to the ground truth distributions, which exhibit a striking phase transition phenomenon in terms of the number of snapshots and per-snapshot sample size. To efficiently compute the E-NPMLE, we design a novel particle-based and grid-free coordinate KL divergence gradient descent (CKLGD) algorithm and prove its polynomial iteration complexity. Moreover, we provide numerical evidence on synthetic data to support our theoretical findings. This work contributes to the theoretical understanding and practical computation of estimating density evolution from noisy observations in arbitrary dimensions.
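
To make the entropic optimal transport regularizer more concrete, the following is a minimal sketch of the standard Sinkhorn iteration between two empirical point clouds, such as two consecutive snapshots. It is not the paper's CKLGD algorithm; the function name `sinkhorn_plan`, the squared Euclidean cost, the uniform weights, and the values of `eps` and `n_iter` are all illustrative assumptions.

```python
import numpy as np

def sinkhorn_plan(x, y, eps=1.0, n_iter=500):
    """Entropic OT plan between uniform empirical measures on x and y.

    Hypothetical helper for illustration only: standard Sinkhorn scaling,
    with squared Euclidean cost and regularization strength eps.
    """
    n, m = len(x), len(y)
    a = np.full(n, 1.0 / n)  # uniform weights on the first point cloud
    b = np.full(m, 1.0 / m)  # uniform weights on the second point cloud
    # Pairwise squared Euclidean cost matrix of shape (n, m).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-cost / eps)  # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)      # rescale rows to match a
        v = b / (K.T @ u)    # rescale columns to match b
    plan = u[:, None] * K * v[None, :]
    return plan, cost

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(50, 2))  # stand-in for one snapshot
y = rng.normal(0.5, 1.0, size=(60, 2))  # stand-in for the next snapshot
plan, cost = sinkhorn_plan(x, y)
transport_cost = float((plan * cost).sum())
```

A small `eps` makes the plan closer to unregularized optimal transport but can cause numerical underflow in `K`; a log-domain implementation would be the usual remedy in that regime.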

Paper Structure

This paper contains 45 sections, 24 theorems, 268 equations, 3 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Assume there is a constant $E > 0$ such that $E^{-1} \leq \tau D_{\mathrm{KL}} (R^\ast\,\|\,W^\tau) \leq E$, and the time step satisfies $\Delta_m := \max_j \{t_{j+1}-t_j\} = O(m^{-1})$. Let $C_\delta, C_\lambda > 0$ be two sufficiently large constants, and Then, with the choice of $\lambda=\lambda_{N,m}$, it holds with probability at least $1 - 2e^{-\frac{N\delta_{N, m}^2}{2\Delta_

Figures (3)

  • Figure 1: High-level comparison of using the MFLD algorithm and the inexact CKLGD algorithm to minimize $\mathcal{F}_{N,m}$. All solid lines represent mean-field Langevin dynamics and dotted lines represent KL divergence gradient flow. (a) The MFLD algorithm directly applies the mean-field Langevin dynamics (solid line) to compute the global minimum of $\mathcal{F}_{N, m}$. Due to the nonconvexity along geodesics, MFLD with annealing requires $O(e^{\frac{C}{\varepsilon}})$ total iterations to achieve $\varepsilon$-accuracy. (b) Inexact CKLGD discretizes the KL divergence gradient flow (dotted line) and uses MFLD (solid line) to compute each iterate. Inexact CKLGD only requires polynomial total iterations to achieve $\varepsilon$-accuracy (Remark \ref{rmk:total_iteration_complexity}).
  • Figure 2: Scatter plot of the noiseless data generated from the SDE \ref{eqn: SDE} (upper left), noisy observations (upper right), initialization of the CKLGD algorithm and the baseline MFLD algorithm (lower right), and the estimated marginal distributions derived by applying the CKLGD algorithm (lower left).
  • Figure 3: Reduced objective functional value $\mathcal{F}_{N, m}(\rho) - \mathcal{F}_{N,m}(\widehat{\rho})$ on a log scale versus the total number of iterations. The experiment is repeated five times independently with different observations and initializations. The reduced objective functional values decay more slowly for the MFLD algorithm (solid lines) than for our CKLGD algorithm (dotted lines), which can be attributed to the presence of the annealing term.
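
The data setting behind Figure 2 (noiseless SDE samples and their noisy observations) can be sketched as follows. The SDE used in the paper is not reproduced in this summary, so this example assumes an Ornstein-Uhlenbeck process simulated with Euler-Maruyama and additive Gaussian observation noise; the function `simulate_snapshots` and all of its parameters are hypothetical.

```python
import numpy as np

def simulate_snapshots(m=5, N=200, dim=2, theta=1.0, sigma=0.5,
                       noise_std=0.1, T=1.0, n_steps=100, seed=0):
    """Simulate m noisy snapshots with N samples each.

    Assumed dynamics (illustration only): dX = -theta * X dt + sigma dW.
    """
    rng = np.random.default_rng(seed)
    times = np.linspace(0.0, T, m)  # snapshot times t_1, ..., t_m
    dt = T / n_steps
    clean, noisy = [], []
    for t in times:
        # Fresh particles for every snapshot: no trajectory is observed twice.
        x = rng.normal(2.0, 0.3, size=(N, dim))
        for _ in range(int(round(t / dt))):
            # Euler-Maruyama step of the assumed OU dynamics.
            x = x - theta * x * dt + sigma * np.sqrt(dt) * rng.normal(size=x.shape)
        clean.append(x)
        # Additive Gaussian observation noise on each sample.
        noisy.append(x + noise_std * rng.normal(size=x.shape))
    return times, np.stack(clean), np.stack(noisy)

times, clean, noisy = simulate_snapshots()
```

Here `clean` and `noisy` have shape `(m, N, dim)`. Only `noisy` and `times` would be available to an estimator, which is what makes the problem a snapshot problem rather than a trajectory problem.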

Theorems & Definitions (50)

  • Theorem 1: Statistical rate of convergence: fixed design
  • Remark 1: Choice of $\tau$
  • Remark 2: Extreme case $m=1$: connection with unregularized NPMLE
  • Remark 3: Extreme case $N=1$
  • Proposition 2: Connection between E-NPMLE and minimum entropy estimator
  • Theorem 3: Statistical rate of convergence: density flow map
  • Remark 4: Time discretization error
  • Remark 5: Optimality of the rate
  • Definition 1: Linear convexity
  • Proposition 4
  • ...and 40 more