Functional Estimation of Manifold-Valued Diffusion Processes

Jacob McErlean, Hau-Tieng Wu

Abstract

Nonstationary high-dimensional time series are increasingly encountered in biomedical research as measurement technologies advance. Owing to the homeostatic nature of physiological systems, such datasets are often located on, or can be well approximated by, a low-dimensional manifold. Modeling such datasets by manifold-valued Itô diffusion processes has been shown to provide valuable insights and to guide the design of algorithms for clinical applications. In this paper, we propose Nadaraya-Watson type nonparametric estimators for the drift vector field and diffusion matrix of the process from one trajectory. Assuming a time-homogeneous stochastic differential equation on a smooth complete manifold without boundary, we show that as the sampling interval and kernel bandwidth vanish with increasing trajectory length, recurrence of the process yields asymptotic consistency and normality of the drift and diffusion estimators, as well as the associated occupation density. Analysis of the diffusion estimator further produces a tangent space estimator for dependent data, which has its own interest and is essential for drift estimation. Numerical experiments across a range of manifold configurations support the theoretical results.
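To make the idea of kernel-weighted local averaging concrete, the following is a minimal sketch and not the paper's estimators. It assumes a Gaussian kernel on ambient Euclidean distance, a toy process (Brownian motion on the unit circle observed in $\mathbb{R}^2$), and hypothetical helper names `simulate_circle_bm` and `nw_estimates`. Increments near a base point are averaged to estimate the ambient drift, their outer products are averaged to estimate the ambient diffusion matrix, and the kernel mass gives a simple occupation-density estimate. The intrinsic dimension `d` is supplied by hand here.

```python
import numpy as np


def simulate_circle_bm(n, dt, seed=0):
    """Toy data: Brownian motion on the unit circle, observed in ambient R^2."""
    rng = np.random.default_rng(seed)
    # The angle performs a standard Brownian motion sampled every dt.
    theta = np.cumsum(np.sqrt(dt) * rng.standard_normal(n))
    return np.column_stack([np.cos(theta), np.sin(theta)])


def nw_estimates(X, x0, dt, h, d=1):
    """Nadaraya-Watson type local averages around the base point x0.

    X  : (n, p) array, one discretely sampled trajectory
    x0 : (p,) base point
    dt : sampling interval
    h  : kernel bandwidth
    d  : intrinsic dimension (assumed known in this sketch)
    """
    incr = X[1:] - X[:-1]                            # one-step increments
    sq_dist = np.sum((X[:-1] - x0) ** 2, axis=1)     # ambient squared distances
    w = np.exp(-sq_dist / (2.0 * h ** 2))            # Gaussian kernel weights
    total = w.sum()
    # Weighted mean of increments per unit time: ambient drift estimate.
    drift = (w @ incr) / (dt * total)
    # Weighted mean of increment outer products per unit time: diffusion estimate.
    diffusion = (incr * w[:, None]).T @ incr / (dt * total)
    # Kernel density of visited points, normalized with the intrinsic dimension.
    density = total / (len(w) * (np.sqrt(2.0 * np.pi) * h) ** d)
    return drift, diffusion, density


if __name__ == "__main__":
    dt, n, h = 0.01, 200_000, 0.2
    X = simulate_circle_bm(n, dt)
    x0 = np.array([1.0, 0.0])
    drift, diffusion, density = nw_estimates(X, x0, dt, h)
    # eigh returns eigenvalues in ascending order, so the last column is the
    # dominant direction of the estimated diffusion matrix.
    eigvals, eigvecs = np.linalg.eigh(diffusion)
    print("drift estimate:", drift)
    print("diffusion estimate:\n", diffusion)
    print("dominant eigenvector:", eigvecs[:, -1])
    print("occupation density estimate:", density)
```

For this toy process the dominant eigenvector of the estimated diffusion matrix at $(1,0)^\top$ should align closely with the tangent direction $(0,1)^\top$, which illustrates in a simple setting why a diffusion estimate can also serve as a tangent space estimate. The bandwidth and trajectory length above are arbitrary choices, not values taken from the paper.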

Paper Structure

This paper contains 33 sections, 32 theorems, 276 equations, 16 figures, and 4 tables.

Key Result

Theorem 4.1

Suppose Assumptions manifold-ass and manifold-ass2 hold. For any Borel-measurable, positive, and $\phi_X$-integrable $f,g: M \rightarrow \mathbb{R}$ such that $0<\langle \phi_X, g \rangle_M:=\int_M g(x)\phi_X(x) <\infty$, we have [display equation missing from the extracted text] for all $x \in M$. Moreover, [display equation missing from the extracted text] holds $\phi_X$-a.s., where the exceptional set depends on $f$ and $g$.

Figures (16)

  • Figure 1: Using the occupation density for trajectory length $n = 10^8$, time-step $\Delta=10^{-2}$, and physical time $T = 10^6$ as an accurate estimate for the invariant density $\phi_X$, we compare the invariant density to the estimate $\hat{L}^{(\texttt{o})}$ based on the first $n$ data-points of the trajectory on each observed ellipsoid to measure the convergence rate of the empirical density.
  • Figure 2: From left to right: visualizations of $\hat{\mu}_E(x), \hat{\mu}^{(\texttt{o})}(x),$ and $P_x \hat{\mu}_E(x)$, where $P_x$ is the projection operator onto the tangent space $T_xM$, for base-point samples $x$ drawn uniformly from a spherical cap centered at $(1,0,0)^\top$ and observed on an ellipsoid with eccentricity $(2,1.5,1)$, shown from two viewing angles. The ground-truth drift vector is superimposed as blue arrows.
  • Figure 3: Histograms of drift estimation errors over 1000 independent SDE simulations, comparing $\hat{\mu}_E$ and $\hat{\mu}^{(\texttt{o})}$ to the true drift vector field $\mu^{(\texttt{o})}$ at $(0,0,1)^\top$.
  • Figure 4: Histograms of the errors of the estimated diffusion matrix entries $u_i^\top\hat{\pi}^{(\texttt{o})}u_j$ (labeled as $(i,j)$ in the subplots) over 1000 independent SDE simulations, compared to the true values $u_i^\top \pi^{(\texttt{o})} u_j$, for ellipsoids of varying eccentricities.
  • Figure 5: Using the occupation density for trajectory length $n = 10^8$ and $\Delta = 10^{-2}$ as an accurate estimate for the invariant density $\phi_X$, we compare the invariant density to the estimate $\hat{L}^{(\texttt{o})}$ based on the first $n$ data-points of the trajectory to measure the convergence rate of the empirical density.
  • ...and 11 more figures

Theorems & Definitions (69)

  • Remark 1
  • Definition 1: Manifold-valued diffusion model
  • Remark 2
  • Theorem 4.1: Ratio Limit Theorem
  • Definition 2
  • Theorem 4.2: General Darling-Kac Theorem
  • Theorem 4.3: Occupation density estimator
  • Theorem 4.4: Diffusion estimator
  • Remark 3
  • Theorem 4.5: Tangent space estimator
  • ...and 59 more