Neural Diffusion Intensity Models for Point Process Data

Xinlong Du, Harsha Honnappa, Vinayak Rao

TL;DR

This work designs an amortized encoder architecture that maps variable-length event sequences to posterior intensity paths by simulating the drift-corrected SDE, replacing repeated MCMC runs with a single forward pass, and shows accurate recovery of latent intensity dynamics and posterior paths.

Abstract

Cox processes model overdispersed point process data via a latent stochastic intensity, but both nonparametric estimation of the intensity model and posterior inference over intensity paths are typically intractable, relying on expensive MCMC methods. We introduce Neural Diffusion Intensity Models, a variational framework for Cox processes driven by neural SDEs. Our key theoretical result, based on enlargement of filtrations, shows that conditioning on point process observations preserves the diffusion structure of the latent intensity with an explicit drift correction. This guarantees the variational family contains the true posterior, so that ELBO maximization coincides with maximum likelihood estimation under sufficient model capacity. We design an amortized encoder architecture that maps variable-length event sequences to posterior intensity paths by simulating the drift-corrected SDE, replacing repeated MCMC runs with a single forward pass. Experiments on synthetic and real-world data demonstrate accurate recovery of latent intensity dynamics and posterior paths, with orders-of-magnitude speedups over MCMC-based methods.
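The claim that ELBO maximization coincides with maximum likelihood rests on the standard variational decomposition of the log-likelihood. A sketch in generic notation follows; the symbols $P_\theta$ (prior path measure), $Q$ (variational path measure), and $\lambda$ (intensity path) are assumed here for illustration and may differ from the paper's own notation. The point process log-likelihood is stated up to an additive constant that does not depend on $\lambda$.

```latex
\begin{align*}
\log p_\theta(X)
  &= \underbrace{\mathbb{E}_{Q}\big[\log p(X \mid \lambda)\big]
     - \mathrm{KL}\big(Q \,\|\, P_\theta\big)}_{\text{ELBO}}
   + \mathrm{KL}\big(Q \,\|\, P_\theta(\cdot \mid X)\big), \\
\log p(X \mid \lambda)
  &= \sum_{i=1}^{N_T} \log \lambda_{\tau_i} - \int_0^T \lambda_t \, dt .
\end{align*}
```

The last KL term is nonnegative and vanishes exactly when $Q$ equals the posterior, so a variational family that contains the true posterior makes the bound tight.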

Paper Structure

This paper contains 41 sections, 6 theorems, 63 equations, 8 figures, 3 tables, and 1 algorithm.

Key Result

Theorem 2.1

Fix $T' \leq T$ and let $N_{0:T'}$ be a one-dimensional point process on the interval $[0,T']$, and suppose the prior intensity process satisfies […]. Then, conditioned on the observed event times $X=(0<\tau_1<\cdots<\tau_{N_{T'}}\le T)$, the posterior intensity process admits the SDE representation […], where $\tilde{B}$ is a Brownian motion with respect to $\mathcal{G}_t$ and with the conditional expec…

Figures (8)

  • Figure 1: In the US Bank data (seelab2024data), the variance curve exceeds the mean curve, indicating strong overdispersion of the arrival point process.
  • Figure 2: Comparing learned prior drift $b_\theta(Z_t,t)$ (purple) to the ground truth drift $\tilde{b}(Z_t,t)=0.3(80-Z_t)$ (yellow).
  • Figure 3: The learned data samples (red) compared to the true data samples (blue).
  • Figure 4: The learned posterior sample paths (red) are generated using the learned $(b_\theta,u_\beta)$ pair, and the baseline "true" posterior sample paths are generated using extensive MCMC simulations (blue).
  • Figure 5: Train vs. test Wasserstein distance between amortized posterior samples and high-fidelity MCMC posterior samples as a function of the training sample size $n$. A train-test gap indicates overfitting of the amortized correction $u_\beta(\cdots, X)$; the gap vanishes once $n$ exceeds $16$.
  • ...and 3 more figures
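
To make the overdispersion in Figure 1 and the ground-truth drift in Figure 2 concrete, the following is a minimal sketch of a Cox process driven by a latent diffusion with drift $0.3(80-Z_t)$, simulated by Euler-Maruyama. Only the drift comes from the figure caption. The diffusion coefficient `sigma`, the horizon, the choice of using $\max(Z_t, 0)$ directly as the intensity, and all function names are assumptions made for illustration, not the paper's setup.

```python
import math
import random
import statistics


def simulate_intensity(z0, theta, mean, sigma, horizon, steps, rng):
    """Euler-Maruyama path of dZ = theta * (mean - Z) dt + sigma dB on [0, horizon]."""
    dt = horizon / steps
    path = [z0]
    for _ in range(steps):
        z = path[-1]
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(z + theta * (mean - z) * dt + noise)
    return path


def sample_count(path, horizon, rng):
    """Draw N_T ~ Poisson(integral of max(Z_t, 0) dt) by counting unit-rate exponential gaps."""
    dt = horizon / (len(path) - 1)
    # Left Riemann sum of the (clipped) intensity; clipping is an assumption of this sketch.
    integrated = sum(max(z, 0.0) for z in path[:-1]) * dt
    count, clock = 0, rng.expovariate(1.0)
    while clock <= integrated:
        count += 1
        clock += rng.expovariate(1.0)
    return count


def simulate_counts(n_paths, seed, sigma=20.0, horizon=1.0, steps=100):
    """Event counts from independent Cox process draws (sigma and horizon are assumed values)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_paths):
        path = simulate_intensity(80.0, 0.3, 80.0, sigma, horizon, steps, rng)
        counts.append(sample_count(path, horizon, rng))
    return counts


if __name__ == "__main__":
    counts = simulate_counts(2000, seed=0)
    # For a Cox process the count variance exceeds the mean, unlike a plain Poisson process.
    print("mean:", statistics.mean(counts), "variance:", statistics.variance(counts))
```

Because the integrated intensity is itself random, the count variance is the Poisson variance plus the variance of the integrated intensity, which is the overdispersion a fixed-rate Poisson model cannot capture.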

Theorems & Definitions (6)

  • Theorem 2.1
  • Theorem 3.1
  • Theorem B.1: Jacod's condition
  • Lemma B.2
  • Theorem B.3
  • Theorem D.1: Girsanov's theorem