A Nonparametric Discrete Hawkes Model with a Collapsed Gaussian-Process Prior

Trinnhallen Brisley, Gordon Ross, Daniel Paulin

TL;DR

This work introduces GP-DHP, a scalable nonparametric model for discrete-time self-exciting count data that places Gaussian-process priors on both the baseline $b(t)$ and the excitation kernel $f(d)$. By collapsing these priors into a single latent Gaussian process over the additive intensity $\boldsymbol{\ell}$, the method enables efficient MAP inference with complexity $O(T\log T)$ and yields a closed-form projection to interpretable baseline and excitation components. Through synthetic experiments and two real-data case studies (U.S. terrorism and weekly Cryptosporidiosis counts), GP-DHP demonstrates accurate recovery of latent dynamics and improved predictive log-likelihood over standard parametric baselines, while maintaining interpretability via the decomposed components and a practical stability diagnostic. The approach advances discrete-time Hawkes modeling by offering flexible, data-adaptive baselines and excitations without compromising scalability, and it includes an open-source implementation for replication and reuse.

Abstract

Hawkes process models are used in settings where past events increase the likelihood of future events occurring. Many applications record events as counts on a regular grid, yet discrete-time Hawkes models remain comparatively underused and are often constrained by fixed-form baselines and excitation kernels. In particular, there is a lack of flexible, nonparametric treatments of both the baseline and the excitation in discrete time. To this end, we propose the Gaussian Process Discrete Hawkes Process (GP-DHP), a nonparametric framework that places Gaussian process priors on both the baseline and the excitation and performs inference through a collapsed latent representation. This yields smooth, data-adaptive structure without prespecifying trends, periodicities, or decay shapes, and enables maximum a posteriori (MAP) estimation with near-linear-time \(O(T\log T)\) complexity. A closed-form projection recovers interpretable baseline and excitation functions from the optimized latent trajectory. In simulations, GP-DHP recovers diverse excitation shapes and evolving baselines. In case studies on U.S. terrorism incidents and weekly Cryptosporidiosis counts, it improves test predictive log-likelihood over standard parametric discrete Hawkes baselines while capturing bursts, delays, and seasonal background variation. The results indicate that flexible discrete-time self-excitation can be achieved without sacrificing scalability or interpretability.

Paper Structure

This paper contains 30 sections, 1 theorem, 46 equations, 6 figures, 3 tables.

Key Result

Proposition 3.1

Let $K_b\in\mathbb{R}^{T\times T}$ and $K_f\in\mathbb{R}^{(T-1)\times(T-1)}$ be symmetric positive definite, let $X\in\mathbb{R}^{T\times (T-1)}$, and let $\ell^*\in\mathbb{R}^T$. Consider ... Define $K:=K_b + X K_f X^\top\in\mathbb{R}^{T\times T}$. Then:

Figures (6)

  • Figure 1: Discrete-time Hawkes intensity on $[0,20]$ with geometric excitation. The plot shows intensity (black step), baseline (red dashed), and event times (black dotted; multiplicities annotated at the top). Parameters: baseline $\mu=0.5$, jump size $K=0.75$, geometric decay $\beta=0.5$; events at $t=3,7,10$ with multiplicities $2,4,3$, respectively.
  • Figure 2: Draws of the GP prior over the excitation function $f$: the effect of increasing $\beta$ at fixed $\ell_f$. Top: draws of $f(d)$; Bottom: corresponding heatmaps for the covariance matrices $K_f$.
  • Figure 3: Fitted excitation functions $\hat{f}(d)$ for twelve synthetic scenarios, grouped by kernel family. Rows correspond to: (1) Negative Binomial with increasing shape $r$ at fixed $\alpha$ and $p$; (2) Geometric with varying decay parameter $p$ at fixed $\alpha$; (3) Power law $f(d)=\alpha(\gamma+d)^{-\beta}$ with fixed $\beta=4$ and increasing width via $\gamma$ (with corresponding changes in $\alpha$); and (4) Bimodal Gaussian mixtures with fixed $\mu_1$ and $\sigma$ and increasing separation via $\mu_2$. All datasets share the same baseline $b(t)$.
  • Figure 4: Recovery of baseline and excitation components across three distinct baseline settings. Top panel: Comparison of true (solid blue) and estimated (dashed red) baseline functions for each experiment. The functional forms correspond to constant, linear, and periodic baselines, respectively. Bottom panel: Estimated excitation kernels (colored dashed lines) overlaid on the true excitation kernel (black solid). Despite varying baselines, the recovered excitation functions are consistent, confirming decomposition stability.
  • Figure 5: Estimated baseline and excitation functions for the GP-DHP fit to U.S. terrorism data, including a close-up view of the seasonality over one year (daily aggregation).
  • ...and 1 more figure
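
To make the setting of Figure 1 concrete, here is a minimal Python sketch that evaluates an additive discrete-time intensity of the form $\lambda_t = b(t) + \sum_{d\ge 1} f(d)\,y_{t-d}$ for the event pattern given in the caption. Both the additive form and the kernel $f(d)=K\beta^{d-1}$ are assumptions made for illustration: the caption lists the parameter values but not the exact parameterisation of the geometric decay, and the function and variable names are hypothetical.

```python
import numpy as np

def discrete_hawkes_intensity(counts, baseline, kernel):
    """Return lam[t] = baseline[t] + sum_{d=1..t} kernel[d-1] * counts[t-d]."""
    counts = np.asarray(counts, dtype=float)
    lam = np.array(baseline, dtype=float)  # copy so the input is not modified
    T = len(counts)
    # Entry k of the full convolution is sum_j counts[j] * kernel[k-j].
    # Shifting by one bin means events at time s only affect bins t > s.
    lam[1:] += np.convolve(counts, kernel)[: T - 1]
    return lam

# Values taken from the Figure 1 caption.
T = 21                                   # grid points 0, 1, ..., 20
mu, jump, beta = 0.5, 0.75, 0.5          # baseline, jump size K, decay
counts = np.zeros(T)
counts[[3, 7, 10]] = [2, 4, 3]           # event times with multiplicities

# ASSUMED kernel form f(d) = K * beta**(d-1); the caption does not spell it out.
lags = np.arange(1, T)
kernel = jump * beta ** (lags - 1)

lam = discrete_hawkes_intensity(counts, np.full(T, mu), kernel)
print(np.round(lam, 4))
```

Under this assumed parameterisation the intensity stays at the baseline 0.5 up to and including $t=3$, then rises to 2.0 in the bin after the two events at $t=3$ and decays geometrically until the next burst. A normalised variant of the kernel would change the heights but not the overall shape.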

Theorems & Definitions (2)

  • Proposition 3.1: Hard-constraint decomposition: existence, uniqueness, and closed form solution
  • proof
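
The collapsed covariance $K = K_b + X K_f X^\top$ from Proposition 3.1 can be illustrated with a small numerical sketch. Several pieces here are assumptions, not taken from the paper: $X$ is built as a lagged-count matrix with $X_{t,d} = y_{t-d}$, both covariances use a plain squared-exponential kernel as a stand-in, and the decomposition is posed as minimising $b^\top K_b^{-1} b + f^\top K_f^{-1} f$ subject to $b + Xf = \ell^*$. For that assumed problem the minimiser is $b = K_b K^{-1}\ell^*$ and $f = K_f X^\top K^{-1}\ell^*$, a standard consequence of Gaussian conditioning; it may or may not coincide with the exact statement of the proposition.

```python
import numpy as np

def sq_exp_cov(n, lengthscale, variance=1.0, jitter=1e-4):
    """Squared-exponential covariance on the grid 0..n-1 (stand-in kernel)."""
    idx = np.arange(n, dtype=float)
    sq_dist = (idx[:, None] - idx[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2) + jitter * np.eye(n)

def lag_matrix(counts):
    """ASSUMED design matrix: X[t, d-1] = counts[t-d] for t >= d, else 0."""
    counts = np.asarray(counts, dtype=float)
    T = len(counts)
    X = np.zeros((T, T - 1))
    for d in range(1, T):
        X[d:, d - 1] = counts[: T - d]
    return X

def project(ell_star, K_b, K_f, X):
    """Split ell_star into (b, f) with b + X f = ell_star, minimum prior penalty."""
    K = K_b + X @ K_f @ X.T              # collapsed covariance
    w = np.linalg.solve(K, ell_star)     # w = K^{-1} ell_star
    return K_b @ w, K_f @ (X.T @ w)

def penalty(b, f, K_b, K_f):
    """Quadratic penalty b' K_b^{-1} b + f' K_f^{-1} f of the assumed problem."""
    return b @ np.linalg.solve(K_b, b) + f @ np.linalg.solve(K_f, f)

rng = np.random.default_rng(0)
T = 20
counts = rng.poisson(1.0, size=T)        # synthetic counts, illustration only
X = lag_matrix(counts)
K_b = sq_exp_cov(T, lengthscale=3.0)
K_f = sq_exp_cov(T - 1, lengthscale=2.0)

# A reference pair that satisfies the constraint by construction.
t = np.arange(T)
b_ref = 1.0 + 0.5 * np.sin(2 * np.pi * t / T)
f_ref = 0.3 * 0.6 ** np.arange(T - 1)
ell_star = b_ref + X @ f_ref

b_hat, f_hat = project(ell_star, K_b, K_f, X)
print("max constraint residual:", np.abs(b_hat + X @ f_hat - ell_star).max())
print("penalty (projected vs reference):",
      penalty(b_hat, f_hat, K_b, K_f), penalty(b_ref, f_ref, K_b, K_f))
```

The projected pair reproduces $\ell^*$ up to solver round-off and has a penalty no larger than that of the reference pair, which is what the assumed minimisation requires. The dense solve costs $O(T^3)$ and is used here only for clarity; it is not meant to reflect the $O(T\log T)$ inference described in the abstract.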