Bayesian Optimization through Gaussian Cox Process Models for Spatio-temporal Data

Yongsheng Mei; Mahdi Imani; Tian Lan

TL;DR

The paper addresses Bayesian optimization (BO) when observations are point events generated by a latent spatio-temporal intensity that is a link-transformed Gaussian process, λ(t)=κ(g(t)). It develops a maximum a posteriori (MAP) inference framework for Gaussian Cox processes that combines a Laplace approximation with a change of kernel into a new reproducing kernel Hilbert space (RKHS). The result is a computable posterior mean μ and covariance Σ for the latent intensity, with μ=κ^{-1}(ĝ^2) and A^{−1} (the inverse of the matrix A from the Laplace approximation) providing the covariance. A Nyström-based kernel approximation keeps the computation scalable. On top of the estimated intensity, the authors build a BO framework and demonstrate it with four acquisition designs (UCB, idle time, cumulative arrivals, and change-point detection) on synthetic and real-world datasets, reporting better intensity estimation than state-of-the-art inference baselines. The contribution to spatio-temporal BO is the use of a doubly stochastic point-process surrogate with a flexible link function in place of a plain GP surrogate.
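To make the MAP, Laplace, and UCB steps concrete, here is a minimal Python sketch under simplifying assumptions that are not taken from the paper: events are binned on a 1D grid, the link is fixed to κ = exp, the kernel is an RBF with hand-picked hyperparameters, and the mode is found by a damped Newton iteration rather than by the paper's RKHS formulation. The names `rbf_kernel`, `laplace_binned_cox`, and `ucb_intensity` are hypothetical.

```python
import numpy as np

def rbf_kernel(x, y, lengthscale=0.15, variance=1.0):
    """Squared-exponential kernel on 1D inputs (hyperparameters are assumptions)."""
    diff = x[:, None] - y[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def log_posterior(a, K, counts, log_width):
    """Unnormalized log posterior of g = K a under Poisson bin counts and a GP prior."""
    g = K @ a
    return np.sum(counts * g - np.exp(g + log_width)) - 0.5 * a @ g

def laplace_binned_cox(counts, centers, width, n_iter=100, tol=1e-8):
    """Return the MAP latent function g_hat and the Laplace covariance A^{-1}.

    Assumes lambda(t) = exp(g(t)) and Poisson counts with mean width * lambda per bin.
    """
    n = len(centers)
    K = rbf_kernel(centers, centers) + 1e-6 * np.eye(n)
    log_w = np.log(width)
    a = np.zeros(n)  # parameterize g = K a to avoid inverting K
    for _ in range(n_iter):
        g = K @ a
        W = np.exp(g + log_w)  # negative second derivative of the log likelihood
        sqrt_W = np.sqrt(W)
        B = np.eye(n) + sqrt_W[:, None] * K * sqrt_W[None, :]
        b = W * g + (counts - W)
        a_full = b - sqrt_W * np.linalg.solve(B, sqrt_W * (K @ b))  # full Newton step
        # Backtracking: halve the step until the log posterior does not decrease.
        step, current = 1.0, log_posterior(a, K, counts, log_w)
        while step > 1e-10 and not (
            log_posterior(a + step * (a_full - a), K, counts, log_w) >= current
        ):
            step *= 0.5
        a_new = a + step * (a_full - a)
        converged = np.max(np.abs(K @ (a_new - a))) < tol
        a = a_new
        if converged:
            break
    g_hat = K @ a
    sqrt_W = np.sqrt(np.exp(g_hat + log_w))
    B = np.eye(n) + sqrt_W[:, None] * K * sqrt_W[None, :]
    # (K^{-1} + W)^{-1} = K - K W^{1/2} B^{-1} W^{1/2} K
    cov = K - (K * sqrt_W[None, :]) @ np.linalg.solve(B, sqrt_W[:, None] * K)
    return g_hat, cov

def ucb_intensity(g_hat, cov, beta=2.0):
    """Upper confidence bound on the intensity, pushed through the monotone exp link."""
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return np.exp(g_hat + beta * std)

if __name__ == "__main__":
    centers = np.linspace(0.025, 0.975, 20)  # bin centers on [0, 1]
    counts = np.array([0, 0, 1, 0, 1, 2, 3, 5, 8, 9, 8, 5, 3, 2, 1, 0, 1, 0, 0, 0], dtype=float)
    g_hat, cov = laplace_binned_cox(counts, centers, width=0.05)
    ucb = ucb_intensity(g_hat, cov)
    print("next query location:", centers[np.argmax(ucb)])
```

The `g = K a` parameterization follows the standard numerically stable Laplace recipe for GP models with non-Gaussian likelihoods; it is a design choice of this sketch, chosen so that the ill-conditioned kernel matrix is never inverted directly.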

Abstract

Bayesian optimization (BO) has established itself as a leading strategy for efficiently optimizing expensive-to-evaluate functions. Existing BO methods mostly rely on Gaussian process (GP) surrogate models and are not applicable to (doubly-stochastic) Gaussian Cox processes, where the observation process is modulated by a latent intensity function modeled as a GP. In this paper, we propose a novel maximum a posteriori inference of Gaussian Cox processes. It leverages the Laplace approximation and change of kernel technique to transform the problem into a new reproducing kernel Hilbert space, where it becomes more tractable computationally. It enables us to obtain both a functional posterior of the latent intensity function and the covariance of the posterior, thus extending existing works that often focus on specific link functions or estimating the posterior mean. Using the result, we propose a BO framework based on the Gaussian Cox process model and further develop a Nyström approximation for efficient computation. Extensive evaluations on various synthetic and real-world datasets demonstrate significant improvement over state-of-the-art inference solutions for Gaussian Cox processes, as well as effective BO with a wide range of acquisition functions designed through the underlying Gaussian Cox process model.
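The Nyström idea mentioned in the abstract can be illustrated with a short Python sketch. It shows the generic low-rank construction K ≈ K_nm K_mm^{-1} K_mn from a set of landmark points; the RBF kernel, the landmark choice, the jitter value, and the function names are assumptions made for illustration, and the paper's own approximation (Lemma 3) may differ in detail.

```python
import numpy as np

def rbf_kernel(x, y, lengthscale=0.2):
    """Squared-exponential kernel on 1D inputs (an assumed kernel choice)."""
    diff = x[:, None] - y[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def nystrom_features(x, landmarks, lengthscale=0.2, jitter=1e-8):
    """Return Phi of shape (n, m) such that Phi @ Phi.T approximates the full kernel matrix."""
    m = len(landmarks)
    K_mm = rbf_kernel(landmarks, landmarks, lengthscale) + jitter * np.eye(m)
    K_nm = rbf_kernel(x, landmarks, lengthscale)
    eigvals, eigvecs = np.linalg.eigh(K_mm)  # K_mm is symmetric positive definite
    # Phi Phi^T = K_nm V diag(1/eigvals) V^T K_mn = K_nm K_mm^{-1} K_mn
    return K_nm @ eigvecs / np.sqrt(eigvals)

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 200)
    landmarks = np.linspace(0.0, 1.0, 15)
    phi = nystrom_features(x, landmarks)
    err = np.max(np.abs(phi @ phi.T - rbf_kernel(x, x)))
    print("max abs approximation error:", err)
```

Working with the n-by-m feature matrix instead of the n-by-n kernel matrix reduces the dominant cost from cubic in n to roughly O(n m^2), which is the usual motivation for the approximation.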

Paper Structure (35 sections, 7 theorems, 35 equations, 11 figures, 5 tables, 2 algorithms)

Introduction
Related Work
Gaussian Cox Process Models
Bayesian Optimization through Acquisition Functions
Methodology
Gaussian Cox Process Model
Estimating Posterior Mean and Covariance
Numerical Kernel Approximation
Bayesian Optimization over Estimated Intensity
Experiments
Evaluation using Synthetic Data
Estimation of Synthetic Intensity Functions
Bayesian Optimization over Synthetic Intensity
Evaluation using Real-world Data
Intensity Estimation on 2D Spatial Data
...and 20 more sections

Key Result

Lemma 1

For the $d$-dimensional multivariate distribution over ${\bm{t}}\in\mathbb{R}^d$, given a mode $\hat{{\bm{g}}}=\mathop{\mathrm{arg\,max}}\limits_{\bm{g}} \log f({\bm{g}})$, the likelihood can be approximated by a Gaussian centered at $\hat{{\bm{g}}}$ with covariance ${\bm{A}}^{-1}$, where ${\bm{A}} \triangleq -\nabla^2 \log f({\bm{g}})\big|_{{\bm{g}}=\hat{{\bm{g}}}}$ is the negative Hessian of $\log f$ evaluated at the mode.
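
Written out, the standard second-order (Laplace) expansion consistent with this definition of ${\bm{A}}$ is shown below. This is the textbook form, given as a reference sketch; the paper's exact statement may use different notation or normalization. The first-order term vanishes because the gradient of $\log f$ is zero at the mode.

$$
\log f({\bm{g}}) \approx \log f(\hat{{\bm{g}}}) - \frac{1}{2}({\bm{g}}-\hat{{\bm{g}}})^\top {\bm{A}}\,({\bm{g}}-\hat{{\bm{g}}}),
\qquad\text{so}\qquad
f({\bm{g}}) \approx f(\hat{{\bm{g}}})\exp\!\left(-\frac{1}{2}({\bm{g}}-\hat{{\bm{g}}})^\top {\bm{A}}\,({\bm{g}}-\hat{{\bm{g}}})\right).
$$

Up to normalization, the right-hand side is a Gaussian density in ${\bm{g}}$ with mean $\hat{{\bm{g}}}$ and covariance ${\bm{A}}^{-1}$.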

Figures (11)

Figure 1: Mean estimations comparison on three types of synthetic data.
Figure 2: Step-wise visualization of BO on synthetic intensity function.
Figure 3: Mean estimations comparison on two types of 2D real-world data.
Figure 4: Step-wise visualization of BO on 2022 DC crime incidents data.
Figure 5: BO with $a_\mathrm{idle}$ and $a_\mathrm{CPD}$ on coal mining disaster data.
...and 6 more figures

Theorems & Definitions (12)

Lemma 1: Laplace approximation
Lemma 2: Kernel transformation
Theorem 1: Posterior mean
Theorem 2: Posterior covariance
Lemma 3: Nyström approximation
Proposition 1
Lemma 4: Bayesian online change point probability
proof
proof
proof
...and 2 more
