Bayesian Optimization through Gaussian Cox Process Models for Spatio-temporal Data

Yongsheng Mei, Mahdi Imani, Tian Lan

TL;DR

The paper addresses Bayesian optimization where observations arise from a latent spatio-temporal intensity governed by a Gaussian process, λ(t)=κ(g(t)). It develops a MAP inference framework for Gaussian Cox processes that combines a Laplace approximation with a change of kernel into a new reproducing kernel Hilbert space (RKHS). This yields a computable posterior mean μ and covariance Σ for the latent intensity, with μ=κ^{-1}(ĝ^2) and A^{−1} providing the covariance. A Nyström-based kernel approximation keeps the computation scalable. On top of the estimated intensity, a BO framework is demonstrated with four acquisition designs (UCB, idle time, cumulative arrivals, and change-point) on synthetic and real datasets, where the inference improves on state-of-the-art solutions for Gaussian Cox processes. The work extends spatio-temporal BO to doubly stochastic point-process surrogates with flexible link functions.
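
To make the UCB design concrete, here is a minimal sketch of an upper-confidence-bound rule applied to a posterior mean and marginal variance on a grid of candidate points. The function names, the weight `beta`, and the toy numbers are illustrative assumptions, not values or code from the paper; the paper's own acquisition may be parameterized differently.

```python
import math

def ucb_scores(mean, var, beta=2.0):
    """Generic UCB score per candidate: mean + beta * standard deviation."""
    return [m + beta * math.sqrt(max(v, 0.0)) for m, v in zip(mean, var)]

def ucb_next_point(candidates, mean, var, beta=2.0):
    """Return the candidate with the highest UCB score, and that score."""
    scores = ucb_scores(mean, var, beta)
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best], scores[best]

# Toy posterior summaries on four candidate locations (made-up numbers).
candidates = [0.0, 1.0, 2.0, 3.0]
post_mean = [1.0, 2.0, 1.5, 0.5]
post_var = [0.01, 0.04, 1.0, 0.25]

next_t, score = ucb_next_point(candidates, post_mean, post_var, beta=2.0)
```

In this toy case the rule selects the candidate at 2.0 rather than the one with the highest mean, because its larger posterior variance outweighs the gap in the means. That is the exploration behavior UCB is meant to provide.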

Abstract

Bayesian optimization (BO) has established itself as a leading strategy for efficiently optimizing expensive-to-evaluate functions. Existing BO methods mostly rely on Gaussian process (GP) surrogate models and are not applicable to (doubly-stochastic) Gaussian Cox processes, where the observation process is modulated by a latent intensity function modeled as a GP. In this paper, we propose a novel maximum a posteriori inference of Gaussian Cox processes. It leverages the Laplace approximation and change of kernel technique to transform the problem into a new reproducing kernel Hilbert space, where it becomes more tractable computationally. It enables us to obtain both a functional posterior of the latent intensity function and the covariance of the posterior, thus extending existing works that often focus on specific link functions or estimating the posterior mean. Using the result, we propose a BO framework based on the Gaussian Cox process model and further develop a Nyström approximation for efficient computation. Extensive evaluations on various synthetic and real-world datasets demonstrate significant improvement over state-of-the-art inference solutions for Gaussian Cox processes, as well as effective BO with a wide range of acquisition functions designed through the underlying Gaussian Cox process model.
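
As an illustration of the doubly stochastic model described above, the following sketch draws event times from an inhomogeneous Poisson process whose intensity is a link function applied to a latent function, using standard Lewis-Shedler thinning. The latent function `g` is a fixed sinusoid standing in for a single GP draw, and the softplus link, the horizon, and all names are assumptions chosen for illustration rather than the paper's setup.

```python
import math
import random

def softplus(x):
    """A positive link function kappa (one possible choice, assumed here)."""
    return math.log1p(math.exp(x))

def sample_by_thinning(intensity, t_max, lam_max, rng):
    """Lewis-Shedler thinning on [0, t_max]; needs intensity(t) <= lam_max."""
    events, t = [], 0.0
    while True:
        # Propose the next point of a homogeneous process with rate lam_max.
        t += rng.expovariate(lam_max)
        if t > t_max:
            return events
        # Keep the proposal with probability intensity(t) / lam_max.
        if rng.random() * lam_max <= intensity(t):
            events.append(t)

def g(t):
    """Stand-in for one draw of the latent GP (assumption, not a real draw)."""
    return 2.0 * math.sin(t)

def lam(t):
    """Intensity lambda(t) = kappa(g(t))."""
    return softplus(g(t))

rng = random.Random(0)
# softplus is increasing and g(t) <= 2, so softplus(2) bounds the intensity.
events = sample_by_thinning(lam, t_max=10.0, lam_max=softplus(2.0), rng=rng)
```

The returned `events` list plays the role of the observed point pattern from which the latent intensity would be inferred.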

Paper Structure

This paper contains 35 sections, 7 theorems, 35 equations, 11 figures, 5 tables, 2 algorithms.

Key Result

Lemma 1

For the $d$-dimensional multivariate distribution regarding ${\bm{t}}\in\mathbb{R}^d$, given a mode $\hat{{\bm{g}}}$ such that $\hat{{\bm{g}}}=\mathop{\mathrm{arg\,max}}\limits_{\bm{g}} \log f({\bm{g}})$, the likelihood (eq:likelihood) can be approximated as: where ${\bm{A}} \triangleq -\nabla^2_{{\bm{g}}=\hat{{\bm{g}}}}\log f(\hat{{\bm{g}}})$ is the negative Hessian of $\log f$ evaluated at the mode $\hat{{\bm{g}}}$.
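
For orientation, the generic textbook form of a Laplace approximation around a mode is sketched below. This is the standard second-order expansion, not necessarily the exact statement of the lemma in the paper, which may differ in normalization or in what is being integrated; the symbol $n$ for the dimension of ${\bm{g}}$ is an assumption of this sketch.

```latex
% Second-order Taylor expansion of log f about its mode (the gradient vanishes there):
\log f({\bm{g}}) \approx \log f(\hat{{\bm{g}}})
  - \tfrac{1}{2}\,({\bm{g}}-\hat{{\bm{g}}})^{\top} {\bm{A}}\, ({\bm{g}}-\hat{{\bm{g}}}),
\qquad
{\bm{A}} = -\nabla^2 \log f({\bm{g}})\Big|_{{\bm{g}}=\hat{{\bm{g}}}} .

% Exponentiating gives an unnormalized Gaussian with mean \hat{g} and covariance A^{-1}:
f({\bm{g}}) \approx f(\hat{{\bm{g}}})\,
  \exp\!\Big(-\tfrac{1}{2}\,({\bm{g}}-\hat{{\bm{g}}})^{\top} {\bm{A}}\, ({\bm{g}}-\hat{{\bm{g}}})\Big).

% Integrating over g in R^n gives the usual approximation of the normalizing constant:
\int f({\bm{g}})\,\mathrm{d}{\bm{g}} \approx f(\hat{{\bm{g}}})\,(2\pi)^{n/2}\,|{\bm{A}}|^{-1/2}.
```

The Gaussian form is why the inverse ${\bm{A}}^{-1}$ acts as a covariance once the mode is known.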

Figures (11)

  • Figure 1: Mean estimations comparison on three types of synthetic data.
  • Figure 2: Step-wise visualization of BO on synthetic intensity function.
  • Figure 3: Mean estimations comparison on two types of 2D real-world data.
  • Figure 4: Step-wise visualization of BO on 2022 DC crime incidents data.
  • Figure 5: BO with $a_\mathrm{idle}$ and $a_\mathrm{CPD}$ on coal mining disaster data.
  • ...and 6 more figures

Theorems & Definitions (12)

  • Lemma 1: Laplace approximation
  • Lemma 2: Kernel transformation
  • Theorem 1: Posterior mean
  • Theorem 2: Posterior covariance
  • Lemma 3: Nyström approximation
  • Proposition 1
  • Lemma 4: Bayesian online change point probability
  • proof
  • proof
  • proof
  • ...and 2 more
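
The Nyström approximation named in Lemma 3 can be illustrated in its generic low-rank form, where a full kernel matrix is approximated from a small set of landmark points. This is a minimal sketch using NumPy and an RBF kernel; the kernel choice, landmark placement, jitter value, and all names are assumptions for illustration, and the paper's lemma may apply the idea differently (for example, to eigenfunctions of the kernel rather than to the matrix itself).

```python
import numpy as np

def rbf_kernel(x, y, lengthscale=1.0):
    """Squared-exponential kernel matrix between 1D point sets x and y."""
    d2 = (x[:, None] - y[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def nystrom(x, landmarks, kernel, jitter=1e-8):
    """Low-rank approximation K(x, x) ~= K_nm K_mm^{-1} K_mn."""
    k_nm = kernel(x, landmarks)
    # A small jitter on the diagonal keeps the landmark block well conditioned.
    k_mm = kernel(landmarks, landmarks) + jitter * np.eye(len(landmarks))
    return k_nm @ np.linalg.solve(k_mm, k_nm.T)

x = np.linspace(0.0, 5.0, 50)          # 50 evaluation points
landmarks = np.linspace(0.0, 5.0, 10)  # 10 landmark points (assumed placement)

k_full = rbf_kernel(x, x)
k_approx = nystrom(x, landmarks, rbf_kernel)
max_err = np.abs(k_full - k_approx).max()
```

Only the 10-by-10 landmark block is factorized here, which is where the computational saving comes from when the number of landmarks is much smaller than the number of points.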