Bayesian Quadrature: Gaussian Processes for Integration

Maren Mahsereci; Toni Karvonen

TL;DR

This survey presents Bayesian quadrature, a probabilistic approach to numerical integration in which a Gaussian-process prior is placed on the integrand and conditioned on function evaluations to obtain a posterior over the integral $I_P(f)$. It develops a taxonomy along the axes of model, inference, and sampling, distinguishing conjugate models (affine $\varphi$) from non-conjugate ones (non-affine $\varphi$), exact from approximate inference, and deterministic, random, sequential, and active node selection. The work connects Bayesian quadrature to RKHS theory and kernel interpolation, presents theoretical guarantees for several kernels (isotropic Matérn, product Matérn, squared exponential), and provides controlled empirical studies of how modelling, inference, and sampling choices interact. It also discusses practical challenges (availability of kernel embeddings, numerical conditioning, and computational cost that is cubic in the number of evaluations) and highlights the method's potential when integrand evaluations are expensive or data are scarce. Overall, the paper offers a rigorous, practically oriented framework for designing, analysing, and applying Bayesian quadrature across domains, with a broad bibliography and guidance on implementation and design choices.

Abstract

Bayesian quadrature is a probabilistic, model-based approach to numerical integration, that is, the estimation of intractable integrals or expectations. Although Bayesian quadrature was already popularised in the 1980s, no systematic and comprehensive treatment has been published. The purpose of this survey is to fill this gap. We review the mathematical foundations of Bayesian quadrature from different points of view; present a systematic taxonomy for classifying different Bayesian quadrature methods along the three axes of modelling, inference, and sampling; collect general theoretical guarantees; and provide a controlled numerical study that explores and illustrates the effect of different choices along the axes of the taxonomy. We also provide a realistic assessment of practical challenges and limitations to the application of Bayesian quadrature methods and include an up-to-date and nearly exhaustive bibliography that covers not only the machine learning and statistics literature but all areas of mathematics and engineering in which Bayesian quadrature or equivalent methods have seen use.
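The construction summarised above (a Gaussian-process prior on the integrand, conditioned on function evaluations) can be illustrated with a minimal sketch of conjugate Bayesian quadrature. Everything specific here is an assumption made for illustration and is not taken from the paper: the domain $[0, 1]$ with $P$ uniform, a zero-mean prior with a squared exponential kernel of length-scale `ell = 0.3`, eight equispaced nodes, a small diagonal `jitter`, and all function names. For this setting the kernel mean and the prior variance of the integral have closed forms in terms of the error function.

```python
import math

def se_kernel(x, y, ell):
    """Squared exponential kernel k(x, y) = exp(-(x - y)^2 / (2 ell^2))."""
    return math.exp(-((x - y) ** 2) / (2.0 * ell ** 2))

def kernel_mean(x, ell):
    """Kernel mean k_P(x) = int_0^1 k(x, y) dy for P = Uniform[0, 1]."""
    s = math.sqrt(2.0) * ell
    return ell * math.sqrt(math.pi / 2.0) * (math.erf((1.0 - x) / s) + math.erf(x / s))

def prior_variance(ell):
    """Prior variance of the integral: int_0^1 int_0^1 k(x, y) dx dy."""
    s = math.sqrt(2.0) * ell
    return (2.0 * ell * math.sqrt(math.pi / 2.0) * math.erf(1.0 / s)
            - 2.0 * ell ** 2 * (1.0 - math.exp(-1.0 / (2.0 * ell ** 2))))

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z

def bayesian_quadrature(f, nodes, ell=0.3, jitter=1e-10):
    """Posterior mean and variance of int_0^1 f(x) dx under a zero-mean GP prior."""
    gram = [[se_kernel(xi, xj, ell) + (jitter if i == j else 0.0)
             for j, xj in enumerate(nodes)] for i, xi in enumerate(nodes)]
    k_mean = [kernel_mean(x, ell) for x in nodes]
    weights = solve(gram, k_mean)  # quadrature weights w = K^{-1} k_P(X)
    mean = sum(w * f(x) for w, x in zip(weights, nodes))
    var = prior_variance(ell) - sum(w * k for w, k in zip(weights, k_mean))
    return mean, max(var, 0.0)  # clamp tiny negative values caused by round-off

nodes = [i / 7 for i in range(8)]  # illustrative choice: 8 equispaced nodes
mean, var = bayesian_quadrature(math.exp, nodes)
print(f"posterior mean     = {mean:.8f}")
print(f"true integral      = {math.e - 1.0:.8f}")
print(f"posterior std. dev = {math.sqrt(var):.2e}")
```

The sketch relies on the kernel mean being available in closed form, which is the kernel-embedding availability issue mentioned in the TL;DR. The linear solve against the Gram matrix is where the cubic cost and the conditioning concerns arise.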

Paper Structure (73 sections, 18 theorems, 134 equations, 7 figures, 4 tables, 4 algorithms)

Introduction
Contents
I. Foundations (\ref{sec:bq})
II. Taxonomy (\ref{sec:taxonomy})
III. Practice (\ref{sec:practical-issues})
IV. Theoretical guarantees (\ref{sec:guarantees})
V. Empirical illustrations (\ref{sec:experiments})
VI. Comprehensive bibliography
Notational conventions
Bayesian quadrature
Gaussian processes
Construction of Bayesian quadrature
Conjugate Bayesian quadrature (affine $\varphi$)
Non-conjugate Bayesian quadrature (non-affine $\varphi$)
Bayesian probabilistic numerical integration
...and 58 more sections

Key Result

Lemma 2.5

Suppose that $k(\cdot, \bm{x})$ is measurable with respect to $P$ for every $\bm{x} \in D$ and that $\int_D \sqrt{k(\bm{x}, \bm{x})} \,\mathrm{d}P(\bm{x}) < \infty$. Then

Figures (7)

Figure 1: A sketch of Bayesian quadrature: a distribution over a function $f$ gives rise to a distribution over the integral $I_P(f) = \int_D f(x) \,\mathrm{d}P(x)$.
Figure 2: Errors when the smooth function $f_1(x) = \exp(x)$ (left) and the non-smooth function $f_2(x) = \exp(-\lvert x - \tfrac{1}{2} \rvert)$ (right) are integrated on the interval $[0, 1]$ with Monte Carlo, the trapezoidal rule, and Gauss--Legendre quadrature. Because $f_1$ has a rapidly converging Taylor series, Gauss--Legendre, which assumes that the integrand resembles a polynomial, reaches machine precision almost immediately. The only assumption the trapezoidal rule makes is that the area under the graph of the integrand can be approximated by trapezoids. Monte Carlo assumes essentially nothing and, as a result, converges very slowly. The function $f_2$ does not resemble a polynomial, so the assumptions of Gauss--Legendre do not benefit it. Bayesian quadrature makes prior assumptions such as these explicit and puts them on a systematic probabilistic footing.
Figure 3: Top: Three isotropic covariance functions $k(\cdot, 0)$. Bottom: Samples from the corresponding GPs. Note how smooth covariance functions yield smooth samples.
Figure 4: Taxonomy of Bayesian quadrature methods along three main axes (model, inference, and sampling) as introduced in Chapter \ref{sec:taxonomy}. The labels under "other" are listed for completeness. The labels in each axis do not follow a hierarchy and are not necessarily mutually exclusive. Section numbers are given in parentheses.
Figure 5: Three deterministic sampling designs mentioned in Section \ref{sec:deterministic-sampling}: tensor grid ($d=3$), sparse grid ($d=3$), and a lattice ($d=2$).
...and 2 more figures
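
The comparison described in the Figure 2 caption can be explored with a small standard-library sketch. This is not the paper's experiment code: the node counts, the random seed, and all function names are assumptions chosen for illustration, and the Gauss--Legendre nodes are computed here by a textbook Newton iteration on the Legendre polynomial.

```python
import math
import random

def f1(x):
    return math.exp(x)              # smooth integrand

def f2(x):
    return math.exp(-abs(x - 0.5))  # integrand with a kink at x = 1/2

# Exact integrals over [0, 1], used only to measure the errors.
EXACT = {"f1": math.e - 1.0, "f2": 2.0 * (1.0 - math.exp(-0.5))}

def monte_carlo(f, n, rng):
    """Average of f at n uniform random points in [0, 1]."""
    return sum(f(rng.random()) for _ in range(n)) / n

def trapezoidal(f, n):
    """Composite trapezoidal rule with n equispaced nodes on [0, 1]."""
    h = 1.0 / (n - 1)
    interior = sum(f(i * h) for i in range(1, n - 1))
    return h * (0.5 * f(0.0) + interior + 0.5 * f(1.0))

def gauss_legendre(f, n):
    """n-point Gauss-Legendre rule mapped from [-1, 1] to [0, 1]."""
    total = 0.0
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess for the i-th root
        for _ in range(100):  # Newton iteration on the Legendre polynomial P_n
            p_prev, p = 1.0, x
            for k in range(2, n + 1):  # three-term recurrence for P_k(x)
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)  # derivative P_n'(x)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        weight = 2.0 / ((1.0 - x * x) * dp * dp)
        total += 0.5 * weight * f(0.5 * (x + 1.0))  # affine map to [0, 1]
    return total

def errors(n, seed=0):
    """Absolute errors of the three rules with n evaluations each."""
    rng = random.Random(seed)
    out = {}
    for name, f in (("f1", f1), ("f2", f2)):
        out[name] = {
            "monte_carlo": abs(monte_carlo(f, n, rng) - EXACT[name]),
            "trapezoidal": abs(trapezoidal(f, n) - EXACT[name]),
            "gauss_legendre": abs(gauss_legendre(f, n) - EXACT[name]),
        }
    return out

for n in (4, 8, 16, 32):
    for name, row in errors(n).items():
        cells = ", ".join(f"{k}={v:.1e}" for k, v in row.items())
        print(f"n={n:2d} {name}: {cells}")
```

A single Monte Carlo run is noisy, so its error at a given `n` depends on the seed; averaging over several seeds would give a smoother picture of its convergence.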

Theorems & Definitions (42)

Definition 2.1: Bayesian quadrature
Definition 2.2: Conjugate Bayesian quadrature
Definition 2.3: Non-conjugate Bayesian quadrature
Definition 2.4: Bayesian probabilistic numerical integration
Lemma 2.5
proof
Proposition 2.6
Proposition 2.7
Proposition 2.8
proof
...and 32 more
