Modular Jump Gaussian Processes

Anna R. Flowers, Christopher T. Franck, Mickaël Binois, Chiwoo Park, Robert B. Gramacy

TL;DR

This work tackles nonstationarity and discontinuities in Gaussian process regression by modeling jump processes with a modular approach. It introduces OLAGP to adapt local neighborhood sizes dynamically and MJGP to incorporate a latent two-level structure through EM clustering and a jump feature that augments GP inputs, avoiding costly joint inference. Individually, OLAGP and the jump-feature strategy improve predictions near manifolds of discontinuity; together they yield substantial gains across synthetic and real benchmark problems. The approach offers scalable, flexible surrogate modeling for computer experiments and other applications where output levels change abruptly across the input space.

Abstract

Gaussian processes (GPs) furnish accurate nonlinear predictions with well-calibrated uncertainty. However, the typical GP setup has a built-in stationarity assumption, making it ill-suited for modeling data from processes with sudden changes, or "jumps" in the output variable. The "jump GP" (JGP) was developed for modeling data from such processes, combining local GPs and latent "level" variables under a joint inferential framework. But joint modeling can be fraught with difficulty. We aim to simplify by suggesting a more modular setup, eschewing joint inference but retaining the main JGP themes: (a) learning optimal neighborhood sizes that locally respect manifolds of discontinuity; and (b) a new cluster-based (latent) feature to capture regions of distinct output levels on both sides of the manifold. We show that each of (a) and (b) separately leads to dramatic improvements when modeling processes with jumps. In tandem (but without requiring joint inference) that benefit is compounded, as illustrated on real and synthetic benchmark examples from the recent literature.
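
Theme (b) can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the two output levels can be separated by fitting a two-component, one-dimensional Gaussian mixture to the responses with EM, and that the resulting hard label is simply appended to each input as an extra column. The function names `em_two_levels` and `augment_inputs`, the initialization at the extremes of the data, and the toy step function are all hypothetical choices made for illustration. How the feature is obtained at unobserved prediction sites is not covered here.

```python
import math
import random


def em_two_levels(y, iters=50):
    """Fit a two-component 1-D Gaussian mixture to responses y by EM.

    Returns one hard 0/1 label per response (1 = higher-mean component).
    Illustrative only: initialization and stopping rule are assumptions.
    """
    lo, hi = min(y), max(y)
    mu = [lo, hi]                              # start the means at the extremes
    var = [((hi - lo) / 4) ** 2 + 1e-9] * 2    # broad starting variances
    pi = [0.5, 0.5]                            # equal starting weights
    resp = [[0.5, 0.5] for _ in y]
    for _ in range(iters):
        # E-step: posterior probability of each level for each response
        for i, yi in enumerate(y):
            w = [
                pi[k] * math.exp(-0.5 * (yi - mu[k]) ** 2 / var[k])
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = (w[0] + w[1]) or 1e-300    # guard against underflow
            resp[i] = [w[0] / total, w[1] / total]
        # M-step: update mixture weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            pi[k] = nk / len(y)
            mu[k] = sum(r[k] * yi for r, yi in zip(resp, y)) / nk
            var[k] = sum(r[k] * (yi - mu[k]) ** 2
                         for r, yi in zip(resp, y)) / nk + 1e-9
    top = 1 if mu[1] >= mu[0] else 0           # index of the higher level
    return [1 if r[top] > 0.5 else 0 for r in resp]


def augment_inputs(X, labels):
    """Append the level label to each input row as an extra feature."""
    return [list(x) + [float(z)] for x, z in zip(X, labels)]


# Toy data (hypothetical): a noisy step with a jump at x = 0.5
random.seed(0)
X = [[i / 99] for i in range(100)]
y = [(0.0 if x[0] < 0.5 else 3.0) + random.gauss(0, 0.1) for x in X]

labels = em_two_levels(y)
X_aug = augment_inputs(X, labels)   # rows are now [x, level]
```

Any GP, global or local, could then be trained on `X_aug` in place of `X`, so that points on opposite sides of the jump lie far apart in the augmented input space even when their original coordinates are close.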

Paper Structure

This paper contains 29 sections, 7 equations, 22 figures, 2 tables, 2 algorithms.

Figures (22)

  • Figure 1: All panels show the true function $Y(x)$ from Eq. (\ref{eq:1ex}) in black. Colored lines represent the mean (solid) and 95% PI (dashed) via (top left) a global GP, (top right) LAGP, (bottom left) MJGP with global GP, and (bottom right) MJGP with LAGP. Training data not shown.
  • Figure 2: Three possible neighborhood choices for $x = (0.04,0.023)$ (the blue dot). The two colors (orange and yellow) are two levels of the response, with a manifold of discontinuity at $\sin(x_1) = x_2$. The orange and yellow dots are $X_N$ and the black dots are three different choices for $X_n(x)$.
  • Figure 3: Left: True "Phantom" with training grid in gray; Right: Histogram of the marginal response.
  • Figure 4: Left: Order of LAGP neighborhood for point $(-0.32,0.17)$; Center: SE by neighborhood size for point $(-0.32,0.17)$; Right: Optimal neighbors LAGP (OLAGP) fit. Blue dots indicate two predictive sites and the surrounding black dots are the respective optimal neighborhoods.
  • Figure 5: Distribution of OLAGP and MJGP [Section \ref{sec:mod}] estimated neighborhood sizes.
  • ...and 17 more figures