Wasserstein-type Gaussian Process Regressions for Input Measurement Uncertainty

Hengrui Luo; Xiaoye S. Li; Yang Liu; Marcus Noack; Ji Qiang; Mark D. Risser

Abstract

Gaussian process (GP) regression is widely used for uncertainty quantification, yet the standard formulation assumes noise-free covariates. When inputs are measured with error, this errors-in-variables (EIV) setting can lead to optimistically narrow posterior intervals and biased decisions. We study GP regression under input measurement uncertainty by representing each noisy input as a probability measure and defining covariance through Wasserstein distances between these measures. Building on this perspective, we instantiate a deterministic projected Wasserstein ARD (PWA) kernel whose one-dimensional components admit closed-form expressions and whose product structure yields a scalable, positive-definite kernel on distributions. Unlike latent-input GP models, PWA-based GPs (PWA-GPs) handle input noise without introducing unobserved covariates or Monte Carlo projections, making uncertainty quantification more transparent and robust.
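To make the "closed-form one-dimensional components with product structure" idea concrete, here is a minimal sketch under assumptions that are not taken from the paper: each noisy input is modeled as an axis-aligned Gaussian, the per-coordinate distance is the 2-Wasserstein distance (which for 1D Gaussians is $(m_1-m_2)^2+(s_1-s_2)^2$ when squared), and the kernel is a squared-exponential in that distance with one lengthscale per coordinate. The function names and the exact kernel form are hypothetical; the paper's PWA kernel may differ in its projections and parameterization.

```python
import numpy as np


def w2_sq_gauss_1d(m1, s1, m2, s2):
    """Squared 2-Wasserstein distance between N(m1, s1^2) and N(m2, s2^2)."""
    return (m1 - m2) ** 2 + (s1 - s2) ** 2


def product_w2_kernel(means, stds, lengthscales):
    """Gram matrix of an illustrative product kernel on axis-aligned Gaussians.

    means, stds: arrays of shape (n, d), one Gaussian marginal per coordinate.
    lengthscales: array of shape (d,), one ARD lengthscale per coordinate
    (a hypothetical parameterization, not the paper's).
    """
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    ell = np.asarray(lengthscales, dtype=float)
    # Pairwise closed-form squared W2 distances per coordinate: shape (n, n, d).
    w2 = w2_sq_gauss_1d(
        means[:, None, :], stds[:, None, :], means[None, :, :], stds[None, :, :]
    )
    # Product over coordinates of exp(-w2_d / (2 ell_d^2)) equals exp of the sum.
    return np.exp(-0.5 * np.sum(w2 / ell**2, axis=-1))
```

Because the squared 1D distance above is a squared Euclidean distance in the $(m, s)$ plane, each factor is an ordinary Gaussian kernel on $\mathbb{R}^2$, so the product is positive semidefinite in this simplified setting.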

Paper Structure (29 sections, 8 theorems, 80 equations, 2 figures, 4 tables)

Introduction
GPs with Wasserstein-type Kernels
Wasserstein and Gromov-Wasserstein distances
Wasserstein-type covariance kernels
Model names used in the paper.
Uniform Error Bound for Wasserstein-type GP (p=1)
Experiments and Applications
Simulated distributional regression experiments
1D scenarios.
2D scenarios.
High-dimensional scenarios.
Calibration of particle accelerators
Noisy Trajectories from NOAA Drifters
Discussion
Evaluation metrics
...and 14 more sections

Key Result

Proposition 1

Consider the errors-in-variables model with $\varepsilon_X \perp \varepsilon$. Let $f(x)=c+w^\top x$ be affine with $w\neq 0$. Define the naive $(1-\alpha)$ interval that ignores input noise: Then for any fixed $X$, whenever $w^\top\Sigma_X w>0$.
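The undercoverage phenomenon behind this proposition can be checked numerically. The sketch below rests on assumptions that are not stated in the text above: the observed input is $U = X + \varepsilon_X$ with $\varepsilon_X \sim N(0, \Sigma_X)$, the response is $y = f(X) + \varepsilon$ with $\varepsilon \sim N(0, \sigma^2)$, and the naive interval is $f(U) \pm z_{1-\alpha/2}\,\sigma$. Under those assumptions $y - f(U) \sim N(0, \sigma^2 + w^\top \Sigma_X w)$, so the coverage is $2\Phi\big(z_{1-\alpha/2}\,\sigma / \sqrt{\sigma^2 + w^\top \Sigma_X w}\big) - 1$, which is strictly below $1-\alpha$ whenever $w^\top \Sigma_X w > 0$. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

import numpy as np


def theoretical_coverage(w, Sigma_X, sigma, alpha=0.05):
    """Coverage of the naive interval under the Gaussian assumptions above."""
    w = np.asarray(w, dtype=float)
    Sigma_X = np.asarray(Sigma_X, dtype=float)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Total variance of y - f(U): output noise plus propagated input noise.
    total_sd = sqrt(sigma**2 + float(w @ Sigma_X @ w))
    return 2 * NormalDist().cdf(z * sigma / total_sd) - 1


def naive_coverage(w, c, x, Sigma_X, sigma, alpha=0.05, n=200_000, seed=0):
    """Monte Carlo estimate of the naive interval's coverage at a fixed x."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    eps_X = rng.multivariate_normal(np.zeros(len(x)), Sigma_X, size=n)
    eps = rng.normal(0.0, sigma, size=n)
    y = c + x @ w + eps            # responses generated at the true input
    center = c + (x + eps_X) @ w   # naive interval is centered at f(U)
    return float(np.mean(np.abs(y - center) <= z * sigma))
```

Calling both functions with the same `w`, `Sigma_X`, and `sigma` should give values that agree up to Monte Carlo error and fall below the nominal level.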

Figures (2)

Figure 1: Illustration of the errors-in-variables regression problem. In both panels, the true function is $y=f(X)=\frac{\sin(10\pi\cdot X)}{2X}+(X-1)^{4}$, but in each case the “true” input locations $X$ are contaminated with measurement errors (with standard deviations of 0.01 and 0.05 on the left and right, respectively). To accurately infer the true function (the blue line), a GP must account for the fact that the input locations $U$ are uncertain.
Figure 2: RMSE (left) and CRPS (right) for GP-RBF vs. WGP-RBF across training/testing years 1987-2022. GP-RBF degrades when trained on recent and tested on older years, indicating poor generalization under temporal shifts. WGP-RBF remains robust, consistently achieving lower errors and better probabilistic calibration.

Theorems & Definitions (15)

Proposition 1
Definition 2
Definition 3
Proposition 4
Theorem 5
Corollary 6
Corollary 7
Proposition 8
Proposition 9
proof
...and 5 more
