Conformalized Gaussian processes for online uncertainty quantification over graphs
Jinwen Xu, Qin Lu, Georgios B. Giannakis
TL;DR
This work tackles uncertainty quantification on streaming graphs by integrating graph-aware Gaussian processes (GPs) with online conformal prediction (CP) to maintain coverage under distribution shifts. It introduces a scalable random feature (RF)-based graph GP framework and an ensemble of kernels whose weights adapt online, then couples them with online CP to achieve valid, adaptive prediction sets. The approach leverages random Fourier features to approximate graph kernels and employs a per-model recursive Bayesian update, with online CP adjusting thresholds to maintain coverage as data evolve. Empirical results on synthetic and real-world graphs demonstrate near-target coverage with efficient, robust prediction sets, outperforming static calibration baselines under non-stationarity.
Abstract
Uncertainty quantification (UQ) over graphs arises in a number of safety-critical applications in network science. The Gaussian process (GP), as a classical Bayesian framework for UQ, has been developed to handle graph-structured data by devising topology-aware kernel functions. However, such GP-based approaches are limited not only by prohibitive computational complexity, but also by strict modeling assumptions that might yield poor coverage, especially with labels arriving on the fly. To effect scalability, we devise a novel graph-aware parametric GP model by leveraging the random feature (RF)-based kernel approximation, which is amenable to efficient recursive Bayesian model updates. To further allow for adaptivity, an ensemble of graph-aware RF-based scalable GPs is leveraged, with per-GP weights adapted to data arriving incrementally. To ensure valid coverage with robustness to model mis-specification, we wed the GP-based set predictors with the online conformal prediction (CP) framework, which post-processes the prediction sets using adaptive thresholds. Experimental results show that the proposed method yields improved coverage and efficient prediction sets over existing baselines by adaptively ensembling the GP models and setting the key threshold parameters in CP.
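
To make the pipeline in the abstract concrete, the following is a minimal sketch of two of its ingredients for a single model: a random-feature GP updated recursively, and an online conformal threshold that rescales its prediction intervals. It is not the authors' implementation. Several choices here are assumptions made for illustration only: an RBF kernel on generic two-dimensional inputs stands in for the graph-aware kernel, the data stream is synthetic, the threshold rule is a simple quantile-tracking update, the values of `D`, `noise_var`, `alpha`, and `eta` are hypothetical, and the ensemble weighting across kernels is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random-feature map (assumption: unit-lengthscale RBF kernel on generic inputs,
# standing in for the paper's graph-aware kernel).
d, D = 2, 50                                # input dimension, number of random features
W = rng.normal(size=(D, d))                 # spectral samples of the RBF kernel
b = rng.uniform(0.0, 2.0 * np.pi, size=D)   # random phases

def rf_map(x):
    """Random Fourier features: rf_map(x) @ rf_map(z) approximates k(x, z)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

# Parametric GP: Bayesian linear model on the random features.
mu = np.zeros(D)        # posterior mean of the weights
Sigma = np.eye(D)       # posterior covariance of the weights
noise_var = 0.01        # assumed observation-noise variance

# Online conformal threshold on the normalized residual |y - mean| / std.
alpha, eta = 0.1, 0.05  # target miscoverage and step size (hypothetical values)
q = 1.0                 # initial threshold

T = 3000
errs = np.zeros(T)
for t in range(T):
    # Synthetic stream (assumption): smooth target plus Gaussian noise.
    x = rng.uniform(-2.0, 2.0, size=d)
    y = np.sin(x[0]) + 0.5 * np.cos(x[1]) + 0.1 * rng.normal()

    # Predictive mean and variance before the label is revealed.
    phi = rf_map(x)
    mean = phi @ mu
    var = phi @ Sigma @ phi + noise_var
    std = np.sqrt(var)

    # Prediction set [mean - q*std, mean + q*std]; record a miss if y falls outside.
    errs[t] = float(abs(y - mean) > q * std)

    # Threshold update: widen after a miss, shrink slightly after a cover.
    q += eta * (errs[t] - alpha)

    # Recursive Bayesian (rank-one) update of the weight posterior.
    gain = Sigma @ phi / var
    mu = mu + gain * (y - mean)
    Sigma = Sigma - np.outer(gain, phi @ Sigma)

coverage = 1.0 - errs.mean()
print(f"empirical coverage: {coverage:.3f}, final threshold q: {q:.3f}")
```

Each step costs O(D^2) regardless of how many labels have arrived, which is the scalability argument for the parametric RF model. In this sketch the threshold update alone ties the long-run miss rate to `alpha`: summing the update over the stream shows that the average miss rate differs from `alpha` by (final `q` minus initial `q`) divided by `eta * T`, whether or not the GP model is well specified.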
