SAUC: Sparsity-Aware Uncertainty Calibration for Spatiotemporal Prediction with Graph Neural Networks

Dingyi Zhuang, Yuheng Bu, Guang Wang, Shenhao Wang, Jinhua Zhao

TL;DR

SAUC tackles uncertainty quantification for sparse, high-resolution spatiotemporal data by post-hoc calibration of the negative binomial (NB) outputs produced by spatiotemporal GNNs. It partitions zero and non-zero predictions and uses quantile regression to calibrate the prediction interval bounded by the 5th and 95th percentiles, aligning empirical coverage with target probabilities without retraining the base model. The authors introduce ENCE-based calibration metrics for asymmetric distributions and report roughly a 20% reduction in calibration error for zero-valued entries on two real-world datasets, which improves reliability for safety-critical predictions. The framework is model-agnostic and can be applied to existing GNN architectures, offering a practical route to better uncertainty quantification in high-resolution, sparse spatiotemporal forecasting.
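As a rough illustration of the kind of metric the summary refers to, the sketch below computes the standard Expected Normalized Calibration Error (ENCE) from the regression-calibration literature: predictions are binned by predicted variance, and each bin compares the root mean predicted variance with the empirical RMSE. The function name `ence`, its arguments, and the equal-size binning are assumptions made for this example; the paper's ENCE variants for asymmetric distributions are not reproduced here.

```python
import math
from typing import Sequence


def ence(pred_var: Sequence[float],
         pred_mean: Sequence[float],
         targets: Sequence[float],
         n_bins: int) -> float:
    """Standard ENCE: mean over bins of |RMV - RMSE| / RMV.

    Bins are built by sorting samples by predicted variance and cutting
    them into n_bins groups of (nearly) equal size. This binning rule is
    an assumption of the sketch, not taken from the paper.
    """
    order = sorted(range(len(targets)), key=lambda i: pred_var[i])
    bin_size = math.ceil(len(order) / n_bins)
    total, used = 0.0, 0
    for start in range(0, len(order), bin_size):
        idx = order[start:start + bin_size]
        # Root mean predicted variance in this bin.
        rmv = math.sqrt(sum(pred_var[i] for i in idx) / len(idx))
        # Empirical root mean squared error in this bin.
        rmse = math.sqrt(
            sum((targets[i] - pred_mean[i]) ** 2 for i in idx) / len(idx)
        )
        if rmv > 0:  # skip degenerate bins with zero predicted variance
            total += abs(rmv - rmse) / rmv
            used += 1
    return total / used if used else float("nan")
```

A value of 0 means the predicted spread matches the observed error in every bin; larger values indicate over- or under-confident variance estimates.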

Abstract

Quantifying uncertainty is crucial for robust and reliable predictions. However, existing spatiotemporal deep learning mostly focuses on deterministic prediction, overlooking the inherent uncertainty in such prediction. In particular, highly granular spatiotemporal datasets are often sparse, posing extra challenges for prediction and uncertainty quantification. To address these issues, this paper introduces a novel post-hoc Sparsity-aware Uncertainty Calibration (SAUC) framework, which calibrates uncertainty in both zero and non-zero values. To develop SAUC, we first modify the state-of-the-art deterministic spatiotemporal Graph Neural Networks (ST-GNNs) into probabilistic ones in the pre-calibration phase. We then calibrate the probabilistic ST-GNNs for zero and non-zero values using quantile approaches. Through extensive experiments, we demonstrate that SAUC can effectively fit the variance of sparse data and generalize across two real-world spatiotemporal datasets at various granularities. Specifically, our empirical experiments show a 20% reduction in calibration errors in zero entries on sparse traffic accident and urban crime prediction. Overall, this work demonstrates the theoretical and empirical value of the SAUC framework, thus bridging a significant gap between uncertainty quantification and spatiotemporal prediction.
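The two-part calibration idea described above can be sketched in a few lines. This is a deliberately simplified stand-in, not the authors' algorithm: it assumes the raw 5% and 95% quantiles have already been read off the model's NB outputs, and it fits only an additive offset per partition. That is intercept-only quantile regression, since the constant that minimizes the pinball loss of the residuals is their empirical quantile. All function names and the boolean `is_zero_pred` flag are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def empirical_quantile(values: Sequence[float], tau: float) -> float:
    """Smallest sample value with at least a tau fraction of samples <= it."""
    ordered = sorted(values)
    k = max(1, math.ceil(tau * len(ordered)))
    return ordered[k - 1]


def fit_offsets(raw_q: Sequence[float],
                targets: Sequence[float],
                is_zero_pred: Sequence[bool],
                tau: float) -> Dict[bool, float]:
    """Fit one additive offset per partition (zero vs non-zero predictions).

    Intercept-only quantile regression: the offset is the tau-quantile of
    the residuals (target minus raw quantile) inside each partition.
    """
    offsets: Dict[bool, float] = {}
    for group in (True, False):
        residuals = [y - q
                     for q, y, z in zip(raw_q, targets, is_zero_pred)
                     if z == group]
        offsets[group] = empirical_quantile(residuals, tau) if residuals else 0.0
    return offsets


def apply_offsets(raw_q: Sequence[float],
                  is_zero_pred: Sequence[bool],
                  offsets: Dict[bool, float]) -> List[float]:
    """Shift each raw quantile by its partition's offset; counts stay >= 0."""
    return [max(0.0, q + offsets[z]) for q, z in zip(raw_q, is_zero_pred)]


def coverage(lower: Sequence[float],
             upper: Sequence[float],
             targets: Sequence[float]) -> float:
    """Fraction of targets that fall inside [lower, upper]."""
    hits = sum(lo <= y <= hi for lo, hi, y in zip(lower, upper, targets))
    return hits / len(targets)


# Toy calibration set: five zero predictions, five non-zero predictions.
targets = [0, 0, 0, 0, 1, 2, 3, 4, 5, 6]
is_zero_pred = [True] * 5 + [False] * 5
raw_lower = [0.0] * 10
raw_upper = [0.0] * 5 + [3.0] * 5

lower = apply_offsets(raw_lower, is_zero_pred,
                      fit_offsets(raw_lower, targets, is_zero_pred, 0.05))
upper = apply_offsets(raw_upper, is_zero_pred,
                      fit_offsets(raw_upper, targets, is_zero_pred, 0.95))
print(coverage(raw_lower, raw_upper, targets), coverage(lower, upper, targets))
```

On this toy set the interval coverage rises from 0.6 to 1.0, but the offsets are fitted and evaluated on the same ten points, so the figure is in-sample. A realistic setup would fit on a held-out calibration split and would let the correction depend on the predicted quantile or variance rather than on a single constant per partition.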

Paper Structure (31 sections, 13 equations, 5 figures, 8 tables, 1 algorithm)

Figures (5)

  • Figure 1: Comparative visualization of log-transformed crime counts for different temporal resolutions in the Chicago traffic crash dataset: (a) The KDE plot demonstrates the asymmetric shape of the data distribution at fine resolutions. (b) The CDF plot further emphasizes the asymmetry. Median values are marked by vertical dashed lines, and horizontal lines quantify the proportion of data below the median, where a value of 0.5 denotes symmetry. 'Sparsity' denotes the proportion of zero-count instances within each temporal resolution.
  • Figure 2: Framework of models and experiments. We modify existing spatiotemporal GNNs into probabilistic ones with NB distribution parameters as the outputs. Sparsity-aware uncertainty calibration is then conducted to obtain the quantile regression results and calibrated confidence intervals for forecasting. These two steps are easy and practical to generalize to existing spatiotemporal GNNs.
  • Figure 3: Linear correlation between MPIW and RMSE across different cases of the CCR data.
  • Figure 4: Reliability diagrams of different calibration methods applied to STGCN-NB outputs on different components of CCR_8h and CCR_1h data. Results closer to the diagonal dashed line are considered better.
  • Figure 5: Traffic crash distributions and calibration using CTC_8h data. All the data are averaged over the temporal dimension.