Conformal Load Prediction with Transductive Graph Autoencoders

Rui Luo, Nicolo Colombo

TL;DR

This work tackles edge-weight prediction on graphs with finite-sample validity by applying split conformal prediction to GNN-based edge predictors. It develops two transductive pipelines—a Graph Autoencoder (GAE) and a Line Graph Neural Network (LGNN)—and augments them with Conformalized Quantile Regression (CQR) and an Error Reweighted Conformal (ERC) variant to handle heteroscedasticity, yielding prediction intervals $C_{ab}$ with $P(W_{ab} \in C_{ab}) \geq 1-\alpha$. The authors establish exchangeability-based validity for the graph setting and show that CQR-ERC produces locally adaptive, efficient intervals. Empirical results on Chicago and Anaheim transportation networks demonstrate superior coverage and interval efficiency compared to baselines, highlighting the approach’s practical impact for uncertainty-aware traffic load forecasting and downstream optimization.

Abstract

Predicting edge weights on graphs has various applications, from transportation systems to social networks. This paper describes a Graph Neural Network (GNN) approach for edge weight prediction with guaranteed coverage. We leverage conformal prediction to calibrate the GNN outputs and produce valid prediction intervals. We handle data heteroscedasticity through error reweighting and Conformalized Quantile Regression (CQR). We compare the performance of our method against baseline techniques on real-world transportation datasets. Our approach has better coverage and efficiency than all baselines and showcases robustness and adaptability.

Paper Structure

This paper contains 15 sections, 1 theorem, 27 equations, 3 figures, 3 tables, and 2 algorithms.

Key Result

Proposition 1

The prediction intervals generated by the split CP algorithm, the CQR algorithm, and the ERC variant are marginally valid, i.e., they satisfy the desired coverage condition $P(W_{ab} \in C_{ab}) \geq 1-\alpha$.
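
As an illustration of the two constructions named in the proposition, the following is a minimal sketch of split CP with absolute-residual scores and of CQR with the standard quantile-band score. It assumes the GNN predictions on calibration and test edges are already available as arrays. The function names (`conformal_quantile`, `split_cp_interval`, `cqr_interval`) are hypothetical and not taken from the paper, and the ERC reweighting is not shown.

```python
import math
import numpy as np


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Returns +inf when the calibration set is too small for the requested level.
    """
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = max(math.ceil((n + 1) * (1 - alpha)), 1)
    return float(scores[k - 1]) if k <= n else math.inf


def split_cp_interval(cal_pred, cal_true, test_pred, alpha):
    """Split CP: symmetric intervals from absolute residuals |w - w_hat|."""
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_true = np.asarray(cal_true, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)
    q = conformal_quantile(np.abs(cal_true - cal_pred), alpha)
    return test_pred - q, test_pred + q


def cqr_interval(cal_lo, cal_hi, cal_true, test_lo, test_hi, alpha):
    """CQR: adjust predicted lower/upper quantiles by a calibrated margin."""
    cal_lo = np.asarray(cal_lo, dtype=float)
    cal_hi = np.asarray(cal_hi, dtype=float)
    cal_true = np.asarray(cal_true, dtype=float)
    # Score is positive when the true weight falls outside the predicted band.
    scores = np.maximum(cal_lo - cal_true, cal_true - cal_hi)
    q = conformal_quantile(scores, alpha)
    return np.asarray(test_lo, dtype=float) - q, np.asarray(test_hi, dtype=float) + q
```

Under exchangeability of calibration and test edges, both constructions cover the true weight with probability at least $1-\alpha$. The CQR margin can be negative, in which case the predicted band is shrunk rather than widened.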

Figures (3)

  • Figure 1: Training settings for edge weight prediction in a conventional data split. Different colors indicate the availability of the nodes during training, calibration or testing. Solid and dashed lines represent edges used for training and edges within the test and calibration set. Predicting (1) corresponds to the transductive setting considered here. (2) and (3) are examples of the inductive setting. In road traffic forecasting, (1) may be the undetected traffic flow between two existing road junctions, e.g. for a (new) road where a traffic detector has not yet been installed. (2) and (3) represent scenarios where new road junctions are constructed, connecting to existing ones or forming connections with each other to create new roads.
  • Figure 2: Application of our proposed prediction models, which provide a coverage guarantee, to a snapshot of road network and traffic flow data from Chicago, IL, United States [bar2021transportation]. The road network is divided into training roads (black solid lines) and test roads (red dashed lines). Our CQR-GAE model (the CQR algorithm with a GAE predictor) generates prediction intervals with a user-specified error rate of $\alpha=0.05$. The middle plot displays the predicted edge weights $\hat{W}$, with line thickness proportional to the predicted weight. The right plot shows the lengths of the prediction intervals, with darker lines indicating wider intervals, i.e., higher inefficiency.
  • Figure 3: Prediction intervals generated by CP (the split CP algorithm) and CQR (the CQR algorithm), each using either GAE or LGNN as the GNN model, with a user-specified error rate of $\alpha=0.05$. All models meet the coverage condition, as their empirical coverage exceeds $1-\alpha$. Notably, both CQR-GAE and CQR-LGNN outperform their CP counterparts in terms of inefficiency and conditional coverage. Among these, CQR-GAE achieves the lowest inefficiency.
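
The coverage and inefficiency metrics mentioned in the captions above can be sketched as follows. This assumes coverage is the empirical fraction of test edges whose true weight lies inside its interval, and inefficiency is the mean interval length. The paper's exact definitions are not reproduced here and may differ, for example in normalization. The function names are hypothetical.

```python
import numpy as np


def empirical_coverage(lower, upper, true_weights):
    """Fraction of test edges whose true weight lies within [lower, upper]."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    true_weights = np.asarray(true_weights, dtype=float)
    return float(np.mean((true_weights >= lower) & (true_weights <= upper)))


def inefficiency(lower, upper):
    """Mean interval length (assumed definition); smaller is better."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return float(np.mean(upper - lower))
```

A method is preferable when its empirical coverage stays at or above $1-\alpha$ while its inefficiency is as small as possible.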

Theorems & Definitions (2)

  • Proposition 1
  • proof