Conditional Shift-Robust Conformal Prediction for Graph Neural Network

S. Akansha

TL;DR

Graph Neural Networks struggle to provide reliable uncertainty estimates under conditional shift between training and test data. We introduce Conditional Shift Robust Conformal Prediction (CondSR CP), a model-agnostic framework that combines conformal prediction with a dual regularizer (CMD and MMD) to align latent representations across biased training and IID test distributions, ensuring valid marginal coverage $P(y\in C(x)) \ge 1-\epsilon$ while producing tighter prediction sets. Empirically, CondSR yields up to 12% accuracy gains and up to 48% smaller prediction sets under conditional shift on standard graph benchmarks, without sacrificing coverage. This work advances uncertainty quantification in graph learning for real-world, shift-prone scenarios by providing a practical, generalizable approach that integrates with diverse GNN architectures.

Abstract

Graph Neural Networks (GNNs) have emerged as potent tools for predicting outcomes in graph-structured data. Despite their efficacy, GNNs have a limited ability to provide robust uncertainty estimates, which undermines their reliability in contexts where errors carry significant consequences. Moreover, GNNs typically excel in in-distribution settings, assuming that training and test data follow identical distributions, a condition often unmet in real-world graph data. In this article, we leverage conformal prediction, a widely recognized statistical technique that quantifies uncertainty by transforming predictive model outputs into prediction sets, to address uncertainty quantification in GNN predictions under conditional shift (the change in the conditional probability distribution $P(\text{label} \mid \text{input})$ from the source domain to the target domain) in graph-based semi-supervised learning (SSL). Additionally, we propose a novel loss function that refines model predictions by minimizing conditional shift in the latent stages. Our approach, termed Conditional Shift Robust (CondSR) conformal prediction for GNNs, is model-agnostic and adaptable to various classification models. We validate its effectiveness on standard graph benchmark datasets, integrating it with state-of-the-art GNNs in node classification tasks. Comprehensive evaluations demonstrate that our approach consistently achieves any predefined target marginal coverage, improves the accuracy of state-of-the-art GNN models by up to 12% under conditional shift, and reduces the prediction set size by up to 48%. The code implementation is publicly available for further exploration and experimentation.
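To make the phrase "transforming predictive model outputs into prediction sets" concrete, here is a minimal sketch of generic split conformal prediction for a classifier. It is not the paper's implementation: the nonconformity score (one minus the softmax probability of the true class), the function names `conformal_threshold` and `prediction_sets`, and the synthetic data generator `sample` are all assumptions chosen for illustration.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, epsilon):
    """Score threshold from a held-out calibration set (assumed score: 1 - p_true)."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected rank: the ceil((n + 1)(1 - epsilon))-th smallest score.
    k = int(np.ceil((n + 1) * (1.0 - epsilon)))
    if k > n:
        return np.inf  # too few calibration points: every class is included
    return np.sort(scores)[k - 1]

def prediction_sets(test_probs, qhat):
    """Boolean mask of shape (n_test, n_classes); True means the class is in the set."""
    return (1.0 - test_probs) <= qhat

# Synthetic, exchangeable calibration and test data (hypothetical stand-in for GNN outputs).
rng = np.random.default_rng(0)

def sample(n, n_classes=5):
    logits = rng.normal(scale=2.0, size=(n, n_classes))
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    # Draw each label from its own predicted distribution.
    u = rng.random(n)
    labels = np.minimum((probs.cumsum(axis=1) < u[:, None]).sum(axis=1), n_classes - 1)
    return probs, labels

cal_probs, cal_labels = sample(1000)
test_probs, test_labels = sample(2000)

qhat = conformal_threshold(cal_probs, cal_labels, epsilon=0.1)
sets = prediction_sets(test_probs, qhat)
coverage = sets[np.arange(len(test_labels)), test_labels].mean()
avg_size = sets.sum(axis=1).mean()
print(f"empirical coverage: {coverage:.3f}, average set size: {avg_size:.2f}")
```

The coverage guarantee of this procedure relies on calibration and test points being exchangeable. A more accurate underlying model generally yields smaller sets at the same target coverage, which is why the abstract reports accuracy and set size together.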

Paper Structure (11 sections, 15 equations, 3 figures, 4 tables)

This paper contains 11 sections, 15 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Performance comparison of APPNP, CFGNN, and CondSR under conditional data shift. We evaluate accuracy (left plot in each column), marginal coverage (middle plot in each column), and prediction set size (right plot in each column) across different biasing settings ($\alpha = 0.1, 0.4, 0.6$) on the Cora dataset. As $\alpha$ increases, meaning the training data distribution becomes more similar to the test data distribution, both APPNP and CondSR show increased accuracy and decreased prediction set size while maintaining a marginal coverage of 90%. CFGNN, however, does not exhibit a clear pattern in accuracy or prediction set size. CondSR consistently achieves higher accuracy and smaller prediction sets across all scenarios.
  • Figure 2: The first three plots in each row show the impact of the biasing parameter $\alpha$ on accuracy (left plot), coverage (middle plot), and size of the prediction set (right plot) under conditional shift in the data. The last three plots in each row show the impact of CondSR on accuracy (left plot), coverage (middle plot), and size of the prediction set (right plot) under the same values of the biasing parameter $\alpha$.
  • Figure 3: The plots in the first and second rows illustrate the effect of varying $\lambda_{\text{MMD}} = 0.1, 0.3, 0.5, 0.7, 1$ on accuracy for the Cora and Citeseer datasets, respectively. For each fixed value of $\lambda_{\text{MMD}}$, $\lambda_{\text{CMD}}$ is varied as $0.1, 0.3, 0.5, 0.7, 1$. The maximum accuracy is observed at $\lambda_{\text{CMD}} = 0.5$ and $\lambda_{\text{MMD}} = 1$ for the Cora dataset, and at $\lambda_{\text{CMD}} = 0.1$ and $\lambda_{\text{MMD}} = 0.5$ for the Citeseer dataset.
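
The weights $\lambda_{\text{CMD}}$ and $\lambda_{\text{MMD}}$ in Figure 3 scale the two parts of the dual regularizer. The sketch below shows one way such a loss could be assembled. It assumes that CMD denotes central moment discrepancy and MMD denotes maximum mean discrepancy, and that the total loss has the additive form task loss + $\lambda_{\text{CMD}}$ · CMD + $\lambda_{\text{MMD}}$ · MMD. The RBF kernel, the bandwidth `gamma`, the number of moments, and all function names are hypothetical choices, not taken from the paper.

```python
import numpy as np

def mmd_rbf(x, y, gamma=1.0):
    """Biased estimate of squared MMD between samples x and y with an RBF kernel."""
    def kernel(a, b):
        sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq_dist)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

def cmd(x, y, n_moments=3):
    """Central moment discrepancy: mean difference plus higher central moment differences.

    The interval-width scaling factors of the original CMD definition are
    omitted here, which corresponds to assuming features lie in a unit interval.
    """
    mean_x, mean_y = x.mean(axis=0), y.mean(axis=0)
    total = np.linalg.norm(mean_x - mean_y)
    centered_x, centered_y = x - mean_x, y - mean_y
    for k in range(2, n_moments + 1):
        total += np.linalg.norm((centered_x ** k).mean(axis=0) - (centered_y ** k).mean(axis=0))
    return total

def regularized_loss(task_loss, z_train, z_test, lam_cmd=0.5, lam_mmd=1.0):
    """Assumed additive form; defaults mirror the best Cora setting reported in Figure 3."""
    return task_loss + lam_cmd * cmd(z_train, z_test) + lam_mmd * mmd_rbf(z_train, z_test)

# Hypothetical latent representations: one matched pair and one shifted pair.
rng = np.random.default_rng(0)
z_train = rng.normal(size=(200, 8))
z_matched = rng.normal(size=(200, 8))
z_shifted = rng.normal(loc=1.0, size=(200, 8))

print("matched:", regularized_loss(1.0, z_train, z_matched))
print("shifted:", regularized_loss(1.0, z_train, z_shifted))
```

Both penalties are zero when the two samples coincide and grow as the latent distributions drift apart, so minimizing them pulls the training representations toward the test representations. In an actual training loop these terms would be written in a differentiable framework so that gradients reach the GNN parameters.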