Similarity-Navigated Conformal Prediction for Graph Neural Networks

Jianqing Song, Jianguo Huang, Wenyu Jiang, Baoming Zhang, Shuangjie Li, Chongjun Wang

TL;DR

This work tackles the lack of reliable uncertainty estimates for GNN-based node classification by applying conformal prediction (CP) with a novel similarity-guided score aggregation. The authors introduce SNAPS, which adaptively aggregates non-conformity scores from nodes likely to share the same label—guided by feature similarity and one-hop structure—to produce compact, valid CP prediction sets with higher singleton hit ratios. They provide exchangeability-based theoretical guarantees and demonstrate empirically that SNAPS reduces average prediction-set size while preserving coverage across ten graph datasets and even ImageNet, outperforming APS, DAPS, and CF-GNN in efficiency. The method extends naturally to image classification and offers a practical, scalable post-processing tool for reliable uncertainty quantification in graph-structured and related data. Limitations include the transductive setting and the computational cost of selecting similar-label nodes, pointing to future work on inductive extensions and more efficient similarity mechanisms.

Abstract

Graph Neural Networks have achieved remarkable accuracy in semi-supervised node classification tasks. However, these results lack reliable uncertainty estimates. Conformal prediction methods provide a theoretical guarantee for node classification tasks, ensuring that the conformal prediction set contains the ground-truth label with a desired probability (e.g., 95%). In this paper, we empirically show that for each node, aggregating the non-conformity scores of nodes with the same label can improve the efficiency of conformal prediction sets while maintaining valid marginal coverage. This observation motivates us to propose a novel algorithm named Similarity-Navigated Adaptive Prediction Sets (SNAPS), which aggregates the non-conformity scores based on feature similarity and structural neighborhood. The key idea behind SNAPS is that nodes with high feature similarity or direct connections tend to have the same label. By adaptively incorporating information from similar nodes, SNAPS can generate compact prediction sets and increase the singleton hit ratio (the fraction of prediction sets of size one that contain the correct label). Moreover, we theoretically provide a finite-sample coverage guarantee for SNAPS. Extensive experiments demonstrate the superiority of SNAPS, which improves the efficiency of prediction sets and the singleton hit ratio while maintaining valid coverage.
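The quantities named in the abstract (prediction sets, marginal coverage, set size, singleton hit ratio) can be made concrete with a minimal sketch of split conformal prediction. This is an illustration under stated assumptions, not the authors' implementation: the function names, the toy scores, and the choice of $\alpha = 0.25$ are all hypothetical, and the scores stand in for any basic non-conformity score such as APS.

```python
import math

def conformal_quantile(cal_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points: every label is kept
    return sorted(cal_scores)[rank - 1]

def prediction_set(scores_per_label, threshold):
    """Keep every label whose non-conformity score is at most the threshold."""
    return {y for y, s in enumerate(scores_per_label) if s <= threshold}

def evaluate(pred_sets, labels):
    """Return (coverage, average size, singleton hit ratio) over test nodes."""
    n = len(labels)
    coverage = sum(y in c for c, y in zip(pred_sets, labels)) / n
    size = sum(len(c) for c in pred_sets) / n
    singleton_hit = sum(len(c) == 1 and y in c
                        for c, y in zip(pred_sets, labels)) / n
    return coverage, size, singleton_hit

# Hypothetical toy data: scores of the true label on 7 calibration nodes.
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.90]
threshold = conformal_quantile(cal_scores, alpha=0.25)  # 6th smallest: 0.60

# Hypothetical per-label scores for 4 test nodes (3 classes) and true labels.
test_scores = [
    [0.10, 0.93, 0.97],
    [0.50, 0.55, 0.95],
    [0.80, 0.90, 0.30],
    [0.90, 0.20, 0.90],
]
test_labels = [0, 1, 0, 1]

pred_sets = [prediction_set(s, threshold) for s in test_scores]
coverage, size, singleton_hit = evaluate(pred_sets, test_labels)
print(pred_sets)                      # [{0}, {0, 1}, {2}, {1}]
print(coverage, size, singleton_hit)  # 0.75 1.25 0.5
```

On this toy input the third node's set misses its true label, so empirical coverage is 0.75. The coverage guarantee is marginal, holding on average over exchangeable draws, so individual nodes can still be missed.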

Paper Structure

This paper contains 30 sections, 4 theorems, 18 equations, 6 figures, 13 tables, 1 algorithm.

Key Result

Theorem 1

(Vovk et al., 2005) Let the calibration data and a test instance, i.e., $\{(\boldsymbol{x}_i,y_i)\}_{i=1}^{n}\cup\{(\boldsymbol{x}_{n+1},y_{n+1})\}$, be exchangeable. For any non-conformity score function $s:\mathcal{X}\times\mathcal{Y}\rightarrow \mathbb{R}$ and any significance level $\alpha\in (0,$ ...

Figures (6)

  • Figure 1: The motivation for SNAPS. (a) The trend of $\mathrm{Coverage}$ and $\mathrm{Size}$ as the number of nodes with the same label as the ego node increases. (b) The average of node feature cosine similarity between same or different labels. (c) The number statistics of nodes with the same label and with different labels as the ego node with increasing $k$ that denotes $k$-NN with feature similarity.
  • Figure 2: The overall framework of SNAPS. (1) Basic non-conformity score function. We first use basic non-conformity score functions, e.g., APS, to convert node embeddings into non-conformity scores. (2) SNAPS function. We then aggregate basic non-conformity scores of $k$-NN with feature similarity and one-hop structural neighbors to correct the non-conformity scores of nodes. (3) Conformal Prediction. Finally, we use conformal prediction to generate prediction sets, significantly reducing their size compared to the basic score functions.
  • Figure 3: The average non-conformity scores of nodes belonging to each label based on the model GCN for dataset CoraML.
  • Figure 4: Parameter analysis. The results for $\mathrm{Size}$ and $\mathrm{SH}$ on SNAPS (based on APS) for CoraML dataset with $\alpha=0.05$.
  • Figure 5: The average non-conformity scores of nodes belonging to each label based on the model GCN for dataset CiteSeer.
  • ...and 1 more figure
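The aggregation step described in the Figure 2 caption can be sketched as follows. This is a hedged illustration, not the paper's exact formula: the function names, the weights `lam` and `mu`, the unweighted neighbor averages, and the fallback to a node's own scores when it has no neighbors are all assumptions made for this example. It takes basic per-label non-conformity scores and blends each node's scores with the mean scores of its $k$ nearest neighbors by feature cosine similarity and of its one-hop structural neighbors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def knn_by_feature(features, k):
    """For each node, indices of the k most cosine-similar other nodes."""
    n = len(features)
    neighbors = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: cosine(features[i], features[j]), reverse=True)
        neighbors.append(others[:k])
    return neighbors

def mean_scores(scores, idx, fallback):
    """Per-label mean of scores over nodes in idx; fallback if idx is empty."""
    if not idx:
        return list(fallback)
    return [sum(scores[j][y] for j in idx) / len(idx)
            for y in range(len(fallback))]

def aggregate_scores(scores, features, adjacency, k=1, lam=0.25, mu=0.25):
    """Blend own scores with k-NN (feature) and one-hop (structure) means.

    lam and mu are hypothetical mixing weights; the own-score weight is
    1 - lam - mu.
    """
    knn = knn_by_feature(features, k)
    out = []
    for i, own in enumerate(scores):
        sim_mean = mean_scores(scores, knn[i], own)
        hop_mean = mean_scores(scores, adjacency[i], own)
        out.append([(1 - lam - mu) * o + lam * s + mu * h
                    for o, s, h in zip(own, sim_mean, hop_mean)])
    return out

# Hypothetical toy graph: 3 nodes, 2 labels; node 2 has no structural neighbor.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adjacency = [[1], [0], []]
scores = [[0.2, 0.8], [0.4, 0.6], [0.9, 0.1]]

corrected = aggregate_scores(scores, features, adjacency)
for row in corrected:
    print([round(x, 3) for x in row])
```

The corrected scores would then replace the basic scores in the usual conformal calibration and prediction-set construction. Because the same aggregation rule is applied to every node, calibration and test nodes are treated symmetrically, which is the property an exchangeability-based coverage argument relies on.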

Theorems & Definitions (4)

  • Theorem 1
  • Proposition 1
  • Proposition 2
  • Lemma 1