Conf-GNNRec: Quantifying and Calibrating the Prediction Confidence for GNN-based Recommendation Methods

Meng Yan, Cai Xu, Xujing Wang, Ziyu Guan, Wei Zhao, Yuhang Zhou

TL;DR

The paper tackles the problem of overconfident predictions in GNN-based recommenders by formalizing a calibration objective and proposing Conf-GNNRec, a post-calibration framework. It combines a nonlinear, segmented rating calibration with a confidence-penalizing loss to align predicted confidence with observed accuracy, using the formal calibration criterion $\mathbb{P}(\hat{y}_{u,i}=y_{u,i} \mid \hat{p}_{u,i}=p) = p$. Experiments on Gowalla, Yelp2018, and Amazon-Book show consistent improvements in top-N metrics across baselines (LightGCN, KGAT, MVIN, KGCL) and a measurable reduction in the confidence-accuracy gap, validating the approach. The work contributes a practical pathway toward trustworthy GNN-based recommendations and suggests future work in Bayesian confidence modeling and extending the framework to broader architectures.
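The calibration criterion says that, among predictions made with confidence p, a fraction p should turn out correct. One common way to check this empirically is to bin predictions by confidence and compare each bin's mean confidence with its accuracy, which also yields the expected calibration error (ECE). The sketch below is a minimal illustration of that standard procedure, not the paper's evaluation code; the function name `reliability_bins`, its arguments, and the choice of 10 equal-width bins are assumptions made here for illustration.

```python
import numpy as np

def reliability_bins(confidence, correct, n_bins=10):
    """Bin predictions by confidence; return per-bin stats and the ECE.

    confidence: predicted confidences in [0, 1] (hypothetical input).
    correct:    1 if the prediction matched the observed label, else 0.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Map each confidence to a bin index in [0, n_bins - 1] using the
    # interior edges; a confidence of exactly 1.0 lands in the last bin.
    idx = np.clip(np.digitize(confidence, edges[1:-1]), 0, n_bins - 1)

    rows, ece = [], 0.0
    for b in range(n_bins):
        mask = idx == b
        if not mask.any():
            continue  # skip empty bins
        conf_b = confidence[mask].mean()  # mean predicted confidence
        acc_b = correct[mask].mean()      # observed accuracy in the bin
        # Weight the confidence-accuracy gap by the bin's share of samples.
        ece += mask.mean() * abs(acc_b - conf_b)
        rows.append((conf_b, acc_b, int(mask.sum())))
    return rows, ece

# Usage: an overconfident model reports 0.95 but is right half the time.
rows, ece = reliability_bins([0.95, 0.95, 0.95, 0.95], [1, 0, 1, 0])
print(rows, ece)
```

A perfectly calibrated model has zero gap in every bin, so its ECE is 0. An overconfident model shows accuracy below confidence in the high-confidence bins.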

Abstract

Recommender systems based on graph neural networks (GNNs) perform well in tasks such as rating and ranking. However, in real-world recommendation scenarios, noise such as user misuse and malicious advertising gradually accumulates through the message propagation mechanism. Even when existing studies mitigate its effects by reducing the noise propagation weights, the severe sparsity of recommender systems still causes low-weighted noisy neighbors to be mistaken for meaningful information, so predictions based on the polluted nodes are not entirely trustworthy. It is therefore crucial to measure the confidence of prediction results in this highly noisy setting. Furthermore, our evaluation of representative GNN-based recommendation methods shows that they suffer from overconfidence. Based on these considerations, we propose a new method to quantify and calibrate the prediction confidence of GNN-based recommendations (Conf-GNNRec). Specifically, we propose a rating calibration method that dynamically adjusts excessive ratings, based on user personalization, to mitigate overconfidence. We also design a confidence loss function that reduces the overconfidence of negative samples and effectively improves recommendation performance. Experiments on public datasets demonstrate the validity of Conf-GNNRec in terms of both prediction confidence and recommendation performance.

Paper Structure

This paper contains 14 sections, 6 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Illustration of noise propagation in GNN-based recommendation. Because interactions are sparse, 2-hop and 3-hop neighbor nodes end up carrying noisy information.
  • Figure 2: Reliability diagrams for LightGCN (left) and KGCL (right). The blue bars show the model's output, and the yellow bars show the output in the perfectly calibrated case. Any deviation from the perfect diagonal (i.e., the yellow slashes) represents miscalibration.
  • Figure 3: Reliability diagrams for LightGCN (left) and KGCL (right) after applying Conf-GNNRec.