Certainty in Uncertainty: Reasoning over Uncertain Knowledge Graphs with Statistical Guarantees

Yuqicheng Zhu, Jingcheng Wu, Yizhen Wang, Hongkuan Zhou, Jiaoyan Chen, Evgeny Kharlamov, Steffen Staab

TL;DR

This work addresses the lack of uncertainty quantification in uncertain knowledge graph embedding (UnKGE) methods by introducing UnKGCP, an inductive conformal prediction (ICP) framework that outputs prediction intervals with finite-sample guarantees. It develops an efficient ICP-based interval construction and a novel adaptive nonconformity measure based on entropy normalization to produce query-specific, sharp intervals. The paper provides theoretical validity guarantees, and extensive experiments on CN15k, NL27k, and PPI5k show that UnKGCP reliably captures predictive uncertainty across multiple UnKGE backbones, adapting interval length to query difficulty and achieving strong calibration efficiency. The approach is practically relevant for high-stakes KG reasoning: it supports safer decision-making under uncertainty and offers a principled way to detect distribution shifts through interval behavior.

Abstract

Uncertain knowledge graph embedding (UnKGE) methods learn vector representations that capture both structural and uncertainty information to predict scores of unseen triples. However, existing methods produce only point estimates without quantifying predictive uncertainty, which limits their reliability in high-stakes applications where understanding confidence in predictions is crucial. To address this limitation, we propose UnKGCP, a framework that generates prediction intervals guaranteed to contain the true score with a user-specified level of confidence. The length of the intervals reflects the model's predictive uncertainty. UnKGCP builds on the conformal prediction framework but introduces a novel nonconformity measure tailored to UnKGE methods and an efficient procedure for interval construction. We provide theoretical guarantees for the intervals and empirically verify these guarantees. Extensive experiments on standard benchmarks across diverse UnKGE methods further demonstrate that the intervals are sharp and effectively capture predictive uncertainty.
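
To make the idea of conformal prediction intervals concrete, here is a minimal sketch of generic inductive (split) conformal prediction for a score regressor. It is not the UnKGCP procedure itself: the nonconformity score used here is the plain absolute residual, and the function names (`conformal_quantile`, `icp_intervals`) as well as the optional `cal_scale` / `test_scale` arguments are hypothetical. The scale arguments only mark the place where a per-query difficulty estimate (for example an entropy-based one, whose exact form is not given in this summary) could be plugged in. The sketch also assumes that `alpha` is the tolerated miscoverage rate and that true confidence scores lie in $[0, 1]$, so intervals are clipped to that range.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points, only an infinite threshold is valid.
    return scores[k - 1] if k <= n else np.inf

def icp_intervals(cal_pred, cal_true, test_pred, alpha,
                  cal_scale=None, test_scale=None):
    """Split conformal intervals around point predictions (assumed in [0, 1])."""
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_true = np.asarray(cal_true, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)
    # Hypothetical difficulty scales; all ones gives fixed-width intervals.
    cal_scale = (np.ones_like(cal_pred) if cal_scale is None
                 else np.asarray(cal_scale, dtype=float))
    test_scale = (np.ones_like(test_pred) if test_scale is None
                  else np.asarray(test_scale, dtype=float))
    # Nonconformity score: (optionally normalized) absolute residual.
    scores = np.abs(cal_true - cal_pred) / cal_scale
    q = conformal_quantile(scores, alpha)
    # Clipping to [0, 1] keeps coverage if true scores are in [0, 1].
    lower = np.clip(test_pred - q * test_scale, 0.0, 1.0)
    upper = np.clip(test_pred + q * test_scale, 0.0, 1.0)
    return lower, upper

# Toy usage: ten calibration triples with residuals 0.01, 0.02, ..., 0.10.
cal_pred = np.full(10, 0.5)
cal_true = 0.5 + np.arange(1, 11) / 100
lo, hi = icp_intervals(cal_pred, cal_true, [0.5, 0.95], alpha=0.2)
print(lo, hi)
```

With unit scales every test triple receives the same interval width; a normalized score trades that uniformity for intervals that widen on queries the scale marks as harder.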

Paper Structure

This paper contains 37 sections, 3 theorems, 48 equations, 18 figures, 4 tables.

Key Result

Theorem 1

Assume the examples in $Z$ and the test example $z_{n+1}$ are independent and identically distributed (i.i.d.). For any confidence level $\alpha\in[0,1]$ and any nonconformity measure $S$, the conformal predictor $\Gamma_{\mathrm{CP}}^\alpha$ is conservatively valid. Furthermore, if the scores $\{s_i\}_{i=1}^n$ contain no ties, $\Gamma_{\mathrm{CP}}^\alpha$ is also asymptotically exactly valid.
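
The validity statement can be checked numerically. The sketch below assumes the usual reading from the cited conformal prediction literature (Lei et al., 2018), in which $\alpha$ acts as the miscoverage rate: coverage is at least $1-\alpha$ and, without ties, at most $1-\alpha+1/(n+1)$. The displayed formulas of the theorem are not reproduced in this summary, so that reading is an assumption. The simulation draws i.i.d. continuous nonconformity scores directly (so ties occur with probability zero) rather than running any UnKGE model; `simulate_coverage` is a hypothetical helper name.

```python
import numpy as np

def simulate_coverage(n_cal=50, alpha=0.1, trials=20000, seed=0):
    """Monte Carlo estimate of marginal coverage for split conformal prediction."""
    rng = np.random.default_rng(seed)
    k = int(np.ceil((n_cal + 1) * (1 - alpha)))
    if k > n_cal:
        raise ValueError("calibration set too small for this alpha")
    # Each row is one independent calibration set of i.i.d. scores.
    cal_scores = rng.random((trials, n_cal))
    test_scores = rng.random(trials)
    # Threshold: the k-th smallest calibration score in each trial.
    thresholds = np.sort(cal_scores, axis=1)[:, k - 1]
    # A test point is covered when its score does not exceed the threshold.
    return float(np.mean(test_scores <= thresholds))

n_cal, alpha = 50, 0.1
coverage = simulate_coverage(n_cal, alpha)
print(f"empirical coverage: {coverage:.3f}")
print(f"assumed bounds: [{1 - alpha:.3f}, {1 - alpha + 1 / (n_cal + 1):.3f}]")
```

Under these assumptions the empirical coverage should land between the two printed bounds up to Monte Carlo noise, and the gap between the bounds shrinks as the calibration set grows.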

Figures (18)

  • Figure 1: Effect of the confidence level $\alpha$ on the sharpness (top) and coverage (bottom) for test triples on CN15k. Each curve represents one predictor. Red dashed lines indicate the desired coverage levels. Additional results can be found in the appendix with the complete results.
  • Figure 2: Conditionality analysis on NL27k. Each column corresponds to a different backbone model (BEUrRE, UKGE, PASSLEAF). Top: test instances are grouped into 30 bins based on prediction error, and the mean prediction interval length is computed per bin. Only intervals that cover the ground truth are included, as non-covering intervals are not expected to reflect query difficulty. Bottom: a histogram of test errors illustrates their distribution. The complete results are provided in the appendix with the complete results.
  • Figure 3: Effect of calibration set size on coverage and sharpness on NL27k. The top panel reports coverage and the bottom panel reports sharpness. In both plots, the lines represent mean values across 10 runs, and the shaded areas indicate standard deviation. The complete results are provided in the appendix with the complete results.
  • Figure 4: Effect of the confidence level $\alpha$ on the sharpness (top) and coverage (bottom) for positive test triples on NL27k. Each curve represents one predictor. Red dashed lines indicate the desired coverage levels.
  • Figure 5: Effect of the confidence level $\alpha$ on the sharpness (top) and coverage (bottom) for positive test triples on PPI5k. Each curve represents one predictor. Red dashed lines indicate the desired coverage levels.
  • ...and 13 more figures
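
The calibration-size experiment described in the Figure 3 caption (mean and standard deviation of coverage across 10 runs) can be imitated on synthetic data. The sketch below is only an illustration under stated assumptions: it uses i.i.d. uniform nonconformity scores instead of real UnKGE outputs, treats `alpha` as the miscoverage rate, and the helper name `coverage_by_calibration_size` and the chosen sizes are hypothetical.

```python
import numpy as np

def coverage_by_calibration_size(sizes=(20, 200, 2000), alpha=0.1,
                                 runs=10, n_test=5000, seed=0):
    """Mean and std of test coverage across runs, per calibration set size."""
    rng = np.random.default_rng(seed)
    summary = {}
    for n_cal in sizes:
        k = int(np.ceil((n_cal + 1) * (1 - alpha)))
        coverages = []
        for _ in range(runs):
            # Fresh synthetic calibration and test scores in every run.
            cal_scores = np.sort(rng.random(n_cal))
            test_scores = rng.random(n_test)
            coverages.append(np.mean(test_scores <= cal_scores[k - 1]))
        summary[n_cal] = (float(np.mean(coverages)), float(np.std(coverages)))
    return summary

for n_cal, (mean_cov, std_cov) in coverage_by_calibration_size().items():
    print(f"n_cal={n_cal:5d}  mean coverage={mean_cov:.3f}  std={std_cov:.3f}")
```

In this synthetic setting the mean coverage stays near the target for every size, while the run-to-run spread narrows as the calibration set grows.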

Theorems & Definitions (6)

  • Theorem 1 (Vovk et al., 2005; Lei et al., 2018)
  • Remark 1
  • Proposition 1
  • Proposition 1
  • Proof: lower bound
  • Proof: upper bound