Certainty in Uncertainty: Reasoning over Uncertain Knowledge Graphs with Statistical Guarantees
Yuqicheng Zhu, Jingcheng Wu, Yizhen Wang, Hongkuan Zhou, Jiaoyan Chen, Evgeny Kharlamov, Steffen Staab
TL;DR
This work addresses the lack of uncertainty quantification in uncertain knowledge graph embeddings (UnKGE) by introducing UnKGCP, an inductive conformal prediction framework that outputs prediction intervals with finite-sample guarantees. It develops an efficient ICP-based interval construction and a novel adaptive nonconformity measure based on entropy normalization to produce query-specific, sharp intervals. Theoretical guarantees of validity are provided, and extensive experiments on CN15k, NL27k, and PPI5k demonstrate that UnKGCP reliably captures predictive uncertainty across multiple UnKGE backbones, while adapting interval length to difficulty and achieving strong calibration efficiency. The approach holds practical significance for high-stakes KG reasoning, enabling safer decision-making under uncertainty and offering a principled way to detect distribution shifts through interval behavior.
Abstract
Uncertain knowledge graph embedding (UnKGE) methods learn vector representations that capture both structural and uncertainty information to predict scores of unseen triples. However, existing methods produce only point estimates and do not quantify predictive uncertainty, which limits their reliability in high-stakes applications where understanding confidence in predictions is crucial. To address this limitation, we propose UnKGCP, a framework that generates prediction intervals guaranteed to contain the true score with a user-specified level of confidence. The length of the intervals reflects the model's predictive uncertainty. UnKGCP builds on the conformal prediction framework but introduces a novel nonconformity measure tailored to UnKGE methods and an efficient procedure for interval construction. We provide theoretical guarantees for the intervals and empirically verify these guarantees. Extensive experiments on standard benchmarks across diverse UnKGE methods further demonstrate that the intervals are sharp and effectively capture predictive uncertainty.
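To make the interval construction concrete, the following is a minimal sketch of generic inductive (split) conformal prediction for a scalar score, not the UnKGCP implementation itself. It assumes an absolute-residual nonconformity score, an optional per-query scale `sigma` standing in for a difficulty estimate (the entropy-based normalization of the paper is not specified here, so `sigma` is purely hypothetical), and scores that lie in [0, 1]. All function names and the synthetic data are illustrative assumptions.

```python
import math
import random


def nonconformity(y_true, y_pred, sigma=1.0):
    """Absolute residual, optionally scaled by a per-query difficulty estimate.

    `sigma` is a hypothetical stand-in for an adaptive normalizer.
    """
    return abs(y_true - y_pred) / sigma


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that rank exceeds n, the interval must be unbounded to keep coverage.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")
    return sorted(scores)[k - 1]


def prediction_interval(y_pred, q, sigma=1.0, lo=0.0, hi=1.0):
    """Interval y_pred +/- q * sigma, clipped to the assumed score range [lo, hi]."""
    half = q * sigma
    return max(lo, y_pred - half), min(hi, y_pred + half)


def demo_coverage(alpha=0.1, n_cal=500, n_test=2000, seed=0):
    """Empirical coverage on synthetic data (a noisy predictor of uniform scores)."""
    rng = random.Random(seed)

    def sample():
        y = rng.random()
        pred = min(1.0, max(0.0, y + rng.gauss(0.0, 0.1)))
        return y, pred

    # Calibration: collect nonconformity scores on held-out labeled data.
    calibration = [sample() for _ in range(n_cal)]
    q = conformal_quantile([nonconformity(y, p) for y, p in calibration], alpha)

    # Test: count how often the interval contains the true score.
    hits = 0
    for _ in range(n_test):
        y, p = sample()
        low, high = prediction_interval(p, q)
        hits += low <= y <= high
    return hits / n_test


if __name__ == "__main__":
    print(f"empirical coverage at alpha=0.1: {demo_coverage():.3f}")
```

Under exchangeability of calibration and test data, the interval covers the true score with probability at least 1 - alpha on average over calibration draws. Replacing the constant `sigma` with a query-specific value is what would make interval length adapt to difficulty.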
