Predicate-Conditional Conformalized Answer Sets for Knowledge Graph Embeddings

Yuqicheng Zhu, Daniel Hernández, Yuan He, Zifeng Ding, Bo Xiong, Evgeny Kharlamov, Steffen Staab

TL;DR

This work proposes CondKGCP, a novel method that approximates predicate-conditional coverage guarantees while maintaining compact prediction sets. It proves the method's theoretical guarantees and demonstrates its empirical effectiveness through comprehensive evaluations.

Abstract

Uncertainty quantification in Knowledge Graph Embedding (KGE) methods is crucial for ensuring the reliability of downstream applications. Recent work applies conformal prediction to KGE methods, providing uncertainty estimates by generating a set of answers that is guaranteed to include the true answer with a predefined confidence level. However, existing methods provide probabilistic guarantees that hold only on average over a reference set of queries and answers (marginal coverage guarantee). In high-stakes applications such as medical diagnosis, a stronger guarantee is often required: the predicted sets must provide consistent coverage per query (conditional coverage guarantee). We propose CondKGCP, a novel method that approximates predicate-conditional coverage guarantees while maintaining compact prediction sets. CondKGCP merges predicates with similar vector representations and augments calibration with rank information. We prove the theoretical guarantees and demonstrate the empirical effectiveness of CondKGCP through comprehensive evaluations.
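
The difference between a marginal and a predicate-conditional guarantee can be illustrated with plain split conformal prediction. The sketch below is not CondKGCP: it omits the predicate merging and the rank-augmented calibration described above, and only contrasts one global threshold with one threshold per predicate. All names (`conformal_quantile`, `calibrate`, `answer_set`), the toy predicates, and the convention that a lower nonconformity score means a more plausible answer are assumptions made for illustration.

```python
from collections import defaultdict

import numpy as np


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # Take the ceil((n + 1) * (1 - alpha))-th smallest score; if that rank
    # exceeds n, no finite threshold works and every candidate is kept.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf
    return float(np.sort(scores)[k - 1])


def calibrate(cal_predicates, cal_scores, alpha, per_predicate=False):
    """Map predicate -> threshold; the key None holds the marginal threshold."""
    thresholds = {None: conformal_quantile(np.asarray(cal_scores), alpha)}
    if per_predicate:
        groups = defaultdict(list)
        for p, s in zip(cal_predicates, cal_scores):
            groups[p].append(s)
        for p, s in groups.items():
            thresholds[p] = conformal_quantile(np.asarray(s), alpha)
    return thresholds


def answer_set(predicate, candidate_scores, thresholds):
    """Keep every candidate whose nonconformity score is within the threshold."""
    t = thresholds.get(predicate, thresholds[None])
    return [e for e, s in candidate_scores.items() if s <= t]


# Toy calibration data (hypothetical): nonconformity scores of the true
# answers, where "treats" queries are much easier than "born_in" queries.
cal_predicates = ["treats"] * 7 + ["born_in"] * 7
cal_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0,
              10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]

marginal = calibrate(cal_predicates, cal_scores, alpha=0.25)
conditional = calibrate(cal_predicates, cal_scores, alpha=0.25, per_predicate=True)

candidates = {"aspirin": 2.0, "ibuprofen": 6.0, "water": 9.0}
marginal_set = answer_set("treats", candidates, marginal)
conditional_set = answer_set("treats", candidates, conditional)
```

In this toy setup the single marginal threshold is dominated by the harder predicate, so the answer set for a `treats` query keeps all three candidates, while the per-predicate threshold keeps only two. Per-predicate calibration becomes unreliable when a predicate has few calibration triples, which is the kind of trade-off that merging similar predicates is meant to ease.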

Paper Structure

This paper contains 35 sections, 5 theorems, 40 equations, 6 figures, 9 tables, 1 algorithm.

Key Result

Theorem 1

Suppose the triples in $\mathcal{T}_\mathrm{tr}$, $\mathcal{T}_\mathrm{cal}$ and $\mathcal{T}_\mathrm{test}$ are drawn independently and identically distributed (i.i.d.) from the underlying distribution $\mathcal{P}$. For every element $(q, e) \in \mathcal{T}_\mathrm{test}$, the probability of $e$ to b

Figures (6)

  • Figure 1: Comparison of methods across varying target coverage levels, showing CovGap (top plot) and AveSize (bottom plot) for RESCAL on WN18. Complete results are provided in Figures 3 and 4 in the Appendix.
  • Figure 2: Influence of hyperparameters $\phi$ and $\gamma$ on CovGap (top) and AveSize (bottom) for RESCAL on WN18. Complete results for all model-dataset combinations are provided in Figures 5 and 6 in the Appendix.
  • Figure 3: Complete results of comparison of methods' CovGap across varying target coverage levels.
  • Figure 4: Complete results of comparison of methods' AveSize across varying target coverage levels.
  • Figure 5: Influence of hyperparameters $\phi$ and $\gamma$ on CovGap.
  • ...and 1 more figure
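
The captions above report CovGap and AveSize without defining them in this excerpt. The sketch below assumes the definitions common in the class-conditional conformal prediction literature: CovGap as the mean absolute deviation, in percentage points, of each predicate's empirical coverage from the target $1 - \alpha$, and AveSize as the mean answer-set size. The paper's exact definitions may differ, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def cov_gap(predicates, covered, alpha):
    """Assumed CovGap: mean |per-predicate coverage - (1 - alpha)|, times 100."""
    hits = defaultdict(list)
    for p, c in zip(predicates, covered):
        hits[p].append(1.0 if c else 0.0)
    gaps = [abs(sum(h) / len(h) - (1 - alpha)) for h in hits.values()]
    return 100.0 * sum(gaps) / len(gaps)


def ave_size(answer_sets):
    """Assumed AveSize: mean number of entities per answer set."""
    return sum(len(s) for s in answer_sets) / len(answer_sets)


# Toy test outcomes (hypothetical): whether the true answer fell inside
# the predicted set, grouped by the predicate of the query.
predicates = ["a", "a", "a", "a", "b", "b", "b", "b"]
covered = [True, True, True, True, True, True, False, False]

overall_coverage = sum(covered) / len(covered)
gap = cov_gap(predicates, covered, alpha=0.25)
size = ave_size([["x"], ["x", "y", "z"]])
```

Here the overall coverage equals the 75% target, yet predicate `a` is over-covered and predicate `b` is under-covered, so the assumed CovGap is 25 percentage points. This is the situation a marginal guarantee permits and a predicate-conditional guarantee is meant to rule out.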

Theorems & Definitions (8)

  • Theorem 1: zhu2024conformalized
  • Proposition 1: Conditional Coverage Guarantee
  • Corollary 1: shi2024conformal
  • Proposition 1: Conditional Coverage Guarantee
  • Proof: Proof of the lower bound
  • Proof: Proof of the upper bound
  • Corollary 2: shi2024conformal
  • proof