Knowledge Graph Error Detection with Contrastive Confidence Adaption
Xiangyu Liu, Yang Liu, Wei Hu
TL;DR
Knowledge graphs often contain errors that are hard to detect when the noise closely resembles correct triplets. The authors propose CCA, a model that fuses textual descriptions and graph structure through triplet reconstruction, interactive contrastive learning, and adaptive confidence-based knowledge fusion. CCA uses a BERT-based text encoder and a Transformer-based structure encoder to reconstruct head and tail entities, aligns the two latent spaces with InfoNCE-based contrastive losses, and aggregates their signals via a pseudo-label-driven training objective. Evaluated on FB15K-237 and WN18RR with realistic noise (random, semantically-similar, and adversarial), CCA achieves state-of-the-art performance, particularly on semantically-similar and adversarial noise, demonstrating practical utility for KG cleaning and downstream tasks.
Abstract
Knowledge graphs (KGs) often contain various errors. Previous works on detecting errors in KGs mainly rely on triplet embeddings learned from graph structure. We conduct an empirical study and find that these works struggle to discriminate noise from semantically-similar correct triplets. In this paper, we propose CCA, a KG error detection model that integrates both textual and graph structural information from triplet reconstruction to better distinguish semantics. We design interactive contrastive learning to capture the differences between textual and structural patterns. Furthermore, we construct realistic datasets with semantically-similar noise and adversarial noise. Experimental results demonstrate that CCA outperforms state-of-the-art baselines, especially in detecting semantically-similar noise and adversarial noise.
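The contrastive alignment between textual and structural representations can be illustrated with a generic symmetric InfoNCE loss. The sketch below is a minimal illustration, not the exact objective used in CCA: the function names (`info_nce`, `_normalize`, `_cross_entropy_diag`), the cosine-similarity scoring, and the temperature of 0.1 are all assumptions made for this example. It treats the text and structure vectors of the same triplet as a positive pair and every other pairing in the batch as a negative.

```python
import math

def _normalize(vec):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def _cross_entropy_diag(logits):
    # Mean cross-entropy where the correct class for row i is column i.
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def info_nce(text_vecs, struct_vecs, temperature=0.1):
    # Hypothetical helper: symmetric InfoNCE over a batch of paired vectors.
    # text_vecs[i] and struct_vecs[i] are assumed to describe the same triplet.
    text = [_normalize(v) for v in text_vecs]
    struct = [_normalize(v) for v in struct_vecs]
    logits = [
        [sum(a * b for a, b in zip(t, s)) / temperature for s in struct]
        for t in text
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    # Average the text-to-structure and structure-to-text directions.
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(logits_transposed))

# Toy batch: two triplets with 2-dimensional representations.
text_batch = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(text_batch, [[1.0, 0.0], [0.0, 1.0]])
mismatched_loss = info_nce(text_batch, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < mismatched_loss)
```

In this toy batch the loss is small when each text vector matches its own structure vector and large when the pairing is swapped. That gap is the property a contrastive objective relies on to pull the two latent spaces together.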
