Hallucination-Resistant Relation Extraction via Dependency-Aware Sentence Simplification and Two-tiered Hierarchical Refinement

Yupei Yang; Fan Feng; Lin Yang; Wanxi Deng; Lin Qu; Biwei Huang; Shikui Tu; Lei Xu

TL;DR

This paper tackles hallucination in large-language-model–based relation extraction by introducing DEPTH, a two-tiered framework that combines dependency-aware sentence simplification for per-pair grounding with a global Refinement stage to ensure sentence-wide consistency. A causal reward modeling approach is proposed to mitigate reward hacking in RLHF, enabling robust PPO-based fine-tuning. Empirical results across eight diverse benchmarks show DEPTH consistently reduces NO-RELATION hallucinations and yields substantial improvements in micro-F1, with strong cross-dataset transferability. Overall, DEPTH offers a practical, scalable solution for reliable, domain-general relation extraction in enterprise contexts.

Abstract

Relation extraction (RE) enables the construction of structured knowledge for many downstream applications. While large language models (LLMs) have shown great promise in this task, they often struggle to reliably determine whether a relation exists, particularly in sentences with complex syntax or subtle semantics. For instance, we find that Qwen2.5-14B-Instruct incorrectly predicts a relation in 96.9% of NO-RELATION instances on SciERC, revealing a severe hallucination problem. To address these challenges, we propose DEPTH, a framework that integrates Dependency-aware sEntence simPlification and Two-tiered Hierarchical refinement into the relation extraction pipeline. Given a sentence and its candidate entity pairs, DEPTH operates in two stages: (1) the Grounding module extracts relations for each pair by leveraging their shortest dependency path, distilling the sentence into a minimal yet coherent relational context that reduces syntactic noise while preserving key semantics; (2) the Refinement module aggregates all local predictions and revises them based on a holistic understanding of the sentence, correcting omissions and inconsistencies. We further introduce a causality-driven reward model that mitigates reward hacking by disentangling spurious correlations, enabling robust fine-tuning via reinforcement learning from human feedback (RLHF). Experiments on eight well-established benchmarks demonstrate that DEPTH reduces the average hallucination rate to 7.9% while achieving a 9.3% improvement in average F1 score over existing LLM-based extraction baselines.
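
As a rough illustration of the shortest-dependency-path idea behind the Grounding module, the sketch below finds the path between two entity tokens in a dependency tree and keeps only the tokens on that path. It is a minimal sketch under stated assumptions, not the DEPTH implementation: the function names `shortest_dependency_path` and `simplify` are hypothetical, the head indices in the example are hand-annotated rather than produced by a parser, and the simplification described in the abstract aims for a "minimal yet coherent" context, which likely retains more than the bare path.

```python
from collections import deque

def shortest_dependency_path(heads, src, dst):
    """Return token indices on the shortest path from src to dst.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    # Treat the dependency tree as an undirected graph.
    adj = {i: set() for i in range(len(heads))}
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].add(head)
            adj[head].add(child)

    # Breadth-first search from src, recording each node's predecessor.
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in sorted(adj[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)

    if dst not in parent:
        return []  # no path (e.g. a malformed parse)

    # Walk back from dst to src, then reverse.
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def simplify(tokens, heads, src, dst):
    """Keep only the tokens on the dependency path, in sentence order."""
    keep = sorted(shortest_dependency_path(heads, src, dst))
    return " ".join(tokens[i] for i in keep)

# Hand-annotated example (assumed parse, for illustration only).
tokens = ["The", "model", "that", "we", "trained", "yesterday", "uses", "attention"]
heads = [1, 6, 4, 4, 1, 4, -1, 6]

print(shortest_dependency_path(heads, 1, 7))  # [1, 6, 7]
print(simplify(tokens, heads, 1, 7))          # model uses attention
```

In this example the relative clause "that we trained yesterday" lies off the path between "model" and "attention", so it is dropped before the pair is passed on for relation prediction.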
