Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations
Deren Lei, Yaxi Li, Mengya Hu, Mingyu Wang, Vincent Yun, Emily Ching, Eslam Kamal
TL;DR
The paper tackles ungrounded hallucinations in LLM outputs conditioned on source documents. It introduces CoNLI, a two-stage, plug-and-play detect-and-edit framework that reframes hallucination detection as a chain of natural language inference tasks. Detection combines sentence-level and entity-level judgments to identify ungrounded content, and these detections then guide a post-editing step that reduces hallucinations while preserving the rest of the original text. On synthetic and human-annotated benchmarks, CoNLI achieves strong detection performance, with GPT-4 configurations often leading, and reduces hallucinations while maintaining or improving key quality metrics. The approach is domain-agnostic, requires no model fine-tuning, and produces interpretable outputs, so it can be deployed readily in real-world LLM-based systems.
Abstract
Large language models (LLMs) can generate fluent natural language texts when given relevant documents as background context. This ability has attracted considerable interest in developing industry applications of LLMs. However, LLMs are prone to generating hallucinations that are not supported by the provided sources. In this paper, we propose a hierarchical framework to detect and mitigate such ungrounded hallucinations. Our framework uses Chain of Natural Language Inference (CoNLI) for hallucination detection and reduces hallucinations via post-editing. Our approach achieves state-of-the-art performance on hallucination detection and enhances text quality through rewriting, using LLMs without any fine-tuning or domain-specific prompt engineering. We show that this simple plug-and-play framework can serve as an effective choice for hallucination detection and reduction, achieving competitive performance across various contexts.
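The detect-then-edit flow described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Judge`, `Editor`, `detect`, `mitigate`, and `toy_judge` are hypothetical, and the regex-based sentence splitter and entity extractor are crude stand-ins. In the paper both the NLI judgments and the rewrite are produced by an LLM, which here would be plugged in as the `judge` and `editor` callables. The ordering (entity-level checks run only on sentences that pass the sentence-level check) is an assumption about how the hierarchy might be arranged.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical signatures, assumed for illustration only.
# Judge: (source, hypothesis) -> True if the source supports the hypothesis.
Judge = Callable[[str, str], bool]
# Editor: (sentence, ungrounded spans) -> rewritten sentence ("" drops it).
Editor = Callable[[str, List[str]], str]


@dataclass
class Detection:
    sentence: str  # response sentence containing the problem
    level: str     # "sentence" or "entity"
    span: str      # the ungrounded sentence or entity


def split_sentences(text: str) -> List[str]:
    # Naive splitter on terminal punctuation; a real system would do better.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_entities(sentence: str) -> List[str]:
    # Stand-in for an entity detector: capitalized words and numbers.
    return re.findall(r"\b(?:[A-Z][a-z]+|\d+)\b", sentence)


def detect(source: str, response: str, judge: Judge) -> List[Detection]:
    detections = []
    for sent in split_sentences(response):
        # Stage 1: sentence-level NLI check against the source.
        if not judge(source, sent):
            detections.append(Detection(sent, "sentence", sent))
            continue
        # Stage 2: entity-level check for sentences that passed stage 1.
        for ent in extract_entities(sent):
            if not judge(source, ent):
                detections.append(Detection(sent, "entity", ent))
    return detections


def mitigate(response: str, detections: List[Detection], editor: Editor) -> str:
    # Group flagged spans by sentence, then rewrite only flagged sentences
    # so that unflagged text is passed through unchanged.
    flagged: Dict[str, List[str]] = {}
    for d in detections:
        flagged.setdefault(d.sentence, []).append(d.span)
    kept = []
    for sent in split_sentences(response):
        if sent in flagged:
            sent = editor(sent, flagged[sent])
        if sent:
            kept.append(sent)
    return " ".join(kept)


def toy_judge(source: str, hypothesis: str) -> bool:
    # Toy "entailment": every word of the hypothesis occurs in the source.
    src = set(re.findall(r"\w+", source.lower()))
    hyp = set(re.findall(r"\w+", hypothesis.lower()))
    return hyp <= src


if __name__ == "__main__":
    source = "Alice joined Contoso in 2019 as an engineer."
    response = "Alice joined Contoso in 2019. Alice won an award in 2021."
    found = detect(source, response, toy_judge)
    print(found)
    print(mitigate(response, found, lambda sent, spans: ""))
```

Passing the judge and editor in as callables mirrors the plug-and-play claim: the control flow stays fixed while the underlying model can be swapped. The toy editor simply drops flagged sentences, whereas an LLM-backed editor would rewrite them using the flagged spans as guidance.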
