SLM Meets LLM: Balancing Latency, Interpretability and Consistency in Hallucination Detection
Mengya Hu, Rui Xu, Deren Lei, Yaxi Li, Mingyu Wang, Emily Ching, Eslam Kamal, Alex Deng
TL;DR
This work tackles real-time hallucination detection by pairing a small language model (SLM), which makes fast initial judgments, with a large language model (LLM) that acts as a constrained reasoner and generates explanations. It introduces a categorized prompting strategy and a downstream consistency analysis that align LLM explanations with SLM decisions and handle cases where the two disagree. Empirical results on four open-source datasets show that categorizing inconsistencies and applying filtering substantially improve alignment and yield meaningful feedback for refining the SLM. The framework offers a practical path toward latency-aware, interpretable hallucination detection and demonstrates how LLM-based explanations can inform iterative improvement of smaller detectors.
Abstract
Large language models (LLMs) are highly capable but face latency challenges in real-time applications such as online hallucination detection. To overcome this issue, we propose a novel framework that uses a small language model (SLM) classifier for initial detection, followed by an LLM acting as a constrained reasoner that generates detailed explanations for the detected hallucinated content. This study optimizes real-time interpretable hallucination detection by introducing effective prompting techniques that align LLM-generated explanations with SLM decisions. Empirical results demonstrate the framework's effectiveness, which in turn enhances the overall user experience.
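The two-stage flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Verdict`, `detect`, `toy_slm`, and `toy_reasoner`, the 0.5 threshold, and the word-overlap heuristics are all hypothetical stand-ins for a real SLM classifier and a real LLM call. The sketch only shows the control flow: the slow reasoner runs only on responses the fast classifier flags, and it reports whether its explanation agrees with the classifier's decision.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Verdict:
    hallucinated: bool                  # decision of the fast SLM stage
    explanation: Optional[str] = None   # filled in only for flagged responses
    consistent: Optional[bool] = None   # does the reasoner agree with the SLM?


def detect(
    source: str,
    response: str,
    slm_classify: Callable[[str, str], float],
    llm_explain: Callable[[str, str], Tuple[str, bool]],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> Verdict:
    """Run the cheap classifier first; call the reasoner only when it flags."""
    score = slm_classify(source, response)
    if score < threshold:
        # Not flagged: skip the LLM entirely to keep latency low.
        return Verdict(hallucinated=False)
    explanation, agrees = llm_explain(source, response)
    return Verdict(hallucinated=True, explanation=explanation, consistent=agrees)


def _unsupported_words(source: str, response: str) -> list:
    """Toy grounding check: response words that never appear in the source."""
    known = set(source.lower().split())
    return sorted(w for w in set(response.lower().split()) if w not in known)


def toy_slm(source: str, response: str) -> float:
    """Hypothetical stand-in for an SLM classifier returning a score in [0, 1]."""
    return 1.0 if _unsupported_words(source, response) else 0.0


def toy_reasoner(source: str, response: str) -> Tuple[str, bool]:
    """Hypothetical stand-in for the constrained LLM reasoner."""
    extra = _unsupported_words(source, response)
    if extra:
        return "Not supported by the source: " + ", ".join(extra), True
    # The reasoner finds nothing wrong, so it disagrees with the SLM's flag.
    return "No unsupported content found.", False


if __name__ == "__main__":
    print(detect("The meeting is on Monday", "The meeting is on Friday",
                 toy_slm, toy_reasoner))
```

In this sketch, a `consistent=False` result marks a case where the reasoner cannot justify the classifier's flag. Those are the cases the consistency analysis would categorize, filter, or feed back into refining the SLM.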
