Developing a Reliable, Fast, General-Purpose Hallucination Detection and Mitigation Service
Song Wang, Xun Wang, Jie Mei, Yujia Xie, Sean Muarray, Zhang Li, Lingfeng Wu, Si-Qing Chen, Wayne Xiong
TL;DR
The paper addresses hallucination in LLM outputs and presents a production-ready service for detecting and mitigating it. The service is a three-component pipeline (multi-source detection, iterative rewriting, and multi-source verification) that targets intrinsic hallucinations at low latency and cost. The detection stack combines named entity recognition (NER), natural language inference (NLI), and span-based detection (SBD) signals through a Gradient Boosting ensemble, with GPT-4-based labeling used for data generation and evaluation. The rewriting module uses GPT-4 with two prompts to balance quality and efficiency. Experiments on offline benchmarks and live production traffic show that the service detects and mitigates hallucinations effectively and is viable in production, with explicit trade-offs in latency and cost.
Abstract
Hallucination, a phenomenon where large language models (LLMs) produce output that is factually incorrect or unrelated to the input, is a major challenge for LLM applications that require accuracy and dependability. In this paper, we introduce a reliable and high-speed production system aimed at detecting and rectifying the hallucination issue within LLMs. Our system encompasses named entity recognition (NER), natural language inference (NLI), span-based detection (SBD), and an intricate decision tree-based process to reliably detect a wide range of hallucinations in LLM responses. Furthermore, we have crafted a rewriting mechanism that maintains an optimal mix of precision, response time, and cost-effectiveness. We detail the core elements of our framework and underscore the paramount challenges tied to response time, availability, and performance metrics, which are crucial for real-world deployment of these technologies. Our extensive evaluation, utilizing offline data and live production traffic, confirms the efficacy of our proposed framework and service.
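The detect, rewrite, and verify loop summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weighted average stands in for the learned ensemble, the capitalized-word check is a crude stand-in for an NER signal, and the sentence-dropping rewriter replaces the GPT-4 rewriting step. Every name here (`detect_and_mitigate`, `combined_score`, `toy_entity_detector`, `toy_rewriter`, `threshold`, `max_rounds`) is hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical interfaces: a detector scores a response against its source
# in [0, 1]; a rewriter returns a revised response.
Detector = Callable[[str, str], float]
Rewriter = Callable[[str, str], str]


@dataclass
class Result:
    response: str   # final (possibly rewritten) response
    flagged: bool   # True if verification still finds a hallucination
    rounds: int     # number of rewrite rounds performed


def combined_score(detectors: Dict[str, Detector], weights: Dict[str, float],
                   source: str, response: str) -> float:
    """Weighted average of detector scores (assumed stand-in for an ensemble)."""
    total = sum(weights[name] for name in detectors)
    return sum(weights[name] * det(source, response)
               for name, det in detectors.items()) / total


def detect_and_mitigate(source: str, response: str,
                        detectors: Dict[str, Detector],
                        weights: Dict[str, float], rewrite: Rewriter,
                        threshold: float = 0.5, max_rounds: int = 2) -> Result:
    """Rewrite while the combined score is over threshold, then verify once more."""
    rounds = 0
    while (combined_score(detectors, weights, source, response) >= threshold
           and rounds < max_rounds):
        response = rewrite(source, response)
        rounds += 1
    flagged = combined_score(detectors, weights, source, response) >= threshold
    return Result(response, flagged, rounds)


def capitalized(text: str) -> list:
    """Capitalized words, used here as a very rough proxy for named entities."""
    return re.findall(r"\b[A-Z][a-z]+\b", text)


def toy_entity_detector(source: str, response: str) -> float:
    """Fraction of capitalized words in the response that the source lacks."""
    names = capitalized(response)
    if not names:
        return 0.0
    known = set(capitalized(source))
    return sum(n not in known for n in names) / len(names)


def toy_rewriter(source: str, response: str) -> str:
    """Drop every sentence that contains a capitalized word absent from the source."""
    known = set(capitalized(source))
    sentences = re.split(r"(?<=[.!?])\s+", response.strip())
    kept = [s for s in sentences if all(n in known for n in capitalized(s))]
    return " ".join(kept)


result = detect_and_mitigate(
    source="Alice met Bob in Paris.",
    response="Alice met Bob in Paris. They flew to Tokyo.",
    detectors={"ner": toy_entity_detector},
    weights={"ner": 1.0},
    rewrite=toy_rewriter,
    threshold=0.3,
)
print(result)
```

In this toy run the second sentence is removed in one rewrite round and the verification pass no longer flags the response. A real deployment would plug in trained detectors and an LLM-backed rewriter behind the same two interfaces, and the cap on rewrite rounds is one place where the latency and cost trade-off becomes explicit.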
