Controlled Automatic Task-Specific Synthetic Data Generation for Hallucination Detection
Yong Xie, Karan Aggarwal, Aitzaz Ahmad, Stephen Lau
TL;DR
The paper tackles task-specific hallucinations by proposing a generation-selection pipeline that uses Hallucination Pattern Guidance (HPG) and Language Style Alignment (LSA) to create high-quality synthetic datasets for training post-hoc detectors. It further improves robustness with a data mixture strategy across multiple LLM generators, enabling cross-generator, cross-pattern, and cross-task generalization. Empirical evaluation on OpenDialKG, ReDial, and SalesBot shows that detectors trained on synthetic data outperform in-context learning detectors by a large margin of 32% and generalize robustly across generators and tasks. While promising, the approach relies on human-curated hallucination patterns and balanced data, which leaves room for better pattern discovery and distribution-aware mixing in future work.
Abstract
We present a novel approach to automatically generate non-trivial task-specific synthetic datasets for hallucination detection. Our approach features a two-step generation-selection pipeline that uses hallucination pattern guidance and language style alignment during generation. Hallucination pattern guidance leverages the most important task-specific hallucination patterns, while language style alignment aligns the style of the synthetic dataset with benchmark text. To obtain robust supervised detectors from synthetic datasets, we also adopt a data mixture strategy that improves robustness and generalization. Our results on three datasets show that our generated hallucinated text is more closely aligned with non-hallucinated text than that of baselines, and that it trains hallucination detectors that generalize better. Our hallucination detectors trained on synthetic datasets outperform in-context-learning (ICL)-based detectors by a large margin of 32%. Our extensive experiments confirm the benefits of our approach in cross-task and cross-generator generalization. Our data-mixture-based training further improves the generalization and robustness of hallucination detection.
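The data mixture strategy is not specified in detail here, so the following is only a minimal sketch of one plausible reading: draw an equal number of synthetic examples from each LLM generator and shuffle them into one training set. The function name `mix_balanced`, the `(text, label)` example format, and the equal-share sampling rule are all assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical example format: (text, label), label 1 = hallucinated, 0 = not.
Example = Tuple[str, int]


def mix_balanced(
    per_generator: Dict[str, List[Example]],
    n_per_generator: int,
    seed: int = 0,
) -> List[Example]:
    """Sample the same number of examples from every generator, then shuffle.

    Equal shares are an assumption; the paper may weight generators differently.
    """
    rng = random.Random(seed)
    mixed: List[Example] = []
    # Iterate in sorted order so the result is reproducible for a given seed.
    for name in sorted(per_generator):
        pool = per_generator[name]
        if len(pool) < n_per_generator:
            raise ValueError(
                f"generator {name!r} has only {len(pool)} examples, "
                f"need {n_per_generator}"
            )
        # Sample without replacement within each generator's pool.
        mixed.extend(rng.sample(pool, n_per_generator))
    rng.shuffle(mixed)
    return mixed
```

A detector would then be trained on the returned list instead of on the output of a single generator. Requiring the same count from every generator keeps any one generator's style from dominating the training set.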
