MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training
Xinxin You, Xien Liu, Qixin Sun, Huan Zhang, Kaiyin Zhou, Shaohui Liu, GuoPing Hu, ShiJin Wang, Si Liu, Ji Wu
TL;DR
This work tackles inconsistent hallucinations in large language models by introducing MIH-TCCT, a framework that cyclically generates event-based text and equivalent code to transfer the logical rigor of code to natural language. The method comprises event-text filtering, text-code cyclic training, and a dynamic code-quality assessment that leverages code execution to align textual and coding representations. Across three base models and two NLP tasks, MIH-TCCT significantly reduces inconsistent hallucinations while preserving or improving task performance, demonstrating strong generalizability without task-specific downstream adaptation. The approach provides a new pathway for building more coherent and trustworthy LLM systems by bridging textual and coding modalities.
Abstract
Recent methodologies utilizing synthetic datasets have aimed to address inconsistent hallucinations in large language models (LLMs); however, these approaches are primarily tailored to specific tasks, limiting their generalizability. Inspired by the strong performance of code-trained models in logic-intensive domains, we propose a novel framework that leverages event-based text to generate corresponding code and employs cyclic training to transfer the logical consistency of code to natural language effectively. Our method significantly reduces inconsistent hallucinations across three leading LLMs and two categories of natural language tasks while maintaining overall performance. This framework effectively alleviates hallucinations without necessitating adaptation to downstream tasks, demonstrating generality and providing new perspectives to tackle the challenge of inconsistent hallucinations.
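To make the idea of pairing event-based text with equivalent code more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a hypothetical `Event` record, a hypothetical `events_to_code` renderer, and a hypothetical `code_is_consistent` check that executes the generated code and compares the result with the source events, loosely mirroring the notion of using code execution to assess code quality. All names and the event schema are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical event schema: who did what to what, and in which order.
    subject: str
    action: str
    obj: str
    order: int


def ordered_triples(events):
    """Return (subject, action, obj) triples sorted by event order."""
    return [(e.subject, e.action, e.obj)
            for e in sorted(events, key=lambda e: e.order)]


def events_to_code(events):
    """Render events as a small Python program that rebuilds the timeline."""
    lines = ["timeline = []"]
    for triple in ordered_triples(events):
        lines.append(f"timeline.append({triple!r})")
    return "\n".join(lines)


def code_is_consistent(code, events):
    """Execute candidate code and check it reproduces the source events.

    Code that fails to run, or that yields a different timeline, is rejected.
    """
    namespace = {}
    try:
        exec(code, namespace)
    except Exception:
        return False
    return namespace.get("timeline") == ordered_triples(events)


if __name__ == "__main__":
    story = [
        Event("Alice", "read", "book", 2),
        Event("Alice", "bought", "book", 1),
    ]
    program = events_to_code(story)
    print(program)
    print(code_is_consistent(program, story))
```

In this toy setting, a candidate program that swaps or drops events fails the check, which is the kind of execution-based signal that could be used to filter low-quality text-code pairs before training.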
