MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training

Xinxin You, Xien Liu, Qixin Sun, Huan Zhang, Kaiyin Zhou, Shaohui Liu, GuoPing Hu, ShiJin Wang, Si Liu, Ji Wu

TL;DR

This work tackles inconsistent hallucinations in large language models by introducing MIH-TCCT, a framework that cyclically generates event-based text and equivalent code to transfer the logical rigor of code to natural language. The method comprises event-text filtering, text-code cyclic training, and a dynamic code-quality assessment that leverages code execution to align textual and coding representations. Across three base models and two NLP tasks, MIH-TCCT significantly reduces inconsistent hallucinations while preserving or improving task performance, demonstrating strong generalizability without task-specific downstream adaptation. The approach provides a new pathway for building more coherent and trustworthy LLM systems by bridging textual and coding modalities.

Abstract

Recent methodologies utilizing synthetic datasets have aimed to address inconsistent hallucinations in large language models (LLMs); however, these approaches are primarily tailored to specific tasks, limiting their generalizability. Inspired by the strong performance of code-trained models in logic-intensive domains, we propose a novel framework that leverages event-based text to generate corresponding code and employs cyclic training to transfer the logical consistency of code to natural language effectively. Our method significantly reduces inconsistent hallucinations across three leading LLMs and two categories of natural language tasks while maintaining overall performance. This framework effectively alleviates hallucinations without necessitating adaptation to downstream tasks, demonstrating generality and providing new perspectives to tackle the challenge of inconsistent hallucinations.

Paper Structure

This paper contains 27 sections, 11 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Two types of inconsistent hallucinations occurring in LLM responses.
  • Figure 2: The Correspondence Between Event-Driven Text and Programming Language.
  • Figure 3: An overview of our proposed framework. It begins by filtering event-based text, followed by cyclic generation training of event-based text and parallel code based on their transformation relationship. In each iteration, a quality evaluation module assesses the generated parallel code; iterations continue until the ability to generate parallel corpora improves, ultimately aligning the two corpora.
  • Figure 4: An example of an original text and the generated code.
  • Figure 5: Ablation results based on Llama 3.1-Instruct for the two types of tasks, showing the inconsistent-hallucination evaluation metrics across different tasks.
  • ...and 4 more figures